# *Fast* R-CNN

Created by Ross Girshick at Microsoft Research, Redmond.

### Introduction

**Fast R-CNN** is a fast framework for object detection with deep ConvNets. Fast R-CNN
 - trains state-of-the-art models, like VGG16, 9x faster than traditional R-CNN and 3x faster than SPPnet,
 - runs 200x faster than R-CNN and 10x faster than SPPnet at test-time,
 - has a significantly higher mAP on PASCAL VOC than both R-CNN and SPPnet,
 - and is written in Python and C++/Caffe.

Fast R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1504.08083).

### License

Fast R-CNN is released under the MIT License (refer to the LICENSE file for details).

### Citing Fast R-CNN

If you find Fast R-CNN useful in your research, please consider citing:

    @article{girshick15fastrcnn,
        Author = {Ross Girshick},
        Title = {Fast R-CNN},
        Journal = {arXiv preprint arXiv:1504.08083},
        Year = {2015}
    }
    
### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: hardware](#requirements-hardware)
3. [Basic installation](#installation-sufficient-for-the-demo)
4. [Demo](#demo)
5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)
6. [Usage](#usage)
7. [Extra downloads](#extra-downloads)

### Requirements: software

1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))

  **Note:** Caffe *must* be built with support for Python layers!

  ```make
  # In your Makefile.config, make sure to have this line uncommented
  WITH_PYTHON_LAYER := 1
  ```

  You can download my [Makefile.config](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/Makefile.config) for reference.
2. Python packages you might not have: `cython`, `python-opencv`, `easydict`
3. [optional] MATLAB (required for PASCAL VOC evaluation only)
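
The software requirements above can be sanity-checked before building. The following is a minimal sketch, not part of the repository: the helper names `python_layer_enabled` and `missing_modules` are hypothetical, and the import names (`Cython`, `cv2`, `easydict`) are the modules that the `cython`, `python-opencv`, and `easydict` packages are assumed to provide.

```python
import importlib.util
import re

# Matches an uncommented "WITH_PYTHON_LAYER := 1" line (":=" or "=" accepted).
_PYTHON_LAYER_RE = re.compile(
    r"^\s*WITH_PYTHON_LAYER\s*:?=\s*1\s*(#.*)?$", re.MULTILINE
)


def python_layer_enabled(makefile_text):
    """Return True if the Makefile.config text enables Python layers."""
    return _PYTHON_LAYER_RE.search(makefile_text) is not None


def missing_modules(names=("Cython", "cv2", "easydict")):
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules() or "none")
```

To check a real build configuration, read your `Makefile.config` into a string and pass it to `python_layer_enabled`.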

### Requirements: hardware

1. For training smaller networks (CaffeNet, VGG_CNN_M_1024), a good GPU (e.g., Titan, K20, K40, ...) with at least 3 GB of memory suffices.
2. For training with VGG16, you'll need a K40 (~11 GB of memory).

### Installation (sufficient for the demo)

1. Clone the Fast R-CNN repository
  ```Shell
  # Make sure to clone with --recursive
  git clone --recursive https://github.com/rbgirshick/fast-rcnn.git
  ```
  
2. We'll call the directory that you cloned Fast R-CNN into `FRCN_ROOT`

   *Ignore notes 1 and 2 if you followed step 1 above.*
   
   **Note 1:** If you didn't clone Fast R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:
    ```Shell
    git submodule update --init --recursive
    ```
    **Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `fast-rcnn` branch (or equivalent detached state). This will happen automatically *if you follow these instructions*.

3. Build the Cython modules
    ```Shell
    cd $FRCN_ROOT/lib
    make
    ```
    
4. Build Caffe and pycaffe
    ```Shell
    cd $FRCN_ROOT/caffe-fast-rcnn
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html

    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe
    ```
    
5. Download pre-computed Fast R-CNN detectors
    ```Shell
    cd $FRCN_ROOT
    ./data/scripts/fetch_fast_rcnn_models.sh
    ```

    This will populate the `$FRCN_ROOT/data` folder with `fast_rcnn_models`. See `data/README.md` for details.

### Demo

*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.

**Python**

To run the demo
```Shell
cd $FRCN_ROOT
./tools/demo.py
```
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007. The object proposals are pre-computed in order to reduce installation requirements.

**Note:** If the demo crashes Caffe because your GPU doesn't have enough memory, try running the demo with a small network, e.g., `./tools/demo.py --net caffenet` or with `--net vgg_cnn_m_1024`. Or run in CPU mode `./tools/demo.py --cpu`. Type `./tools/demo.py -h` for usage.

**MATLAB**

There's also a *basic* MATLAB demo, though it's missing some minor bells and whistles compared to the Python version.
```Shell
cd $FRCN_ROOT/matlab
matlab # wait for matlab to start...

# At the matlab prompt, run the script:
>> fast_rcnn_demo
```

Fast R-CNN training is implemented in Python only, but test-time detection functionality also exists in MATLAB.
See `matlab/fast_rcnn_demo.m` and `matlab/fast_rcnn_im_detect.m` for details.

### Beyond the demo: installation for training and testing models
1. Download the training, validation, test data and VOCdevkit

	```Shell
	wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
	wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtest_06-Nov-2007.tar
	wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
	```
	
2. Extract all of these tars into one directory named `VOCdevkit`

	```Shell
	tar xvf VOCtrainval_06-Nov-2007.tar
	tar xvf VOCtest_06-Nov-2007.tar
	tar xvf VOCdevkit_08-Jun-2007.tar
	```

3. It should have this basic structure

	```Shell
  	$VOCdevkit/                           # development kit
  	$VOCdevkit/VOCcode/                   # VOC utility code
  	$VOCdevkit/VOC2007                    # image sets, annotations, etc.
  	# ... and several other directories ...
  	```
  	
4. Create symlinks for the PASCAL VOC dataset

	```Shell
    cd $FRCN_ROOT/data
    ln -s $VOCdevkit VOCdevkit2007
    ```
    Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012
6. Follow the next sections to download pre-computed object proposals and pre-trained ImageNet models
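
The directory layout and symlink from steps 3 and 4 can also be checked and created from Python. This is a rough sketch under stated assumptions: the helper names `missing_devkit_dirs` and `link_devkit` are hypothetical and are not scripts shipped with Fast R-CNN. Only the `VOCcode` and `VOC2007` subdirectories and the `VOCdevkit2007` link name are taken from the steps above.

```python
import os

# Subdirectories listed in the "basic structure" step above.
EXPECTED_SUBDIRS = ("VOCcode", "VOC2007")


def missing_devkit_dirs(devkit_root):
    """Return the expected subdirectories that are absent under devkit_root."""
    return [
        name
        for name in EXPECTED_SUBDIRS
        if not os.path.isdir(os.path.join(devkit_root, name))
    ]


def link_devkit(devkit_root, data_dir, link_name="VOCdevkit2007"):
    """Create data_dir/link_name as a symlink to devkit_root, if not present."""
    link_path = os.path.join(data_dir, link_name)
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(devkit_root), link_path)
    return link_path
```

For example, `link_devkit("/path/to/VOCdevkit", os.path.join(frcn_root, "data"))` would mirror the `ln -s` command from step 4, where `frcn_root` stands for your `FRCN_ROOT` directory.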

### Download pre-computed Selective Search object proposals

Pre-computed selective search boxes can also be downloaded for VOC2007 and VOC2012.

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_selective_search_data.sh
```

This will populate the `$FRCN_ROOT/data` folder with `selective_search_data`.

### Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the three networks described in the paper: CaffeNet (model **S**), VGG_CNN_M_1024 (model **M**), and VGG16 (model **L**).

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```
These models are all available in the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but are provided here for your convenience.

### Usage

**Train** a Fast R-CNN detector. For example, train a VGG16 network on VOC 2007 trainval:

```Shell
./tools/train_net.py --gpu 0 --solver models/VGG16/solver.prototxt \
	--weights data/imagenet_models/VGG16.v2.caffemodel
```

If you see this error

```
EnvironmentError: MATLAB command 'matlab' not found. Please add 'matlab' to your PATH.
```

then you need to make sure the `matlab` binary is in your `$PATH`. MATLAB is currently required for PASCAL VOC evaluation.
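
To check this ahead of a long training run, you can look up the binary yourself. The snippet below is a small sketch using only the Python standard library; `find_on_path` is a hypothetical helper name and is not how Fast R-CNN itself performs the check.

```python
import shutil


def find_on_path(program="matlab"):
    """Return the full path of `program` if it is on PATH, else None."""
    return shutil.which(program)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("matlab not found on PATH; PASCAL VOC evaluation will fail")
    else:
        print("matlab found at", location)
```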

**Test** a Fast R-CNN detector. For example, test the VGG16 network on VOC 2007 test:

```Shell
./tools/test_net.py --gpu 1 --def models/VGG16/test.prototxt \
	--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000.caffemodel
```

Test output is written underneath `$FRCN_ROOT/output`.

**Compress** a Fast R-CNN model using truncated SVD on the fully-connected layers:

```Shell
./tools/compress_net.py --def models/VGG16/test.prototxt \
	--def-svd models/VGG16/compressed/test.prototxt \
    --net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000.caffemodel
# Test the model you just compressed
./tools/test_net.py --gpu 0 --def models/VGG16/compressed/test.prototxt \
	--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000_svd_fc6_1024_fc7_256.caffemodel
```
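
As a rough illustration of what truncated SVD buys, a `u x v` weight matrix factorized at rank `t` stores `t * (u + v)` weights instead of `u * v`. The sketch below only does this arithmetic; it does not call `compress_net.py`. The layer shapes are the usual VGG16 fully-connected sizes, and the ranks (1024 for `fc6`, 256 for `fc7`) are assumed from the `svd_fc6_1024_fc7_256` suffix in the output filename above.

```python
def svd_param_counts(u, v, t):
    """Return (full, compressed) weight counts for a u x v layer at rank t."""
    return u * v, t * (u + v)


# (outputs, inputs, rank) per layer; assumed values, see the note above.
LAYERS = {
    "fc6": (4096, 25088, 1024),
    "fc7": (4096, 4096, 256),
}

if __name__ == "__main__":
    for name, (u, v, t) in LAYERS.items():
        full, small = svd_param_counts(u, v, t)
        print(f"{name}: {full} -> {small} weights ({full / small:.1f}x fewer)")
```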

### Experiment scripts
Scripts to reproduce the experiments in the paper (*up to stochastic variation*) are provided in `$FRCN_ROOT/experiments/scripts`. Log files for experiments are located in `experiments/logs`.

**Note:** Until recently (commit a566e39), the RNG seed for Caffe was not fixed during training. Now it's fixed, unless `train_net.py` is called with the `--rand` flag.
Results generated before this commit will have some stochastic variation.
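
The effect of a fixed seed can be illustrated with a stand-in example. This sketch uses Python's standard `random` module rather than Caffe's RNG, and `draw_image_order` is a hypothetical helper, so it only demonstrates the general idea: the same seed reproduces the same sequence, while an unseeded generator generally does not.

```python
import random


def draw_image_order(num_images, seed=None):
    """Return a shuffled list of image indices; deterministic if seed is set."""
    rng = random.Random(seed)  # seed=None draws the seed from OS entropy
    order = list(range(num_images))
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    # Two runs with the same seed visit images in the same order.
    print(draw_image_order(8, seed=3) == draw_image_order(8, seed=3))
```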

### Extra downloads

- [Experiment logs](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/fast_rcnn_experiments.tgz)
- PASCAL VOC test set detections
    - [voc_2007_test_results_fast_rcnn_caffenet_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_caffenet_trained_on_2007_trainval.tgz)
    - [voc_2007_test_results_fast_rcnn_vgg16_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_vgg16_trained_on_2007_trainval.tgz)
    - [voc_2007_test_results_fast_rcnn_vgg_cnn_m_1024_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_vgg_cnn_m_1024_trained_on_2007_trainval.tgz)
    - [voc_2012_test_results_fast_rcnn_vgg16_trained_on_2007_trainvaltest_2012_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2012_test_results_fast_rcnn_vgg16_trained_on_2007_trainvaltest_2012_trainval.tgz)
    - [voc_2012_test_results_fast_rcnn_vgg16_trained_on_2012_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2012_test_results_fast_rcnn_vgg16_trained_on_2012_trainval.tgz)
- [Fast R-CNN VGG16 model](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc12_submission.tgz) trained on the union of VOC07 train, val, and test with VOC12 train and val