Reference: Detectron Model Zoo and Baselines - Cloud+ Community - Tencent Cloud
This file documents a large collection of baselines trained with Detectron, primarily in late December 2017. We refer to these results as the 12_2017_baselines. All configurations for these baselines are located in the configs/12_2017_baselines directory. The tables below provide results and useful statistics about training and inference. Links to the trained models as well as their output are provided. Unless noted differently below (see "Notes" under each table), the following common settings are used for all training and inference runs.
Common Settings and Notes
All models were trained on the union of coco_2014_train and coco_2014_valminusminival, which is exactly equivalent to the recently defined coco_2017_train dataset.
All models were tested on the coco_2014_minival dataset, which is exactly equivalent to the recently defined coco_2017_val dataset.
To check the integrity of a downloaded file, append .md5sum to the URL to download the file's md5 hash.

Training Schedules
We use three training schedules, indicated by the lr schd column in the tables below.
The schedules (1x, 2x, and s1x in the tables below) run over coco_2014_train union coco_2014_valminusminival (or equivalently, coco_2017_train).
All training schedules also use a 500-iteration linear learning rate warm-up. When changing the minibatch size between 8 and 16 images, we adjust the number of SGD iterations and the base learning rate according to the principles outlined in our paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour.
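The adjustment between minibatch sizes can be sketched with the linear scaling rule from the cited paper: the base learning rate scales with the minibatch size, while iteration counts scale inversely. The snippet below is a minimal illustration, not Detectron's own code. The function names, the reference schedule values (0.02, 90000, 60000/80000), and the warm-up starting factor of 1/3 are assumptions made for the example; only the 500-iteration linear warm-up comes from the text above.

```python
def scale_schedule(base_lr, max_iter, steps, ref_batch, new_batch):
    """Rescale an SGD schedule from ref_batch to new_batch images per minibatch.

    Linear scaling rule: the learning rate scales with the minibatch size,
    and iteration counts scale inversely so the number of epochs is unchanged.
    """
    k = new_batch / ref_batch
    return {
        "base_lr": base_lr * k,
        "max_iter": round(max_iter / k),
        "steps": [round(s / k) for s in steps],
    }


def warmup_lr(base_lr, iteration, warmup_iters=500, warmup_factor=1.0 / 3.0):
    """Learning rate during a linear warm-up.

    warmup_iters=500 follows the text; warmup_factor (the starting fraction
    of base_lr) is an assumed value for illustration.
    """
    if iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters
    return base_lr * (warmup_factor * (1.0 - alpha) + alpha)


# Hypothetical reference schedule for a 16-image minibatch (illustrative values).
reference = dict(base_lr=0.02, max_iter=90000, steps=[60000, 80000])
halved = scale_schedule(ref_batch=16, new_batch=8, **reference)
print(halved)
print(warmup_lr(halved["base_lr"], 0), warmup_lr(halved["base_lr"], 500))
```

Halving the minibatch size halves the learning rate and doubles every iteration count, so the model sees the same number of images at each learning rate level.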
License
All models available for download through this document are licensed under the Creative Commons Attribution-ShareAlike 3.0 license.
ImageNet Pretrained Models
The backbone models pretrained on ImageNet are available in the format used by Detectron. Unless otherwise noted, these models are trained on the standard ImageNet-1k dataset.
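Downloaded files, such as the pretrained backbones above, can be checked against the md5 hash mentioned under Common Settings and Notes. The sketch below computes a local file's md5 and compares it with the text of the corresponding .md5sum file. The function names are hypothetical, and the layout of the .md5sum file (hash as the first whitespace-separated token) is an assumption; fetching the files is left to whatever download tool is in use.

```python
import hashlib


def md5sum_url(download_url):
    """URL of the md5 hash for a download URL (append .md5sum, per the text)."""
    return download_url + ".md5sum"


def file_md5(path, chunk_size=1 << 20):
    """Hex md5 digest of a local file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, md5sum_text):
    """Compare a local file against the text of a downloaded .md5sum file.

    Assumption: the hash is the first whitespace-separated token of the text.
    """
    tokens = md5sum_text.split()
    return bool(tokens) and file_md5(path) == tokens[0].lower()
```

A mismatch usually means a truncated or corrupted download, so the file should be fetched again before it is loaded.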
Log Files
Training and inference logs are available for most models in the model zoo.
backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-C4 | RPN | 1x | 2 | 4.3 | 0.187 | 4.7 | 0.113 | - | - | - | 51.6 | 35998355 | model | props: 1, 2, 3 |
R-50-FPN | RPN | 1x | 2 | 6.4 | 0.416 | 10.4 | 0.080 | - | - | - | 57.2 | 35998814 | model | props: 1, 2, 3 |
R-101-FPN | RPN | 1x | 2 | 8.1 | 0.503 | 12.6 | 0.108 | - | - | - | 58.2 | 35998887 | model | props: 1, 2, 3 |
X-101-64x4d-FPN | RPN | 1x | 2 | 11.5 | 1.395 | 34.9 | 0.292 | - | - | - | 59.4 | 35998956 | model | props: 1, 2, 3 |
X-101-32x8d-FPN | RPN | 1x | 2 | 11.6 | 1.102 | 27.6 | 0.222 | - | - | - | 59.5 | 36760102 | model | props: 1, 2, 3 |
Notes:
Proposal download links ("props"): "1" is coco_2014_train; "2" is coco_2014_valminusminival; and "3" is coco_2014_minival.

backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-C4 | Fast | 1x | 1 | 6.0 | 0.456 | 22.8 | 0.241 + 0.003 | 34.4 | - | - | - | 36224013 | model | boxes |
R-50-C4 | Fast | 2x | 1 | 6.0 | 0.453 | 45.3 | 0.241 + 0.003 | 35.6 | - | - | - | 36224046 | model | boxes |
R-50-FPN | Fast | 1x | 2 | 6.0 | 0.285 | 7.1 | 0.076 + 0.004 | 36.4 | - | - | - | 36225147 | model | boxes |
R-50-FPN | Fast | 2x | 2 | 6.0 | 0.287 | 14.4 | 0.077 + 0.004 | 36.8 | - | - | - | 36225249 | model | boxes |
R-101-FPN | Fast | 1x | 2 | 7.7 | 0.448 | 11.2 | 0.102 + 0.003 | 38.5 | - | - | - | 36228880 | model | boxes |
R-101-FPN | Fast | 2x | 2 | 7.7 | 0.449 | 22.5 | 0.103 + 0.004 | 39.0 | - | - | - | 36228933 | model | boxes |
X-101-64x4d-FPN | Fast | 1x | 1 | 6.3 | 0.994 | 49.7 | 0.292 + 0.003 | 40.4 | - | - | - | 36226250 | model | boxes |
X-101-64x4d-FPN | Fast | 2x | 1 | 6.3 | 0.980 | 98.0 | 0.291 + 0.003 | 39.8 | - | - | - | 36226326 | model | boxes |
X-101-32x8d-FPN | Fast | 1x | 1 | 6.4 | 0.721 | 36.1 | 0.217 + 0.003 | 40.6 | - | - | - | 37119777 | model | boxes |
X-101-32x8d-FPN | Fast | 2x | 1 | 6.4 | 0.720 | 72.0 | 0.217 + 0.003 | 39.7 | - | - | - | 37121469 | model | boxes |
R-50-C4 | Mask | 1x | 1 | 6.4 | 0.466 | 23.3 | 0.252 + 0.020 | 35.5 | 31.3 | - | - | 36224121 | model | boxes | masks |
R-50-C4 | Mask | 2x | 1 | 6.4 | 0.464 | 46.4 | 0.253 + 0.019 | 36.9 | 32.5 | - | - | 36224151 | model | boxes | masks |
R-50-FPN | Mask | 1x | 2 | 7.9 | 0.377 | 9.4 | 0.082 + 0.019 | 37.3 | 33.7 | - | - | 36225401 | model | boxes | masks |
R-50-FPN | Mask | 2x | 2 | 7.9 | 0.377 | 18.9 | 0.083 + 0.018 | 37.7 | 34.0 | - | - | 36225732 | model | boxes | masks |
R-101-FPN | Mask | 1x | 2 | 9.6 | 0.539 | 13.5 | 0.111 + 0.018 | 39.4 | 35.6 | - | - | 36229407 | model | boxes | masks |
R-101-FPN | Mask | 2x | 2 | 9.6 | 0.537 | 26.9 | 0.109 + 0.016 | 40.0 | 35.9 | - | - | 36229740 | model | boxes | masks |
X-101-64x4d-FPN | Mask | 1x | 1 | 7.3 | 1.036 | 51.8 | 0.292 + 0.016 | 41.3 | 37.0 | - | - | 36226382 | model | boxes | masks |
X-101-64x4d-FPN | Mask | 2x | 1 | 7.3 | 1.035 | 103.5 | 0.292 + 0.014 | 41.1 | 36.6 | - | - | 36672114 | model | boxes | masks |
X-101-32x8d-FPN | Mask | 1x | 1 | 7.4 | 0.766 | 38.3 | 0.223 + 0.017 | 41.3 | 37.0 | - | - | 37121516 | model | boxes | masks |
X-101-32x8d-FPN | Mask | 2x | 1 | 7.4 | 0.765 | 76.5 | 0.222 + 0.014 | 40.7 | 36.3 | - | - | 37121596 | model | boxes | masks |
Notes:
backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-C4 | Faster | 1x | 1 | 6.3 | 0.566 | 28.3 | 0.167 + 0.003 | 34.8 | - | - | - | 35857197 | model | boxes |
R-50-C4 | Faster | 2x | 1 | 6.3 | 0.569 | 56.9 | 0.174 + 0.003 | 36.5 | - | - | - | 35857281 | model | boxes |
R-50-FPN | Faster | 1x | 2 | 7.2 | 0.544 | 13.6 | 0.093 + 0.004 | 36.7 | - | - | - | 35857345 | model | boxes |
R-50-FPN | Faster | 2x | 2 | 7.2 | 0.546 | 27.3 | 0.092 + 0.004 | 37.9 | - | - | - | 35857389 | model | boxes |
R-101-FPN | Faster | 1x | 2 | 8.9 | 0.647 | 16.2 | 0.120 + 0.004 | 39.4 | - | - | - | 35857890 | model | boxes |
R-101-FPN | Faster | 2x | 2 | 8.9 | 0.647 | 32.4 | 0.119 + 0.004 | 39.8 | - | - | - | 35857952 | model | boxes |
X-101-64x4d-FPN | Faster | 1x | 1 | 6.9 | 1.057 | 52.9 | 0.305 + 0.003 | 41.5 | - | - | - | 35858015 | model | boxes |
X-101-64x4d-FPN | Faster | 2x | 1 | 6.9 | 1.055 | 105.5 | 0.304 + 0.003 | 40.8 | - | - | - | 35858198 | model | boxes |
X-101-32x8d-FPN | Faster | 1x | 1 | 7.0 | 0.799 | 40.0 | 0.233 + 0.004 | 41.3 | - | - | - | 36761737 | model | boxes |
X-101-32x8d-FPN | Faster | 2x | 1 | 7.0 | 0.800 | 80.0 | 0.233 + 0.003 | 40.6 | - | - | - | 36761786 | model | boxes |
R-50-C4 | Mask | 1x | 1 | 6.6 | 0.620 | 31.0 | 0.181 + 0.018 | 35.8 | 31.4 | - | - | 35858791 | model | boxes | masks |
R-50-C4 | Mask | 2x | 1 | 6.6 | 0.620 | 62.0 | 0.182 + 0.017 | 37.8 | 32.8 | - | - | 35858828 | model | boxes | masks |
R-50-FPN | Mask | 1x | 2 | 8.6 | 0.889 | 22.2 | 0.099 + 0.019 | 37.7 | 33.9 | - | - | 35858933 | model | boxes | masks |
R-50-FPN | Mask | 2x | 2 | 8.6 | 0.897 | 44.9 | 0.099 + 0.018 | 38.6 | 34.5 | - | - | 35859007 | model | boxes | masks |
R-101-FPN | Mask | 1x | 2 | 10.2 | 1.008 | 25.2 | 0.126 + 0.018 | 40.0 | 35.9 | - | - | 35861795 | model | boxes | masks |
R-101-FPN | Mask | 2x | 2 | 10.2 | 0.993 | 49.7 | 0.126 + 0.017 | 40.9 | 36.4 | - | - | 35861858 | model | boxes | masks |
X-101-64x4d-FPN | Mask | 1x | 1 | 7.6 | 1.217 | 60.9 | 0.309 + 0.018 | 42.4 | 37.5 | - | - | 36494496 | model | boxes | masks |
X-101-64x4d-FPN | Mask | 2x | 1 | 7.6 | 1.210 | 121.0 | 0.309 + 0.015 | 42.2 | 37.2 | - | - | 35859745 | model | boxes | masks |
X-101-32x8d-FPN | Mask | 1x | 1 | 7.7 | 0.961 | 48.1 | 0.239 + 0.019 | 42.1 | 37.3 | - | - | 36761843 | model | boxes | masks |
X-101-32x8d-FPN | Mask | 2x | 1 | 7.7 | 0.975 | 97.5 | 0.240 + 0.016 | 41.7 | 36.9 | - | - | 36762092 | model | boxes | masks |
Notes:
backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-FPN | RetinaNet | 1x | 2 | 6.8 | 0.483 | 12.1 | 0.125 | 35.7 | - | - | - | 36768636 | model | boxes |
R-50-FPN | RetinaNet | 2x | 2 | 6.8 | 0.482 | 24.1 | 0.127 | 35.7 | - | - | - | 36768677 | model | boxes |
R-101-FPN | RetinaNet | 1x | 2 | 8.7 | 0.666 | 16.7 | 0.156 | 37.7 | - | - | - | 36768744 | model | boxes |
R-101-FPN | RetinaNet | 2x | 2 | 8.7 | 0.666 | 33.3 | 0.154 | 37.8 | - | - | - | 36768840 | model | boxes |
X-101-64x4d-FPN | RetinaNet | 1x | 2 | 12.6 | 1.613 | 40.3 | 0.341 | 39.8 | - | - | - | 36768875 | model | boxes |
X-101-64x4d-FPN | RetinaNet | 2x | 2 | 12.6 | 1.625 | 81.3 | 0.339 | 39.2 | - | - | - | 36768907 | model | boxes |
X-101-32x8d-FPN | RetinaNet | 1x | 2 | 12.7 | 1.343 | 33.6 | 0.277 | 39.5 | - | - | - | 36769563 | model | boxes |
X-101-32x8d-FPN | RetinaNet | 2x | 2 | 12.7 | 1.340 | 67.0 | 0.276 | 38.6 | - | - | - | 36769641 | model | boxes |
Notes: none
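The two train time columns in the tables above appear to be linked by total hours = (s/iter) x iterations / 3600, so the iteration count of a run can be estimated from its row. The sketch below does this for three rows copied from the tables; the function name is hypothetical and the result is only as precise as the rounded table values.

```python
def implied_iterations(total_hours, seconds_per_iter):
    """Estimate the number of SGD iterations from the two train time columns."""
    return total_hours * 3600.0 / seconds_per_iter


# (description, train time total in hr, train time in s/iter), from the tables above
rows = [
    ("R-50-FPN RetinaNet 1x, im/gpu 2", 12.1, 0.483),
    ("R-50-FPN RetinaNet 2x, im/gpu 2", 24.1, 0.482),
    ("R-50-C4 Fast 1x, im/gpu 1", 22.8, 0.456),
]
for name, hours, s_per_iter in rows:
    iters = implied_iterations(hours, s_per_iter)
    print(f"{name}: ~{iters / 1000:.0f}k iterations")
# R-50-FPN RetinaNet 1x, im/gpu 2: ~90k iterations
# R-50-FPN RetinaNet 2x, im/gpu 2: ~180k iterations
# R-50-C4 Fast 1x, im/gpu 1: ~180k iterations
```

The 2x run takes about twice as many iterations as the 1x run, and the 1x run at 1 image per GPU takes about as many as a 2x run at 2 images per GPU, which is consistent with the iteration adjustment described under Training Schedules.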
backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
X-152-32x8d-FPN-IN5k | Mask | s1x | 1 | 9.6 | 1.188 | 85.8 | 12.100 + 0.046 | 48.1 | 41.5 | - | - | 37129812 | model | boxes | masks |
[above without test-time aug.] | | | | | | | 0.325 + 0.018 | 45.2 | 39.7 | - | - | | |
Notes:
Common Settings for Keypoint Detection Baselines (That Differ from Boxes and Masks)
Our keypoint detection baselines differ from our box and mask baselines in a couple of details:
Models are trained only on images from coco_2014_train union coco_2014_valminusminival that contain at least one person with keypoint annotations (all other images are discarded from the training set).
Results are reported on the coco_2014_minival dataset.

backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-FPN | RPN | 1x | 2 | 6.4 | 0.391 | 9.8 | 0.082 | - | - | - | 64.0 | 35998996 | model | props: 1, 2, 3 |
R-101-FPN | RPN | 1x | 2 | 8.1 | 0.504 | 12.6 | 0.109 | - | - | - | 65.2 | 35999521 | model | props: 1, 2, 3 |
X-101-64x4d-FPN | RPN | 1x | 2 | 11.5 | 1.394 | 34.9 | 0.289 | - | - | - | 65.9 | 35999553 | model | props: 1, 2, 3 |
X-101-32x8d-FPN | RPN | 1x | 2 | 11.6 | 1.104 | 27.6 | 0.224 | - | - | - | 66.2 | 36760438 | model | props: 1, 2, 3 |
Notes:
Proposal download links ("props"): "1" is coco_2014_train; "2" is coco_2014_valminusminival; and "3" is coco_2014_minival. These include all images, not just the ones with valid keypoint annotations.

backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-FPN | Kps | 1x | 2 | 7.7 | 0.533 | 13.3 | 0.081 + 0.087 | 52.7 | - | 64.1 | - | 37651787 | model | boxes | kps |
R-50-FPN | Kps | s1x | 2 | 7.7 | 0.533 | 19.2 | 0.080 + 0.085 | 53.4 | - | 65.5 | - | 37651887 | model | boxes | kps |
R-101-FPN | Kps | 1x | 2 | 9.4 | 0.668 | 16.7 | 0.109 + 0.080 | 53.5 | - | 65.0 | - | 37651996 | model | boxes | kps |
R-101-FPN | Kps | s1x | 2 | 9.4 | 0.668 | 24.1 | 0.108 + 0.076 | 54.6 | - | 66.0 | - | 37652016 | model | boxes | kps |
X-101-64x4d-FPN | Kps | 1x | 2 | 12.8 | 1.477 | 36.9 | 0.288 + 0.077 | 55.8 | - | 66.7 | - | 37731079 | model | boxes | kps |
X-101-64x4d-FPN | Kps | s1x | 2 | 12.9 | 1.478 | 53.4 | 0.286 + 0.075 | 56.3 | - | 67.1 | - | 37731142 | model | boxes | kps |
X-101-32x8d-FPN | Kps | 1x | 2 | 12.9 | 1.215 | 30.4 | 0.219 + 0.084 | 55.4 | - | 66.2 | - | 37730253 | model | boxes | kps |
X-101-32x8d-FPN | Kps | s1x | 2 | 12.9 | 1.214 | 43.8 | 0.218 + 0.071 | 55.9 | - | 67.0 | - | 37731010 | model | boxes | kps |
Notes:
backbone | type | lr schd | im/ gpu | train mem (GB) | train time (s/iter) | train time total (hr) | inference time (s/im) | box AP | mask AP | kp AP | prop. AR | model id | download links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-FPN | Kps | 1x | 2 | 9.0 | 0.832 | 20.8 | 0.097 + 0.092 | 53.6 | - | 64.2 | - | 37697547 | model | boxes | kps |
R-50-FPN | Kps | s1x | 2 | 9.0 | 0.828 | 29.9 | 0.096 + 0.089 | 54.3 | - | 65.4 | - | 37697714 | model | boxes | kps |
R-101-FPN | Kps | 1x | 2 | 10.6 | 0.923 | 23.1 | 0.124 + 0.084 | 54.5 | - | 64.8 | - | 37697946 | model | boxes | kps |
R-101-FPN | Kps | s1x | 2 | 10.6 | 0.921 | 33.3 | 0.123 + 0.083 | 55.3 | - | 65.8 | - | 37698009 | model | boxes | kps |
X-101-64x4d-FPN | Kps | 1x | 2 | 14.1 | 1.655 | 41.4 | 0.302 + 0.079 | 56.3 | - | 66.0 | - | 37732355 | model | boxes | kps |
X-101-64x4d-FPN | Kps | s1x | 2 | 14.1 | 1.731 | 62.5 | 0.322 + 0.074 | 56.9 | - | 66.8 | - | 37732415 | model | boxes | kps |
X-101-32x8d-FPN | Kps | 1x | 2 | 14.2 | 1.410 | 35.3 | 0.235 + 0.080 | 56.0 | - | 66.0 | - | 37792158 | model | boxes | kps |
X-101-32x8d-FPN | Kps | s1x | 2 | 14.2 | 1.408 | 50.8 | 0.236 + 0.075 | 56.9 | - | 67.0 | - | 37732318 | model | boxes | kps |
Notes:
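As an illustration of the training-set rule stated under Common Settings for Keypoint Detection Baselines (only images with at least one person carrying keypoint annotations are kept), the sketch below filters a COCO-style annotation dictionary. It is not Detectron's own filtering code. The function name is hypothetical, and treating "has keypoint annotations" as num_keypoints > 0 is an assumption about the exact criterion.

```python
PERSON_CATEGORY_ID = 1  # COCO category id for "person"


def images_with_person_keypoints(coco):
    """Keep images that have at least one person annotation with labeled keypoints.

    `coco` is a dict in the COCO annotation layout, with "images" and
    "annotations" lists. The num_keypoints > 0 test is an assumed criterion.
    """
    keep_ids = {
        ann["image_id"]
        for ann in coco["annotations"]
        if ann.get("category_id") == PERSON_CATEGORY_ID
        and ann.get("num_keypoints", 0) > 0
    }
    return [img for img in coco["images"] if img["id"] in keep_ids]


# Toy data: image 1 has a person with keypoints, image 2 has a person
# without labeled keypoints, image 3 has only a non-person object.
toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "category_id": 1, "num_keypoints": 12},
        {"image_id": 2, "category_id": 1, "num_keypoints": 0},
        {"image_id": 3, "category_id": 18},
    ],
}
print([img["id"] for img in images_with_person_keypoints(toy)])  # [1]
```

Evaluation is not filtered this way; the note on the person proposal files above says they cover all images, not just the ones with valid keypoint annotations.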