Model Zoo
⚠️ For recent papers, along with their pre-trained models, training/evaluation recipes, and configuration files, please see the examples folder. We will update the model zoo periodically. ⚠️
This file contains the links to all the pre-trained models in CVNets and their configs:
Classification (ImageNet-1k)

| Model | Parameters | Top-1 | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| ViT-tiny | 5.7 M | 72.91 | | | |
| ResNet-34 | 21.8 M | 74.85 | | | |
| ResNet-50 | 25.6 M | 78.44 | | | |
| ResNet-101 | 44.5 M | 79.81 | | | |
| MobileNetv1-0.25 | 0.5 M | 54.45 | | | |
| MobileNetv1-0.5 | 1.3 M | 65.93 | | | |
| MobileNetv1-0.75 | 2.6 M | 71.44 | | | |
| MobileNetv1-1.00 | 4.2 M | 74.04 | | | |
| MobileNetv2-0.25 | 1.5 M | 53.57 | | | |
| MobileNetv2-0.5 | 2.0 M | 65.28 | | | |
| MobileNetv2-0.75 | 2.6 M | 70.42 | | | |
| MobileNetv2-1.00 | 3.5 M | 72.93 | | | |
| MobileNetv3-small | 2.5 M | 66.65 | | | |
| MobileNetv3-large | 5.4 M | 75.13 | | | |
| ResNet-34 (advanced recipe) | 21.8 M | 76.91 | | | |
| ResNet-50 (advanced recipe) | 25.6 M | 80.36 | | | |
| ResNet-101 (advanced recipe) | 44.5 M | 81.68 | | | |

MobileViTv1 (Legacy)
Note: These results are from CVNets v0.1. We dropped OpenCV support and switched to PIL in v0.2. For MobileViTv1 results, see v0.1.

| Model | Parameters | Top-1 | Pretrained weights | Config file |
|---|---|---|---|---|
| MobileViT-XXS | 1.3 M | 69.0 | | |
| MobileViT-XS | 2.3 M | 74.7 | | |
| MobileViT-S | 5.6 M | 78.3 | | |

MobileViTv2 (256x256)
| Model | Parameters | Top-1 | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| MobileViTv2-0.5 | 1.4 M | 70.18 | | | |
| MobileViTv2-0.75 | 2.9 M | 75.56 | | | |
| MobileViTv2-1.0 | 4.9 M | 78.09 | | | |
| MobileViTv2-1.25 | 7.5 M | 79.65 | | | |
| MobileViTv2-1.5 | 10.6 M | 80.38 | | | |
| MobileViTv2-1.75 | 14.3 M | 80.84 | | | |
| MobileViTv2-2.0 | 18.4 M | 81.17 | | | |

MobileViTv2 (Trained on 256x256 and Finetuned on 384x384)
| Model | Parameters | Top-1 | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| MobileViTv2-0.5 | 1.4 M | 72.14 | | | |
| MobileViTv2-0.75 | 2.9 M | 76.98 | | | |
| MobileViTv2-1.0 | 4.9 M | 79.68 | | | |
| MobileViTv2-1.25 | 7.5 M | 80.94 | | | |
| MobileViTv2-1.5 | 10.6 M | 81.50 | | | |
| MobileViTv2-1.75 | 14.3 M | 82.04 | | | |
| MobileViTv2-2.0 | 18.4 M | 82.17 | | | |

MobileViTv2 (Trained on ImageNet-21k and Finetuned on ImageNet-1k 256x256)
| Model | Parameters | Top-1 | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| MobileViTv2-1.5 | 10.6 M | 81.46 | | | |
| MobileViTv2-1.75 | 14.3 M | 81.94 | | | |
| MobileViTv2-2.0 | 18.4 M | 82.36 | | | |

MobileViTv2 (Trained on ImageNet-21k, Finetuned on ImageNet-1k 256x256, and Finetuned on ImageNet-1k 384x384)
| Model | Parameters | Top-1 | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| MobileViTv2-1.5 | 10.6 M | 82.60 | | | |
| MobileViTv2-1.75 | 14.3 M | 82.93 | | | |
| MobileViTv2-2.0 | 18.4 M | 83.41 | | | |

Object Detection (MS-COCO)
| Model | Parameters | mAP | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| SSD ResNet-50 | 28.5 M | 29.98 | | | |
| SSD MobileViTv2-0.5 | 2.0 M | 21.24 | | | |
| SSD MobileViTv2-0.75 | 3.6 M | 24.57 | | | |
| SSD MobileViTv2-1.0 | 5.6 M | 26.47 | | | |
| SSD MobileViTv2-1.25 | 8.2 M | 27.85 | | | |
| SSD MobileViTv2-1.5 | 11.3 M | 28.83 | | | |
| SSD MobileViTv2-1.75 | 14.9 M | 29.52 | | | |
| SSD MobileViTv2-2.0 | 19.1 M | 30.21 | | | |

Segmentation
Note: The number of parameters reported does not include the auxiliary branches.
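As a rough illustration of the note above, the sketch below counts parameters while skipping auxiliary branches. It is not taken from CVNets: the helper names `count_parameters` and `format_millions` are hypothetical, and the assumption that auxiliary branches can be identified by the substring `"aux"` in the parameter name is ours, so adjust the filter to match the names in your model. The function accepts any iterable of `(name, parameter)` pairs where each parameter has a `numel()` method, which is what `torch.nn.Module.named_parameters()` yields.

```python
def count_parameters(named_params, exclude_substrings=("aux",)):
    """Sum parameter counts, skipping names that contain an excluded substring.

    named_params: iterable of (name, param) pairs, where param.numel() returns
    the number of elements (e.g. model.named_parameters() in PyTorch).
    exclude_substrings: ASSUMPTION, not a CVNets convention; auxiliary
    branches are matched by name here purely for illustration.
    """
    total = 0
    for name, param in named_params:
        if any(sub in name for sub in exclude_substrings):
            continue  # skip parameters that belong to an auxiliary branch
        total += param.numel()
    return total


def format_millions(count):
    """Format a raw parameter count in the 'X.Y M' style used in the tables."""
    return f"{count / 1e6:.1f} M"


# Example usage with a PyTorch model (hypothetical `model` object):
#   n = count_parameters(model.named_parameters())
#   print(format_millions(n))
```

Passing `exclude_substrings=()` counts every parameter, which makes it easy to see how much the auxiliary branches contribute.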
ADE20K Dataset
| Model | Parameters | mIoU | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| DeepLabv3 MobileNetv2 | 8.0 M | 35.20 | | | |
| PSPNet MobileViTv2-0.5 | 3.6 M | 31.77 | | | |
| PSPNet MobileViTv2-0.75 | 6.2 M | 35.22 | | | |
| PSPNet MobileViTv2-1.0 | 9.4 M | 36.57 | | | |
| PSPNet MobileViTv2-1.25 | 13.2 M | 38.76 | | | |
| PSPNet MobileViTv2-1.5 | 17.6 M | 38.74 | | | |
| PSPNet MobileViTv2-1.75 | 22.5 M | 39.82 | | | |
| DeepLabv3 MobileViTv2-0.5 | 6.3 M | 31.93 | | | |
| DeepLabv3 MobileViTv2-0.75 | 9.6 M | 34.70 | | | |
| DeepLabv3 MobileViTv2-1.0 | 13.4 M | 37.06 | | | |
| DeepLabv3 MobileViTv2-1.25 | 17.7 M | 38.42 | | | |
| DeepLabv3 MobileViTv2-1.5 | 22.6 M | 38.91 | | | |
| DeepLabv3 MobileViTv2-1.75 | 28.1 M | 39.53 | | | |
| DeepLabv3 MobileViTv2-2.0 | 34.0 M | 40.94 | | | |

Pascal VOC 2012 Dataset
| Model | Parameters | mIoU | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| DeepLabv3 MobileViTv1 | 8.5 M | 79.44 | | | |
| PSPNet MobileViTv2-0.5 | 3.6 M | 74.62 | | | |
| PSPNet MobileViTv2-0.75 | 6.2 M | 77.44 | | | |
| PSPNet MobileViTv2-1.0 | 9.4 M | 78.92 | | | |
| PSPNet MobileViTv2-1.25 | 13.2 M | 79.40 | | | |
| PSPNet MobileViTv2-1.5 | 17.5 M | 79.93 | | | |
| DeepLabv3 MobileViTv2-0.5 | 6.2 M | 75.07 | | | |
| DeepLabv3 MobileViTv2-1.0 | 13.3 M | 78.94 | | | |
| DeepLabv3 MobileViTv2-1.25 | 17.7 M | 79.68 | | | |
| DeepLabv3 MobileViTv2-1.5 | 22.6 M | 80.30 | | | |

Video Classification (Kinetics-400)
| Model | Parameters | Top-1 | Pretrained weights | Config file | Logs |
|---|---|---|---|---|---|
| MobileViTv1-small-SpatioTemporal | 5.2 M | 68.38 | | | |