Config Files: Introduction and Walkthrough
Config files in CVNet are YAML files stored under config/<task> directories.
They contain the hyper-parameters used for training/validating the respective models.
Let us take a step-by-step look at config/classification/imagenet/resnet.yaml, which is used to train a ResNet-50 model on the ImageNet-1k dataset using a single node with 8 A100 GPUs:
Dataset
The configs under dataset define which dataset to train on (dataset.name) and where the data is located on disk (dataset.root_train and dataset.root_val).
They also specify the train/val batch sizes, the number of data-loading workers, and how GPU memory transfers are handled (pin_memory). Note that the effective batch size is train_batch_size0 * num_gpus * gradient accumulation frequency.
dataset:
  root_train: "/mnt/imagenet/training"
  root_val: "/mnt/imagenet/validation"
  name: "imagenet"
  category: "classification"
  train_batch_size0: 128
  val_batch_size0: 100
  eval_batch_size0: 100
  workers: 8
  persistent_workers: true
  pin_memory: true
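As a quick sanity check on the effective batch size formula above, the snippet below loads a config like this one with PyYAML and multiplies the pieces together. The num_gpus and grad_accum_freq values are assumptions for illustration (8 GPUs as in this recipe, no gradient accumulation); they are not read from this dataset block.

import yaml  # PyYAML

with open("config/classification/imagenet/resnet.yaml") as f:
    cfg = yaml.safe_load(f)

per_gpu_batch = cfg["dataset"]["train_batch_size0"]  # 128 in this recipe
num_gpus = 8          # assumption: single node with 8 A100 GPUs
grad_accum_freq = 1   # assumption: no gradient accumulation

effective_batch = per_gpu_batch * num_gpus * grad_accum_freq
print(effective_batch)  # 128 * 8 * 1 = 1024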
Data augmentation
The image_augmentation configs define the data augmentations to use during training. In the example below, we use Inception-style augmentation (random resized crop and random horizontal flip).
For an advanced image augmentation example, see ResNet-50's advanced recipe.
image_augmentation:
  random_resized_crop:
    enable: true
    interpolation: "bicubic"
  random_horizontal_flip:
    enable: true
  resize:
    enable: true
    size: 256 # shorter side is 256
    interpolation: "bicubic"
  center_crop:
    enable: true
    size: 224
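For readers more familiar with torchvision, the config above corresponds roughly to the transforms below. This is an illustrative equivalent, not CVNet's own transform classes, and the split into a train pipeline (random resized crop + flip) and an eval pipeline (resize + center crop) follows the usual convention and is an assumption here.

from torchvision import transforms
from torchvision.transforms import InterpolationMode

# Training-time "Inception-style" augmentation: random resized crop + horizontal flip.
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224, interpolation=InterpolationMode.BICUBIC),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Evaluation-time transforms: resize the shorter side to 256, then center-crop to 224.
eval_transforms = transforms.Compose([
    transforms.Resize(256, interpolation=InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])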
Sampler
The sampler configs define which data sampler to use as well as the crop width and height. In this example, we train ResNet-50 with the variably-sized batch sampler introduced in MobileViT.
sampler:
  name: "variable_batch_sampler"
  vbs:
    crop_size_width: 224
    crop_size_height: 224
    max_n_scales: 5
    min_crop_size_width: 128
    max_crop_size_width: 320
    min_crop_size_height: 128
    max_crop_size_height: 320
    check_scale: 32
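Conceptually, the variable batch sampler draws spatial resolutions between the min and max crop sizes (snapped to multiples of check_scale) and, for each resolution, rescales the batch size so that the pixel budget per batch stays roughly constant. The sketch below is a simplified illustration of that idea under those assumptions, not the actual CVNet implementation.

import random

def candidate_batches(base_batch=128, base_hw=(224, 224),
                      min_hw=(128, 128), max_hw=(320, 320),
                      check_scale=32, n_scales=5):
    """Sketch: pick n_scales resolutions (multiples of check_scale) and scale
    the batch size inversely with the number of pixels per image."""
    heights = list(range(min_hw[0], max_hw[0] + 1, check_scale))
    widths = list(range(min_hw[1], max_hw[1] + 1, check_scale))
    base_pixels = base_hw[0] * base_hw[1]
    batches = []
    for _ in range(n_scales):
        h, w = random.choice(heights), random.choice(widths)
        batch_size = max(1, int(round(base_batch * base_pixels / (h * w))))
        batches.append((h, w, batch_size))
    return batches

print(candidate_batches())  # list of (height, width, batch_size) tuples, one per scale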
Optimizer and LR scheduler
The optim and scheduler configs define the optimizer and LR scheduler hyper-parameters. Here we use SGD with a cosine learning-rate schedule and warm-up.
optim:
  name: "sgd"
  weight_decay: 1.e-4
  no_decay_bn_filter_bias: true
  sgd:
    momentum: 0.9
scheduler:
  name: "cosine"
  is_iteration_based: false
  max_epochs: 150
  warmup_iterations: 7500
  warmup_init_lr: 0.05
  cosine:
    max_lr: 0.4
    min_lr: 2.e-4
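The resulting schedule ramps the learning rate linearly from warmup_init_lr to max_lr over the warm-up iterations, then follows a cosine curve from max_lr down to min_lr over the training epochs. A minimal sketch of that shape is below; the exact interpolation CVNet uses may differ slightly, so treat this as an approximation of the schedule rather than the library's formula.

import math

def lr_at(step, epoch, max_epochs=150, warmup_iterations=7500,
          warmup_init_lr=0.05, max_lr=0.4, min_lr=2e-4):
    """Approximate warm-up + cosine schedule (epoch-based annealing)."""
    if step < warmup_iterations:
        # Linear warm-up from warmup_init_lr to max_lr.
        return warmup_init_lr + (max_lr - warmup_init_lr) * step / warmup_iterations
    # Cosine annealing from max_lr down to min_lr over the training epochs.
    progress = min(epoch / max_epochs, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))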
Model
model defines the model type as well as the model hyper-parameters. Here we use a ResNet-50 model for a classification task.
model:
  classification:
    name: "resnet"
    activation:
      name: "relu"
    resnet:
      depth: 50
  normalization:
    name: "batch_norm"
    momentum: 0.1
  activation:
    name: "relu"
    inplace: true
  layer:
    global_pool: "mean"
    conv_init: "kaiming_normal"
    linear_init: "normal"
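The conv_init and linear_init options control weight initialization. As a rough illustration of what those names usually correspond to in PyTorch, a loop like the one below applies Kaiming-normal initialization to convolutions and a normal initialization to linear layers; the mode, nonlinearity, and std values here are assumptions rather than CVNet's exact settings.

import torch.nn as nn

def init_weights(model: nn.Module) -> None:
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            # conv_init: "kaiming_normal"
            nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        elif isinstance(m, nn.Linear):
            # linear_init: "normal"
            nn.init.normal_(m.weight, mean=0.0, std=0.01)
            if m.bias is not None:
                nn.init.zeros_(m.bias)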
EMA and Training statistics
CVNet allows you to keep an exponential moving average (EMA) of the model weights by simply setting ema.enable to true. Last but not least, stats defines which metrics to compute and report for the model. The best model checkpoint is kept based on its checkpoint_metric value.
ema:
  enable: true
  momentum: 0.0005
stats:
  val: [ "loss", "top1", "top5" ]
  train: [ "loss" ]
  checkpoint_metric: "top1"
  checkpoint_metric_max: true
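With ema.momentum set to 0.0005, the EMA weights drift slowly toward the current model weights after every update. A minimal sketch of the usual update rule is shown below, assuming the common convention where momentum is the per-step weight given to the current parameters (equivalent to a decay of 0.9995); CVNet's exact implementation may differ in its details.

import copy
import torch

@torch.no_grad()
def update_ema(ema_model: torch.nn.Module, model: torch.nn.Module, momentum: float = 0.0005):
    """EMA update: ema = (1 - momentum) * ema + momentum * current (assumed convention)."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(1.0 - momentum).add_(p, alpha=momentum)

# Typical usage: keep a separate copy of the model and update it after every optimizer step.
model = torch.nn.Linear(8, 4)   # stand-in for the actual ResNet-50
ema_model = copy.deepcopy(model)
update_ema(ema_model, model)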