Data Samplers
CVNet offers data samplers with three sampling strategies:
Single-scale with fixed batch size (SSc-FBS)
Multi-scale with fixed batch size (MSc-FBS)
Multi-scale with variable batch size (MSc-VBS)
For details about these samplers, please see the MobileViT paper.
Single-scale with fixed batch size (SSc-FBS)
This method is the default sampling strategy in most deep learning frameworks (e.g., PyTorch, TensorFlow, and MXNet) and in libraries built on top of them (e.g., the timm library). At the \(t\)-th training iteration, this method samples a batch of \(b\) images per GPU [1] at a pre-defined spatial resolution of height \(H\) and width \(W\).
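As a minimal sketch, SSc-FBS can be illustrated with a simple batch generator (the function and argument names below are illustrative, not part of CVNet's API):

```python
import random

def ssc_fbs_batches(n_samples, batch_size, height, width, seed=0):
    """Single-scale, fixed-batch-size sampling: every batch uses the
    same (height, width) and the same batch size."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    for i in range(0, len(indices), batch_size):
        # Every batch: fixed resolution, fixed (up to the last) batch size.
        yield (height, width, indices[i:i + batch_size])

batches = list(ssc_fbs_batches(n_samples=10, batch_size=4, height=224, width=224))
# All batches share the spatial resolution (224, 224).
```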
Multi-scale with fixed batch size (MSc-FBS)
The SSc-FBS method allows a network to learn representations at a single scale (or resolution). However, objects in the real world appear at different scales. To allow a network to learn representations at multiple scales, MSc-FBS extends SSc-FBS to multiple scales. Unlike the SSc-FBS method, which takes a pre-defined spatial resolution as an input, this method takes a sorted set of \(n\) spatial resolutions \(\mathcal{S} = \{ (H_1, W_1), (H_2, W_2), \cdots, (H_n, W_n)\}\) as an input. At the \(t\)-th iteration, this method randomly samples \(b\) images per GPU at a spatial resolution \((H_t, W_t) \in \mathcal{S}\).
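Sampling a random resolution per batch can be sketched as follows (illustrative code, not CVNet's implementation):

```python
import random

def msc_fbs_batches(n_samples, batch_size, resolutions, seed=0):
    """Multi-scale, fixed batch size: the batch size never changes,
    but each batch draws its spatial resolution from `resolutions`."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    for i in range(0, len(indices), batch_size):
        h, w = rng.choice(resolutions)  # random (H_t, W_t) from S
        yield (h, w, indices[i:i + batch_size])

scales = [(128, 128), (192, 192), (256, 256)]
batches = list(msc_fbs_batches(n_samples=8, batch_size=4, resolutions=scales))
```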
Multi-scale with variable batch size (MSc-VBS)
Networks trained using the MSc-FBS method are more robust to scale changes than those trained with SSc-FBS. However, depending on the maximum spatial resolution in \(\mathcal{S}\), MSc-FBS may have a higher peak GPU memory utilization than SSc-FBS (see the sampler performance-cost figure), causing out-of-memory errors on GPUs with limited memory. For example, MSc-FBS with \(\mathcal{S} = \{ (128, 128), (192, 192), (224, 224), (320, 320)\}\) and \(b=256\) would need about \(2\times\) more GPU memory (for images only) than SSc-FBS with a spatial resolution of \((224, 224)\) and \(b=256\). To address this memory issue, we extend MSc-FBS to variable batch sizes. For a given sorted set of spatial resolutions \(\mathcal{S} = \{ (H_1, W_1), (H_2, W_2), \cdots, (H_n, W_n)\}\) and a batch size \(b\) at the maximum spatial resolution \((H_n, W_n)\), a spatial resolution \((H_t, W_t) \in \mathcal{S}\) with a batch size of \(b_t = \frac{H_n W_n b}{H_t W_t}\) is sampled randomly at the \(t\)-th training iteration on each GPU.
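The scaling rule \(b_t = \frac{H_n W_n b}{H_t W_t}\) keeps the per-batch pixel count roughly constant: smaller resolutions get proportionally larger batches. A small illustration (the helper name is ours, not CVNet's; integer division is used, so the result is a floor of the exact formula):

```python
def vbs_batch_size(base_batch_size, max_res, res):
    """b_t = (H_n * W_n * b) / (H_t * W_t): scale the batch size
    inversely with pixel count so memory use stays roughly flat."""
    h_n, w_n = max_res
    h_t, w_t = res
    return max(1, (h_n * w_n * base_batch_size) // (h_t * w_t))

scales = [(128, 128), (192, 192), (224, 224), (320, 320)]
sizes = {res: vbs_batch_size(256, (320, 320), res) for res in scales}
# sizes: {(128, 128): 1600, (192, 192): 711, (224, 224): 522, (320, 320): 256}
```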
Variably-sized video sampler
These samplers can also be extended to videos. CVNet provides a variably-sized sampler for videos, wherein researchers can control different video-related input variables (e.g., the number of frames, the number of clips per video, and the spatial resolution) for learning space- and time-invariant representations.
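For videos, the same per-batch pixel-budget idea can be extended to the temporal dimension. A sketch (all names illustrative, not CVNet's API) that also scales the batch size by the number of sampled frames:

```python
import random

def video_batches(n_samples, base_batch_size, max_res, resolutions,
                  frame_counts, seed=0):
    """Variably-sized video sampling sketch: each batch draws a spatial
    resolution and a frame count, then scales the batch size so that
    batch * frames * H * W stays roughly constant."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    h_n, w_n = max_res
    f_max = max(frame_counts)
    i = 0
    while i < len(indices):
        h, w = rng.choice(resolutions)
        f = rng.choice(frame_counts)
        # Fewer pixels per clip -> proportionally more clips per batch.
        b = max(1, (h_n * w_n * f_max * base_batch_size) // (h * w * f))
        yield (f, h, w, indices[i:i + b])
        i += b

batches = list(video_batches(n_samples=64, base_batch_size=2,
                             max_res=(224, 224),
                             resolutions=[(112, 112), (224, 224)],
                             frame_counts=[8, 16]))
```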
Data Sampler Objects
- class data.sampler.batch_sampler.BatchSampler(opts, n_data_samples: int, is_training: bool = False, *args, **kwargs)[source]
Bases:
BaseSampler
Standard batch sampler for data-parallel training. This sampler yields batches of a fixed batch size and spatial resolution.
- Parameters:
opts – command-line arguments
n_data_samples – Number of samples in the dataset
is_training – Training or validation mode. Default: False
- class data.sampler.batch_sampler.BatchSamplerDDP(opts, n_data_samples: int, is_training: bool = False, *args, **kwargs)[source]
Bases:
BaseSamplerDDP
DDP variant of BatchSampler
- Parameters:
opts – command-line arguments
n_data_samples – Number of samples in the dataset
is_training – Training or validation mode. Default: False
- class data.sampler.multi_scale_sampler.MultiScaleSampler(opts, n_data_samples: int, is_training: bool = False, *args, **kwargs)[source]
Bases:
BaseSampler
Multi-scale batch sampler for data-parallel training. This sampler yields batches of a fixed batch size, but each batch has a different spatial resolution.
- Parameters:
opts – command-line arguments
n_data_samples – Number of samples in the dataset
is_training – Training or validation mode. Default: False
- class data.sampler.multi_scale_sampler.MultiScaleSamplerDDP(opts: Namespace, n_data_samples: int, is_training: bool = False, *args, **kwargs)[source]
Bases:
BaseSamplerDDP
DDP version of MultiScaleSampler
- Parameters:
opts – command-line arguments
n_data_samples – Number of samples in the dataset
is_training – Training or validation mode. Default: False
- class data.sampler.variable_batch_sampler.VariableBatchSampler(opts: Namespace, n_data_samples: int, is_training: bool = False, *args, **kwargs)[source]
Bases:
BaseSampler
Variably-sized multi-scale batch sampler (https://arxiv.org/abs/2110.02178) for data-parallel training. This sampler yields batches with variable spatial resolutions and batch sizes.
- Parameters:
opts – command-line arguments
n_data_samples – Number of samples in the dataset
is_training – Training or validation mode. Default: False
- __init__(opts: Namespace, n_data_samples: int, is_training: bool = False, *args, **kwargs) None [source]
- class data.sampler.variable_batch_sampler.VariableBatchSamplerDDP(opts: Namespace, n_data_samples: int, is_training: bool = False, *args, **kwargs)[source]
Bases:
BaseSamplerDDP
DDP version of VariableBatchSampler
- Parameters:
opts – command-line arguments
n_data_samples – Number of samples in the dataset
is_training – Training or validation mode. Default: False
- __init__(opts: Namespace, n_data_samples: int, is_training: bool = False, *args, **kwargs) None [source]