cvnets.models.detection package

Subpackages

cvnets.models.detection.utils package

Submodules

cvnets.models.detection.base_detection module

class cvnets.models.detection.base_detection.BaseDetection(opts: Namespace, encoder: BaseImageEncoder, *args, **kwargs)

Bases: BaseAnyNNModel

Base class for the task of object detection

Parameters:

opts – Command-line arguments
encoder – Image-encoder model (e.g., MobileNet or ResNet)

__init__(opts: Namespace, encoder: BaseImageEncoder, *args, **kwargs) → None: Initializes internal Module state, shared by both nn.Module and ScriptModule.

classmethod add_arguments(parser: ArgumentParser) → ArgumentParser: Add model-specific arguments.
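
As a rough illustration of the add_arguments pattern (a classmethod receives an ArgumentParser, registers model-specific options, and returns the same parser), the sketch below uses only the standard argparse module. The class name ToyDetector and the flag --model.detection.toy.n-classes are hypothetical and are not taken from cvnets.

```python
import argparse


class ToyDetector:
    """Hypothetical stand-in for a detection model class (not part of cvnets)."""

    @classmethod
    def add_arguments(cls, parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
        # Group the model-specific options under the class name for readable --help output.
        group = parser.add_argument_group(title=cls.__name__)
        # Hypothetical flag, used only for illustration.
        group.add_argument(
            "--model.detection.toy.n-classes",
            type=int,
            default=80,
            help="Number of object classes (illustrative option).",
        )
        return parser


parser = ToyDetector.add_arguments(argparse.ArgumentParser())
opts = parser.parse_args(["--model.detection.toy.n-classes", "21"])
# argparse derives the attribute name from the flag, replacing "-" with "_".
n_classes = getattr(opts, "model.detection.toy.n_classes")
```

Returning the parser lets several classes register their options in a chain before the command line is parsed once.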

static reset_layer_parameters(layer: Module, opts: Namespace) → None: Initialize weights of a given layer

classmethod build_model(opts: Namespace, *args, **kwargs) → BaseAnyNNModel

Build a model from command-line arguments. Sub-classes must implement this method.

Parameters:

opts – Command-line arguments

Note

This function is typically implemented in the base class for each task, and the implementation is reused by all models in that task.

cvnets.models.detection.base_detection.check_feature_map_output_channels(config: Dict, layer_name: str) → int

cvnets.models.detection.mask_rcnn module

class cvnets.models.detection.mask_rcnn.MaskRCNNEncoder(opts: Namespace, encoder: BaseImageEncoder, output_strides: List, projection_channels: int, encoder_lr_multiplier: float | None = 1.0, *args, **kwargs)

Bases: Module

__init__(opts: Namespace, encoder: BaseImageEncoder, output_strides: List, projection_channels: int, encoder_lr_multiplier: float | None = 1.0, *args, **kwargs) → None: Initializes internal Module state, shared by both nn.Module and ScriptModule.

get_augmented_tensor() → Tensor

forward(x: Tensor) → Dict[str, Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_trainable_parameters(weight_decay: float = 0.0, no_decay_bn_filter_bias: bool = False, *args, **kwargs) → Tuple[List, List]

class cvnets.models.detection.mask_rcnn.MaskRCNNDetector(opts, encoder: BaseImageEncoder, *args, **kwargs)

Bases: BaseDetection

This class implements a Mask R-CNN-style object detector (https://arxiv.org/abs/1703.06870).

Parameters:

opts – command-line arguments
encoder (BaseImageEncoder) – Encoder network (e.g., ResNet or MobileViT)

__init__(opts, encoder: BaseImageEncoder, *args, **kwargs) → None: Initializes internal Module state, shared by both nn.Module and ScriptModule.

update_layer_norm_eps()

set_norm_layer_opts()

reset_norm_layer_opts(default_norm)

classmethod add_arguments(parser: ArgumentParser) → ArgumentParser: Add model-specific arguments.

reset_generalized_rcnn_transform(height, width)

get_trainable_parameters(weight_decay: float = 0.0, no_decay_bn_filter_bias: bool = False, *args, **kwargs) → Tuple[List, List]

Get parameters for training along with the learning rate.

Parameters:

weight_decay – weight decay
no_decay_bn_filter_bias – Do not decay BN and biases. Defaults to False.

Returns:

A tuple of length 2. The first entry is a list of dictionaries, each with three keys (params, weight_decay, param_names). The second entry is a list of floats containing the learning rate for each parameter.

Note

kwargs may contain module_name. To avoid passing multiple arguments with the same name, it is popped from kwargs and concatenated with the encoder or head name.
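
To make the documented return value concrete, here is a minimal sketch of the two-element tuple, written in plain Python without PyTorch. The helper split_parameters, the string placeholders standing in for parameter tensors, the suffix rule used to detect biases, and the choice of one learning-rate value per parameter group are all assumptions made for illustration; the actual cvnets logic may differ.

```python
from typing import Dict, List, Tuple


def split_parameters(
    named_params: Dict[str, object],
    weight_decay: float = 0.0,
    no_decay_bn_filter_bias: bool = False,
    lr_multiplier: float = 1.0,
    module_name: str = "",
) -> Tuple[List[dict], List[float]]:
    """Hypothetical helper mimicking the documented return structure."""
    decay, decay_names = [], []
    no_decay, no_decay_names = [], []
    for name, param in named_params.items():
        # Prefix each parameter name with the owning module (e.g. "encoder.").
        full_name = module_name + name
        # Assumption: bias parameters are recognised by their name suffix.
        if no_decay_bn_filter_bias and name.endswith("bias"):
            no_decay.append(param)
            no_decay_names.append(full_name)
        else:
            decay.append(param)
            decay_names.append(full_name)

    param_groups = [
        {"params": decay, "weight_decay": weight_decay, "param_names": decay_names}
    ]
    if no_decay:
        param_groups.append(
            {"params": no_decay, "weight_decay": 0.0, "param_names": no_decay_names}
        )
    # Assumption: one learning-rate multiplier per parameter group.
    lr_multipliers = [lr_multiplier] * len(param_groups)
    return param_groups, lr_multipliers


groups, lr_mults = split_parameters(
    {"conv.weight": "w0", "conv.bias": "b0"},
    weight_decay=1e-4,
    no_decay_bn_filter_bias=True,
    module_name="encoder.",
)
```

Each dictionary in the first list has the shape that PyTorch optimizers accept as a parameter group, with param_names carried along for logging.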

forward(x: Dict, *args, **kwargs) → Tuple[Tensor, ...] | Tuple[Any, ...] | Dict: Implement the model-specific forward function in sub-classes.

predict(x: Tensor, *args, **kwargs) → DetectionPredTuple: Predict the bounding boxes given an image tensor

dummy_input_and_label(batch_size: int) → Dict: Create dummy input and labels for CI/CD purposes.

cvnets.models.detection.ssd module

class cvnets.models.detection.ssd.SingleShotMaskDetector(opts, encoder: BaseImageEncoder)

Bases: BaseDetection

This class implements a Single Shot Object Detector

Parameters:

opts – command-line arguments
encoder (BaseImageEncoder) – Encoder network (e.g., ResNet or MobileViT)

coordinates = 4

__init__(opts, encoder: BaseImageEncoder) → None: Initializes internal Module state, shared by both nn.Module and ScriptModule.

classmethod add_arguments(parser: ArgumentParser) → ArgumentParser: Add model-specific arguments.

static reset_layers(module) → None

static process_anchors_ar(anchor_ar: List) → List

get_backbone_features(x: Tensor) → Dict[str, Tensor]

ssd_forward(end_points: Dict[str, Tensor], device: device | None = device(type='cpu'), *args, **kwargs) → Tuple[Tensor, Tensor, Tensor] | Tuple[Tensor, ...]

forward(x: Tensor | Dict) → Tuple[Tensor, ...] | Tuple[Any, ...] | Dict: Implement the model-specific forward function in sub-classes.

predict(x: Tensor, *args, **kwargs) → DetectionPredTuple: Predict the bounding boxes given an image tensor

postprocess_detections(boxes: Tensor, scores: Tensor) → List[DetectionPredTuple]: Post-process detections, including non-maximum suppression (NMS).
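
The NMS step mentioned for postprocess_detections can be sketched in plain Python as a greedy, single-class procedure. The function names iou and nms, the (x1, y1, x2, y2) box layout, and the 0.5 threshold are assumptions for illustration; this is not the cvnets implementation, which operates on tensors.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of 81/119, which is above the threshold, so only the first and third boxes are kept.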

dummy_input_and_label(batch_size: int) → Dict: Create dummy input and labels for CI/CD purposes.

Module contents

class cvnets.models.detection.DetectionPredTuple(labels, scores, boxes, masks)

Bases: tuple

boxes: Alias for field number 2

labels: Alias for field number 0

masks: Alias for field number 3

scores: Alias for field number 1