cvnets.image_projection_layers package

Submodules

cvnets.image_projection_layers.attention_pool_2d module

class cvnets.image_projection_layers.attention_pool_2d.AttentionPool2dHead(opts, in_dim: int, out_dim: int, *args, **kwargs)

Bases: BaseImageProjectionHead

This class implements an attention pooling layer, as described in CLIP. It should be used with CNN-style models, including MobileViTs.
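As a rough illustration of the idea, the sketch below pools a flattened feature map with attention, using the mean over spatial positions as the query, in the spirit of CLIP's attention pooling. It is dependency-free Python rather than the library's PyTorch code, and it assumes a single head, identity query/key/value projections, and no positional embedding. The names `softmax`, `attention_pool`, and `feature_map` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tokens: List[Vector]) -> Vector:
    """Pool N spatial tokens (each of size C) into a single C-sized vector.

    Assumption: identity query/key/value projections and no positional
    embedding, so only the pooling step itself is shown.
    """
    num_tokens, dim = len(tokens), len(tokens[0])
    # The query is the mean over all spatial positions.
    query = [sum(tok[c] for tok in tokens) / num_tokens for c in range(dim)]
    # Scaled dot-product score between the query and every token.
    scores = [
        sum(q * k for q, k in zip(query, tok)) / math.sqrt(dim) for tok in tokens
    ]
    weights = softmax(scores)
    # The output is the attention-weighted sum of the tokens.
    return [sum(w * tok[c] for w, tok in zip(weights, tokens)) for c in range(dim)]


# A 2x2 feature map with C=2 channels, flattened to four tokens.
feature_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
pooled = attention_pool(feature_map)
```

Unlike plain average pooling, the weights depend on the content of each position, so positions that agree with the global mean contribute more to the pooled vector.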

__init__(opts, in_dim: int, out_dim: int, *args, **kwargs) → None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

classmethod add_arguments(parser: ArgumentParser)

Add model-specific arguments.

reset_parameters()

Reset weights of a given layer

forward(x: Tensor, *args, **kwargs) → Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
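The difference between calling the instance and calling `forward()` directly can be sketched with a toy stand-in for `nn.Module`. `ToyModule` below is hypothetical and far simpler than PyTorch's real dispatch logic; it only mimics the one behaviour the note describes, namely that hooks run from `__call__` and not from `forward`.

```python
from typing import Any, Callable, List


class ToyModule:
    """Toy stand-in for torch.nn.Module; hypothetical, for illustration only."""

    def __init__(self) -> None:
        self._forward_hooks: List[Callable[["ToyModule", Any, Any], None]] = []

    def register_forward_hook(
        self, hook: Callable[["ToyModule", Any, Any], None]
    ) -> None:
        self._forward_hooks.append(hook)

    def forward(self, x: float) -> float:
        # The "recipe" for the computation lives here.
        return 2.0 * x

    def __call__(self, x: float) -> float:
        # Calling the instance runs forward() and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


seen: List[float] = []
module = ToyModule()
module.register_forward_hook(lambda mod, inp, out: seen.append(out))

module(3.0)          # hooks run: the output is recorded in `seen`
module.forward(3.0)  # hooks are skipped: nothing is recorded
```

After both calls, `seen` holds one entry, because only the call through the instance triggered the hook.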

cvnets.image_projection_layers.base_image_projection module

class cvnets.image_projection_layers.base_image_projection.BaseImageProjectionHead(opts, *args, **kwargs)

Bases: Module

Base class that projects image representations to the same space as text representations
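To see why a shared space matters, the sketch below projects an image feature to the dimensionality of a text embedding so that the two can be compared with cosine similarity. It is a dependency-free illustration under stated assumptions, not the library's code: the functions `project` and `cosine_similarity`, the weight matrix, and all values are hypothetical.

```python
import math
from typing import List


def project(features: List[float], weight: List[List[float]]) -> List[float]:
    """Map an in_dim vector to out_dim with an (in_dim x out_dim) matrix."""
    out_dim = len(weight[0])
    return [sum(f * row[j] for f, row in zip(features, weight)) for j in range(out_dim)]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical values: a 3-dim image feature and a 2-dim text embedding.
image_feature = [1.0, 2.0, 3.0]
text_embedding = [1.0, 1.0]
weight = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # in_dim=3 -> out_dim=2

# After projection the image embedding has the same size as the text embedding.
image_embedding = project(image_feature, weight)
similarity = cosine_similarity(image_embedding, text_embedding)
```

Without the projection step the two vectors would have different lengths and no similarity could be computed between them.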

__init__(opts, *args, **kwargs) → None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

classmethod add_arguments(parser: ArgumentParser)

Add model-specific arguments.

reset_parameters() → None

Reset weights of a given layer

get_trainable_parameters(weight_decay: float | None = 0.0, no_decay_bn_filter_bias: bool | None = False, *args, **kwargs)

forward(input: Dict, *args, **kwargs) → Dict

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

cvnets.image_projection_layers.base_image_projection.get_in_feature_dimension(image_classifier: Module) → int

Return the input feature dimension to the image classification head.

cvnets.image_projection_layers.global_pool_2d module

class cvnets.image_projection_layers.global_pool_2d.GlobalPool2D(opts, in_dim: int, out_dim: int, *args, **kwargs)

Bases: BaseImageProjectionHead

This class implements global pooling followed by a linear projection.
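A minimal, dependency-free sketch of the two steps is shown below. It assumes mean pooling as the global pooling operation, which the description above does not specify, and it uses nested lists in place of tensors. The helpers `global_avg_pool` and `linear_project` and the weight values are hypothetical.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def global_avg_pool(x: FeatureMap) -> List[float]:
    """Average each channel over all of its spatial positions."""
    return [
        sum(sum(row) for row in channel) / sum(len(row) for row in channel)
        for channel in x
    ]


def linear_project(features: List[float], weight: List[List[float]]) -> List[float]:
    """Multiply an in_dim vector by an (in_dim x out_dim) weight matrix."""
    out_dim = len(weight[0])
    return [sum(f * row[j] for f, row in zip(features, weight)) for j in range(out_dim)]


# in_dim=2 channels on a 2x2 grid, projected to out_dim=3.
x = [
    [[1.0, 3.0], [5.0, 7.0]],  # channel 0, mean 4.0
    [[2.0, 2.0], [2.0, 2.0]],  # channel 1, mean 2.0
]
weight = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]
pooled = global_avg_pool(x)                 # [4.0, 2.0]
projected = linear_project(pooled, weight)  # [4.0, 2.0, 3.0]
```

Pooling removes the spatial dimensions, so the projection only needs `in_dim` inputs regardless of the input resolution.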

__init__(opts, in_dim: int, out_dim: int, *args, **kwargs) → None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

classmethod add_arguments(parser: ArgumentParser) → ArgumentParser

Add model-specific arguments.

reset_parameters()

Reset weights of a given layer

forward(x: Tensor, *args, **kwargs) → Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

cvnets.image_projection_layers.simple_projection_head module

class cvnets.image_projection_layers.simple_projection_head.SimpleImageProjectionHead(opts, in_dim: int, out_dim: int, *args, **kwargs)

Bases: BaseImageProjectionHead

This class implements a simple projection head.

__init__(opts, in_dim: int, out_dim: int, *args, **kwargs) → None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

classmethod add_arguments(parser: ArgumentParser) → ArgumentParser

Add model-specific arguments.

reset_parameters()

Reset weights of a given layer

forward(x: Tensor, *args, **kwargs) → Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Module contents

cvnets.image_projection_layers.arguments_image_projection_head(parser: ArgumentParser) → ArgumentParser

Register arguments of all image projection heads.
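One common way to register the arguments of every head is to loop over the head classes and let each one extend the same parser through its `add_arguments` classmethod. The sketch below shows that pattern with the standard `argparse` module. The classes `HeadA` and `HeadB`, the list `ALL_HEADS`, the function `register_all_head_arguments`, and the flag names are all hypothetical and are not taken from the library.

```python
import argparse


class HeadA:
    """Hypothetical projection head, used only for this sketch."""

    @classmethod
    def add_arguments(cls, parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
        group = parser.add_argument_group(title=cls.__name__)
        group.add_argument("--head-a.num-heads", type=int, default=8)
        return parser


class HeadB:
    """Second hypothetical projection head."""

    @classmethod
    def add_arguments(cls, parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
        group = parser.add_argument_group(title=cls.__name__)
        group.add_argument("--head-b.pool-type", type=str, default="mean")
        return parser


# Hypothetical list of every available head class.
ALL_HEADS = [HeadA, HeadB]


def register_all_head_arguments(
    parser: argparse.ArgumentParser,
) -> argparse.ArgumentParser:
    """Let each head add its own options to the shared parser."""
    for head_cls in ALL_HEADS:
        parser = head_cls.add_arguments(parser)
    return parser


parser = register_all_head_arguments(argparse.ArgumentParser())
opts = parser.parse_args(["--head-a.num-heads", "4"])

# argparse turns "--head-a.num-heads" into the attribute "head_a.num_heads",
# which contains a dot and therefore has to be read with getattr().
num_heads = getattr(opts, "head_a.num_heads")
pool_type = getattr(opts, "head_b.pool_type")
```

Keeping each head's options next to the head itself means a new head only has to be added to the list for its flags to appear on the command line.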

cvnets.image_projection_layers.build_image_projection_head(opts: Namespace, in_dim: int, out_dim: int, *args, **kwargs) → BaseImageProjectionHead

Helper function to build an image projection head from command-line arguments.

Parameters:
  • opts – Command-line arguments.

  • in_dim – Input dimension to the projection head.

  • out_dim – Output dimension of the projection head.

Returns:

Image projection head module.
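A builder of this kind is often implemented as a lookup in a name-to-class registry, keyed by a value read from `opts`. The following is a minimal sketch of that pattern under stated assumptions, not the library's implementation: `HEAD_REGISTRY`, `register_head`, `ToyLinearHead`, `build_head`, and the `head_name` attribute are hypothetical.

```python
import argparse
from typing import Callable, Dict

# Hypothetical registry that maps a head name to its class.
HEAD_REGISTRY: Dict[str, Callable] = {}


def register_head(name: str) -> Callable:
    """Class decorator that records a head class under the given name."""

    def decorator(cls):
        HEAD_REGISTRY[name] = cls
        return cls

    return decorator


@register_head("toy_linear")
class ToyLinearHead:
    """Hypothetical head that only stores its dimensions."""

    def __init__(self, opts: argparse.Namespace, in_dim: int, out_dim: int) -> None:
        self.in_dim = in_dim
        self.out_dim = out_dim


def build_head(opts: argparse.Namespace, in_dim: int, out_dim: int):
    """Look up the head named in opts and instantiate it."""
    name = getattr(opts, "head_name", None)
    if name not in HEAD_REGISTRY:
        raise ValueError(f"Unknown head {name!r}; choose from {sorted(HEAD_REGISTRY)}")
    return HEAD_REGISTRY[name](opts, in_dim, out_dim)


opts = argparse.Namespace(head_name="toy_linear")
head = build_head(opts, in_dim=512, out_dim=256)
```

Raising an error that lists the registered names makes a misspelled head name easy to diagnose from the command line.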