data.datasets.utils package

Submodules

data.datasets.utils.common module

data.datasets.utils.common.file_has_valid_image_extension(filename: str) bool[source]
data.datasets.utils.common.file_has_allowed_extension(filename: str, extensions: str | Tuple[str, ...]) bool[source]

Checks if a file has an allowed extension.

Parameters:
  • filename – Path to a file.

  • extensions – A string or a tuple of strings specifying the file extensions.

Returns:

True if the filename ends with one of the given extensions, else False.
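The check described above can be approximated with str.endswith, which accepts either a single suffix or a tuple of suffixes. This is a minimal sketch, not the package's implementation; the name has_allowed_extension_sketch is hypothetical, and whether the real function ignores case is not stated here, so the sketch is case-sensitive.

```python
from typing import Tuple, Union


def has_allowed_extension_sketch(
    filename: str, extensions: Union[str, Tuple[str, ...]]
) -> bool:
    # str.endswith takes a str or a tuple of str, which matches the
    # documented type of `extensions`. The comparison is case-sensitive
    # (an assumption of this sketch).
    return filename.endswith(extensions)


print(has_allowed_extension_sketch("photos/cat.jpg", (".jpg", ".png")))
print(has_allowed_extension_sketch("notes.txt", ".jpg"))
```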

data.datasets.utils.common.get_image_paths(directory: str) List[str][source]

Returns a list of paths to all image files in the input directory and its subdirectories.
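A recursive listing like this one can be sketched with os.walk. The function name get_image_paths_sketch and the extension tuple are assumptions made for illustration; the set of extensions the package treats as valid images is not listed in this page.

```python
import os
from typing import List

# Hypothetical extension list; the package's real list may differ.
IMAGE_EXTENSIONS_SKETCH = (".jpg", ".jpeg", ".png")


def get_image_paths_sketch(directory: str) -> List[str]:
    paths = []
    # os.walk visits the directory itself and every subdirectory.
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            # Lower-casing the name is an assumption of this sketch.
            if name.lower().endswith(IMAGE_EXTENSIONS_SKETCH):
                paths.append(os.path.join(root, name))
    return paths
```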

data.datasets.utils.common.select_random_subset(random_seed: int, num_total_samples: int, num_samples_to_select: int | None = None, percentage_of_samples_to_select: float | None = None) List[int][source]

Randomly selects a subset of samples.

Only one of num_samples_to_select and percentage_of_samples_to_select should be provided. If neither is provided, all samples are selected.

Parameters:
  • random_seed – An integer seed to use for random selection.

  • num_total_samples – Total number of samples in the set that is being subsampled.

  • num_samples_to_select – An optional integer indicating the number of samples to select.

  • percentage_of_samples_to_select – An optional float in the range (0, 100] indicating the percentage of samples to select.

Returns:

A list of (integer) indices of the selected samples.

Raises:

ValueError if both num_samples_to_select and percentage_of_samples_to_select are provided.
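The documented contract (seeded selection, mutually exclusive count and percentage, select-all fallback) could be sketched as follows. The name select_random_subset_sketch is hypothetical, and two details are assumptions: the percentage is rounded down, and the returned indices are in sampled order rather than sorted.

```python
import random
from typing import List, Optional


def select_random_subset_sketch(
    random_seed: int,
    num_total_samples: int,
    num_samples_to_select: Optional[int] = None,
    percentage_of_samples_to_select: Optional[float] = None,
) -> List[int]:
    if (
        num_samples_to_select is not None
        and percentage_of_samples_to_select is not None
    ):
        raise ValueError("Provide a count or a percentage, not both.")
    if percentage_of_samples_to_select is not None:
        # Rounding down is an assumption of this sketch.
        num_samples_to_select = int(
            num_total_samples * percentage_of_samples_to_select / 100
        )
    if num_samples_to_select is None:
        # Neither argument given: select every sample.
        num_samples_to_select = num_total_samples
    # A private Random instance keeps the global random state untouched.
    rng = random.Random(random_seed)
    count = min(num_samples_to_select, num_total_samples)
    return rng.sample(range(num_total_samples), count)
```

Calling the sketch twice with the same seed returns the same indices, which is the property a seeded subsampler is usually expected to provide.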

data.datasets.utils.common.select_samples_by_category(sample_category_labels: List[Any], random_seed: int, num_samples_per_category: int | None = None, percentage_of_samples_per_category: float | None = None) List[int][source]

Randomly selects a specified number/percentage of samples from each category.

Only one of num_samples_per_category and percentage_of_samples_per_category should be provided. If neither is provided, all samples are selected.

Parameters:
  • sample_category_labels – A list of category labels.

  • random_seed – An integer seed to use for random selection.

  • num_samples_per_category – An optional integer indicating the number of samples to select from each category.

  • percentage_of_samples_per_category – An optional float in the range (0, 100] indicating the percentage of samples to select from each category.

Returns:

A list of (integer) indices of the selected samples.

Raises:

ValueError if both num_samples_per_category and percentage_of_samples_per_category are provided.
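Per-category selection can be sketched by grouping indices by label and sampling within each group. The name select_samples_by_category_sketch is hypothetical. The sketch assumes labels are hashable, rounds percentages down, caps the count at the category size, and returns the indices sorted; none of these details is confirmed by the documentation above.

```python
import random
from collections import defaultdict
from typing import Any, List, Optional


def select_samples_by_category_sketch(
    sample_category_labels: List[Any],
    random_seed: int,
    num_samples_per_category: Optional[int] = None,
    percentage_of_samples_per_category: Optional[float] = None,
) -> List[int]:
    if (
        num_samples_per_category is not None
        and percentage_of_samples_per_category is not None
    ):
        raise ValueError("Provide a count or a percentage, not both.")
    # Group sample indices by their category label (labels must be hashable).
    indices_by_category = defaultdict(list)
    for index, label in enumerate(sample_category_labels):
        indices_by_category[label].append(index)
    rng = random.Random(random_seed)
    selected: List[int] = []
    for indices in indices_by_category.values():
        if num_samples_per_category is not None:
            count = min(num_samples_per_category, len(indices))
        elif percentage_of_samples_per_category is not None:
            count = int(len(indices) * percentage_of_samples_per_category / 100)
        else:
            count = len(indices)
        selected.extend(rng.sample(indices, count))
    return sorted(selected)
```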

data.datasets.utils.text module

data.datasets.utils.text.caption_preprocessing(caption: str) str[source]

Removes the unwanted tokens (e.g., HTML tokens, next line, unwanted spaces) from the text.
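The exact cleaning rules are not listed here. As a rough illustration of this kind of preprocessing, the sketch below decodes HTML entities, drops HTML tags, and collapses newlines and repeated spaces. The name caption_preprocessing_sketch and the specific rules are assumptions, not the package's behavior.

```python
import html
import re


def caption_preprocessing_sketch(caption: str) -> str:
    text = html.unescape(caption)         # decode entities such as "&amp;"
    text = re.sub(r"<[^>]+>", " ", text)  # replace HTML tags with a space
    text = re.sub(r"\s+", " ", text)      # collapse newlines and extra spaces
    return text.strip()


print(caption_preprocessing_sketch("A <b>dog</b> &amp;\n  a cat "))
# prints: A dog & a cat
```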

data.datasets.utils.video module

Contains helper functions for reading from video detection datasets.

NOTE: Annotations are stored via a @rectangles_dict of the form:

Dict:
  key -> identity:
    Annotation list of Dicts for different timestamps:
      timestamp (float): The timestamp representing the seconds since the video began, ex. 1.2 is 1.2 seconds into the video.
      x0 (float): Normalized pixel space coordinate of top left of bounding box.
      y0 (float): Normalized pixel space coordinate of top left of bounding box.
      x1 (float): Normalized pixel space coordinate of bottom right of bounding box.
      y1 (float): Normalized pixel space coordinate of bottom right of bounding box.
      <class_label_name> (int): Label of the class. The key to this field depends on the dataset.
      is_visible (bool): [Optional] Whether bounding box is visible.

See tests/data/datasets/utils/video_test.py for an example of this dictionary.
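For orientation, a dictionary with this shape might look like the following. All values are made up for illustration: the track identity "track_0" and the class label key "object_class" are hypothetical, since the real key depends on the dataset.

```python
from typing import Any, Dict, List

# Hypothetical example; "track_0" and "object_class" are illustrative names.
rectangles_dict_example: Dict[str, List[Dict[str, Any]]] = {
    "track_0": [
        {
            "timestamp": 0.0,   # seconds since the video began
            "x0": 0.10, "y0": 0.20,  # top-left corner, normalized
            "x1": 0.40, "y1": 0.60,  # bottom-right corner, normalized
            "object_class": 3,  # class label, stored under a dataset-specific key
            "is_visible": True,  # optional field
        },
        {
            "timestamp": 1.2,
            "x0": 0.15, "y0": 0.20,
            "x1": 0.45, "y1": 0.60,
            "object_class": 3,
        },
    ],
}
```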

data.datasets.utils.video.fetch_labels_from_timestamps(class_label_name: str, timestamps: List[float], rectangles_dict: Dict[str, List[Dict[str, Any]]], interpolation_cutoff_threshold_sec: float | None = None, progressible_labels: Collection[int] | None = None, carry_over_keys: List[str] | None = None, required_keys: List[str] | None = None) Dict[str, List[Dict[str, Any]]][source]

Returns object labels for the specified video frame timestamps.

The result retains the structure of rectangles_dict, but with the timestamp values as requested.

If progressible_labels are supplied, the “progress” field will be included. This field represents the ‘normalized’ amount of time that the class label has existed temporally. See tests/data/datasets/utils/test_video.py:test_fetch_frame_with_progress for examples.

This fetching function can be used for (per-frame) video classification pipelines.

Parameters:
  • class_label_name – The field name in rectangles_dict that maps to the class label.

  • timestamps – A list of timestamps to fetch labels for.

  • rectangles_dict – (See docstring at top of file.)

  • interpolation_cutoff_threshold_sec – Threshold under which interpolation is allowed. In some rectangles_dict instances, the labels within the same track are so far apart (e.g., 10 seconds) that interpolation is nonsensical. This value prevents unrelated labels from being interpolated.

  • progressible_labels – Set of labels for which to calculate “progress” for the resulting bounding boxes. If None, no “progress” field will be included.

  • carry_over_keys – A list of keywords that specifies which keys should be carried over from the previous rectangle during interpolation. Defaults to None.

  • required_keys – A list of keywords that specifies which keywords need to be included in a new bounding_box in addition to the @class_label_name. Defaults to None.

Returns:

Dict containing the labels, still indexable by track id.
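The interpolation scheme itself is not specified on this page. As one plausible reading of the interpolation_cutoff_threshold_sec parameter, the sketch below linearly interpolates a bounding box between two neighboring annotations of the same track and refuses to interpolate when the annotations are farther apart than the cutoff. The helper name interpolate_box_sketch, the use of linear interpolation, and copying the class label from the earlier annotation are all assumptions.

```python
from typing import Any, Dict, Optional


def interpolate_box_sketch(
    prev_box: Dict[str, Any],
    next_box: Dict[str, Any],
    timestamp: float,
    class_label_name: str,
    interpolation_cutoff_threshold_sec: Optional[float] = None,
) -> Optional[Dict[str, Any]]:
    gap = next_box["timestamp"] - prev_box["timestamp"]
    if gap <= 0:
        return None  # annotations are not in increasing time order
    if (
        interpolation_cutoff_threshold_sec is not None
        and gap > interpolation_cutoff_threshold_sec
    ):
        return None  # annotations too far apart to interpolate meaningfully
    # Fraction of the way from prev_box to next_box, in [0, 1] when the
    # requested timestamp lies between the two annotations.
    weight = (timestamp - prev_box["timestamp"]) / gap
    box: Dict[str, Any] = {
        "timestamp": timestamp,
        # Assumption: the label is carried over from the earlier annotation.
        class_label_name: prev_box[class_label_name],
    }
    for key in ("x0", "y0", "x1", "y1"):
        box[key] = prev_box[key] + weight * (next_box[key] - prev_box[key])
    return box
```

With annotations at 0 and 2 seconds, a request for 1 second yields the midpoint of the two boxes, while a cutoff of 1 second makes the same request return None.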

Module contents