Migrate To and From torch/torchvision in Three Easy Steps#

Summary#

  • In this tutorial, we provide three easy steps to port a torch/torchvision codebase to use our package.

  • Conversely, we also show how to wrap parameterized transforms so that they emulate their torch/torchvision-based counterparts.

From torch/torchvision to Parameterized Transforms#

  • Follow the notebook 005-Torch-Torchvision-to-Parametrized-Transforms.ipynb while going through the details of this section.

  • Note that the first cell of the notebook contains a training setup based on torch datasets/dataloaders and torchvision augmentations.

  • Our goal is to modify this code so that the transforms are parameterized by default and we capture the parameters of the augmentations as well!

  • This can be achieved in three very easy steps, as follows:

Step 1. Update the Imports#

  • The original code imports transforms from torchvision.transforms.transforms, as follows:

#--------------------------------------------------------------------------------
# Old imports
#--------------------------------------------------------------------------------
...

import torchvision.transforms.transforms as transforms

...
  • Change the torchvision-based import to the corresponding parameterized_transforms import. We also need to import parameterized_transforms.wrappers and parameterized_transforms.core for this conversion, as shown below:

#--------------------------------------------------------------------------------
# New imports
#--------------------------------------------------------------------------------
...

import parameterized_transforms.transforms as transforms

# Extra imports.
import parameterized_transforms.wrappers as wrappers
import parameterized_transforms.core as core

...

Step 2. Convert the Training/Testing Augmentation Stacks#

  • The old train and test transform stacks are as shown below:

#--------------------------------------------------------------------------------
# Old train/test augmentation stacks.
#--------------------------------------------------------------------------------

...

train_transform = transforms.Compose(
    [
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomApply(
            [
                transforms.ColorJitter(
                    brightness=0.1,
                    contrast=0.1,
                    saturation=0.1,
                    hue=0.1,
                )
            ],
            p=0.5,
        ),
        transforms.RandomGrayscale(p=0.1),
        transforms.RandomApply(
            [transforms.GaussianBlur(kernel_size=3, sigma=[0.1, 2.0])], p=0.5
        ),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]
)

test_transform = transforms.Compose(
    [
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]
)

...
  • Let us first convert the train_transform stack. For the train stack, we do NOT need to change anything other than wrapping it in the wrappers.CastParamsToTensor wrapper transform.

  • The CastParamsToTensor wrapper converts the final parameters output by the transform stack into a torch tensor of dtype=torch.float32. This is useful because, ideally, we would want the dataloader to return the parameters as a tensor of shape [batch_size=B, num_params=P]. By design, the parameters generated by all parameterized transforms are of type tuple. However, the tuple data type can NOT be handled by the default auto-collate function of torch-based loaders, whereas torch.Tensor instances are automatically collated to achieve the desired effect. Thus, the CastParamsToTensor wrapper makes the collation of parameters into a tensor easy (see the usage sketch after the code block below).

  • Also notice the subtle addition of the default_params_mode argument for the transforms that support it. In particular, we configure the ColorJitter transform with default_params_mode=core.DefaultParamsMode.RANDOMIZED. The RANDOMIZED mode allows for higher stochasticity in the parameters during the training phase.

#--------------------------------------------------------------------------------
# New train augmentation stack
#--------------------------------------------------------------------------------

...

train_transform = wrappers.CastParamsToTensor(
    transform=transforms.Compose(
        [
            transforms.RandomHorizontalFlip(p=0.5),
            transforms.RandomApply(
                [
                    transforms.ColorJitter(
                        brightness=0.1,
                        contrast=0.1,
                        saturation=0.1,
                        hue=0.1,
                        default_params_mode=core.DefaultParamsMode.RANDOMIZED,
                    )
                ],
                p=0.5,
            ),
            transforms.RandomGrayscale(p=0.1),
            transforms.RandomApply(
                [transforms.GaussianBlur(kernel_size=3, sigma=[0.1, 2.0])], p=0.5
            ),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ]
    )
)

...
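
  • Before moving on, here is a minimal usage sketch of the new train stack. This is a sketch under assumptions: we assume the wrapped stack is called like a standard torchvision transform and, as Step 3 will confirm, returns an (image, params) pair; the image img is a hypothetical placeholder.

#--------------------------------------------------------------------------------
# Usage sketch (assumed API; `img` is a hypothetical placeholder image).
#--------------------------------------------------------------------------------
from PIL import Image

# A placeholder image, only for illustration.
img = Image.new(mode="RGB", size=(32, 32), color=(128, 128, 128))

# The wrapped stack returns the augmented image along with its parameters.
augmented, params = train_transform(img)

print(type(params))  # <class 'torch.Tensor'>, thanks to `CastParamsToTensor`
print(params.dtype)  # torch.float32
print(params.shape)  # [P]; the dataloader later collates this to [B, P]
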
  • Now, the interesting part: the conversion of the test augmentation stack! To understand how the test stack ought to be ported, let us first look at the solution!

#--------------------------------------------------------------------------------
# New test augmentation stack
#--------------------------------------------------------------------------------

...

test_transform = wrappers.CastParamsToTensor(
    wrappers.ApplyDefaultParams(
        transform=transforms.Compose(
            [
                transforms.RandomHorizontalFlip(p=0.5),
                transforms.RandomApply(
                    [
                        transforms.ColorJitter(
                            brightness=0.1,
                            contrast=0.1,
                            saturation=0.1,
                            hue=0.1,
                            default_params_mode=core.DefaultParamsMode.UNIQUE,
                        )
                    ],
                    p=0.5,
                ),
                transforms.RandomGrayscale(p=0.1),
                transforms.RandomApply(
                    [transforms.GaussianBlur(kernel_size=3, sigma=[0.1, 2.0])], p=0.5
                ),
                transforms.ToTensor(),
                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
            ]
        )
    )
)

# #--------------------------------------------------------------------------------
# # The old augmentation stack for reference.
# #--------------------------------------------------------------------------------
# test_transform = transforms.Compose(
#     [
#         transforms.ToTensor(),
#         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
#     ]
# )

...
  • Wait, what!? It appears as if the test stack has entirely changed: it now contains all the core transforms from the train stack instead of just the ToTensor and Normalize core transforms of the old test stack. Let us understand this one step at a time.

  • Let us begin with a slightly philosophical question: what is the correct test augmentation stack in the context of a given training augmentation stack? In the case of the old train and test stacks, both end with the ToTensor and Normalize core transforms, but the train transform performs additional processing with other core transforms before this final step. However, observe that in both the old and the new case, what the test transform stack actually does is apply the default parameters of each of the core transforms. In the new test stack, this is explicit, whereas in the old stack, it is implied!

    • In particular, the core transforms RandomHorizontalFlip, ColorJitter, RandomGrayscale, and GaussianBlur all have default parameters that preserve the image identity. Thus, if we decided to NOT include these core transforms, we would get back the identical image that we would have gotten by having all of them apply their default parameters.

    • As for the final post-processing with the ToTensor and Normalize transforms: these transforms always change the image while preserving the information in it; the former changes the data type in which the image is represented to torch.Tensor, and the latter only changes the range and scale of the image data. For such transforms, by design, the application of the default parameters performs this very same processing! Since we also need these operations in the test stack, in order to keep the data in a consistent data type and in the correct range of values, we DO need these core transforms in the test stack, and this is indeed achieved by applying their default parameters!

  • Thus, wrapping any augmentation stack with ApplyDefaultParams converts it into the corresponding test augmentation stack, and this is exactly what we are doing above!

    • A subtle detail is the change to default_params_mode=core.DefaultParamsMode.UNIQUE in the ColorJitter of the test stack. We recommend this change in order to reduce unnecessary stochasticity in the parameterization during the testing phase and to get better testing performance.

  • Further, note that this strategy with ApplyDefaultParams also ensures that the number of parameters is the same during the training and testing phases, and that the parameters are indeed consistent. Otherwise, a direct conversion of the train/test stacks would result in a different number of parameters during the train/test phases, despite being consistent from the point of view of data processing.

    • However, if you are not worried about preserving the number of parameters, feel free to modify the test stack in the same way as the train stack. That too would work just fine!

  • Finally, we wrap the entire stack in CastParamsToTensor so that the parameters are converted from the tuple data type to torch.Tensor, which can then be auto-collated by the default collate function of torch loaders.

  • Thus, to conclude: given our train stack, we wrap it in the ApplyDefaultParams wrapper transform, with the subtlety of changing default_params_mode to UNIQUE, and finally wrap this stack in the CastParamsToTensor wrapper. This gives us our desired test stack! (A small sanity-check sketch follows below.)
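
  • To make this concrete, here is a small sanity-check sketch, under the same assumptions as before: the stacks are called like standard torchvision transforms and return (image, params) pairs, and the image img is a hypothetical placeholder.

#--------------------------------------------------------------------------------
# Sanity-check sketch (assumed API): the test stack is deterministic and
# parameter-count-consistent with the train stack.
#--------------------------------------------------------------------------------
import torch
from PIL import Image

# A placeholder image, only for illustration.
img = Image.new(mode="RGB", size=(32, 32), color=(64, 128, 192))

out_1, params_1 = test_transform(img)
out_2, params_2 = test_transform(img)

# `ApplyDefaultParams` applies the identity-preserving default parameters,
# so repeated applications yield the same output image.
assert torch.equal(out_1, out_2)

# The test stack exposes the same number of parameters as the train stack.
_, train_params = train_transform(img)
assert train_params.shape == params_1.shape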

Step 3. Catch the Parameters from the Dataloader#

  • In any code, we catch the batch data from the dataloader inside the training/testing loop. In our example code, the batch data is captured as shown below:

#--------------------------------------------------------------------------------
# Old fetching of the batch data.
#--------------------------------------------------------------------------------
...

for i, data in enumerate(trainloader, 0):
    ...
    # Get the batch.
    inputs, labels = data
    ...
...
        
# #--------------------------------------------------------------------------------
# # Alternative way where data is collated in dictionaries and fetched.
# #--------------------------------------------------------------------------------
# ...
# for idx, batch in enumerate(loader):
#     ...
#     images = batch["images"]
#     targets = batch["targets"]
#     ...
# ...
  • In this case, wherever we captured the inputs, we now need to capture the tuple of the inputs and the corresponding augmentation parameter tensor, as shown below:

#--------------------------------------------------------------------------------
# New fetching of the batch data.
#--------------------------------------------------------------------------------
...

for i, data in enumerate(trainloader, 0):
    ...
    # Get the batch.
    (inputs, params), labels = data
    ...
...

# #--------------------------------------------------------------------------------
# # New alternative way where data is collated in dictionaries and fetched.
# #--------------------------------------------------------------------------------
# ...
# for idx, batch in enumerate(loader):
#     ...
#     (images, params) = batch["images"]
#     targets = batch["targets"]
#     ...
# ...
  • Note that, in this case, params will automatically be a torch.Tensor instance of dtype=torch.float32 and of shape [B, P], where B is the batch size and P is the number of parameters (the param_count) of the stack (see the end-to-end sketch below).
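
  • To tie everything together, here is a small end-to-end sketch. The CIFAR10 dataset and the loader settings below are illustrative stand-ins for the notebook's actual setup, not a prescription.

#--------------------------------------------------------------------------------
# End-to-end sketch (illustrative dataset/loader settings).
#--------------------------------------------------------------------------------
import torch
import torchvision

trainset = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform
)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=8, shuffle=True)

# The default collate function batches the nested (image, params) tuples.
(inputs, params), labels = next(iter(trainloader))
print(inputs.shape)  # torch.Size([8, 3, 32, 32])
print(params.shape)  # torch.Size([8, P]), where P is the stack's param count
print(params.dtype)  # torch.float32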

  • And that’s it! We have successfully ported the code to use the parameterized transforms and are ready to run our experiments!

From Parameterized Transforms to torch/torchvision#

  • If we have a transform implemented with the package, we can easily create its torchvision counterpart.

  • This can be done using the wrappers.DropParams wrapper transform, as follows. For the sake of example, let us consider the “crazy” transform from the previous tutorial. We simply wrap this transform with the wrappers.DropParams wrapper in order to emulate its torchvision “counterpart” with the same functionality. Now, we can pass in an image and get only the augmented image as output, without worrying about the parameterization! (A small usage sketch follows after the code blocks below.)

#--------------------------------------------------------------------------------
# Parameterized transform example
#--------------------------------------------------------------------------------

import parameterized_transforms.core as ptc
import parameterized_transforms.transforms as ptx
import parameterized_transforms.wrappers as wrappers

# `RandomSubsetApply` and `ParamNormRandomColorErasing` are the custom
# transforms defined in the previous tutorial.
parameterized_tx = ptx.RandomOrder(
    transforms=[
        # The first component is the `RandomSubsetApply` as seen in previous examples!
        RandomSubsetApply(
            transforms=[
                ParamNormRandomColorErasing(
                    tx_mode=ptc.TransformMode.CASCADE,
                    default_params_mode=ptc.DefaultParamsMode.RANDOMIZED
                ),
                ptx.RandomRotation(degrees=45)
            ],
            tx_mode=ptc.TransformMode.CASCADE,
        ),
        # The second component is the random choice-- either solarize or apply color-jitter!
        ptx.RandomChoice(
            transforms=[
                ptx.RandomSolarize(threshold=127, p=1.0),  # always solarize!
                ptx.ColorJitter(
                    brightness=0.8, contrast=0.8, saturation=0.8, hue=0.2
                )
            ],
            tx_mode=ptc.TransformMode.CASCADE,
        ),
        # The third component is an application of two different gaussian blurs!
        ptx.Compose(
            transforms=[
                ptx.GaussianBlur(kernel_size=21, sigma=[0.1, 2.0]),
                ptx.GaussianBlur(kernel_size=21, sigma=[0.1, 5.0])
            ]
        )
    ],
    tx_mode=ptc.TransformMode.CASCADE,
    default_params_mode=ptc.DefaultParamsMode.RANDOMIZED,
)


#--------------------------------------------------------------------------------
# Emulate `torchvision` transform
#--------------------------------------------------------------------------------
emulated_torchvision_tx = wrappers.DropParams(parameterized_tx)

# Usage, as expected, without worrying about parameterization--
# augmentation = emulated_torchvision_tx(image)
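
  • Here is a small usage sketch of the emulated transform; the input image is again a hypothetical placeholder.

#--------------------------------------------------------------------------------
# Usage sketch (hypothetical placeholder image).
#--------------------------------------------------------------------------------
from PIL import Image

img = Image.new(mode="RGB", size=(224, 224), color=(200, 100, 50))

# Behaves like a torchvision transform: image in, augmented image out.
# The parameters are generated internally and dropped by `DropParams`.
augmented = emulated_torchvision_tx(img)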

That’s All, Folks!#

  • And that’s it! This tutorial completes our Tutorial Series.

  • We hope that our package helps you in your research and experimentation! All the best!