
Simplify PyTorch With A Standard Operating Procedure

By Aparna Kesarkar - June 2, 2021

Constructing a Standard Operating Procedure to help build a PyTorch model

PyTorch is a machine learning library that emphasizes two attributes in a deep learning framework: usability and speed. By design, PyTorch is described as “Pythonic”, i.e. it follows Python idioms and is easy for Python developers to pick up.

Though PyTorch makes it easy to build deep learning models, the model-building process involves a number of components that can make it cumbersome.

This blog focuses on simplifying the PyTorch workflow by setting a Standard Operating Procedure (SOP). An SOP helps declutter our understanding of a framework. Here we will concentrate on the aspects that can be generalized to all PyTorch models.

Let’s divide the model building process into the following high-level steps:

Step 1: Installing PyTorch
Step 2: Loading and Preprocessing Data
Step 3: Training
Step 4: Evaluation

Step 1: Installing PyTorch

Depending on the type of dataset, we can install different PyTorch domain packages alongside the core torch package, such as torchvision, torchaudio, and torchtext. All of these can be installed with Python’s pip. For this blog, we will use the torchvision package, which is meant for image datasets.
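As a quick sanity check after installation, a short script can report which of these packages are importable. This is only a minimal sketch: the helper name `report_installed` is hypothetical, and it relies on nothing beyond Python’s standard `importlib`.

```python
import importlib.util


def report_installed(packages=("torch", "torchvision", "torchaudio", "torchtext")):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in report_installed().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Any package reported as missing can then be installed with pip before moving on.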

Step 2: Loading and Preprocessing Data

Code for processing sample data can get messy and hard to maintain. We ideally want our dataset preprocessing and loading code to be decoupled from our model training code for better readability and modularity.

PyTorch provides two data primitives, torch.utils.data.DataLoader and torch.utils.data.Dataset, that allow us to use the pre-loaded datasets offered by PyTorch or to load our own data. We will talk more about these primitives in step 2.3 and start with importing data in step 2.1.

Step 2.1 — Import Data

In the following code, we have defined a path to the train and test image directories respectively. This path can be a URL, local path, or cloud path.

train_folder = "data/train"
test_folder = "data/test"

When the model runs on Google Colab, for example, this path would point to a folder in the connected Google Drive.

An important and sometimes time-consuming part of data preparation is reading and assigning labels to images. Label names can be provided:

  • in a separate CSV file

  • or the file names of images can represent label names

  • or images can be stored in separate folders and each folder can represent a label

This list is not exhaustive. In all of the above cases, getting the data into a format PyTorch can use is a matter of ordinary Python development. Once our dataset is loaded, we move on to preprocessing it.
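To illustrate the second case above, where file names carry the labels, here is a minimal sketch of a custom Dataset. The class name `FilenameLabelDataset`, the `label_index.png` naming scheme, and the placeholder `fake_loader` are all assumptions made for illustration; a real loader would read and decode the image file.

```python
from pathlib import Path

import torch
from torch.utils.data import Dataset


class FilenameLabelDataset(Dataset):
    """Hypothetical dataset: the label is the file-name prefix before the first underscore."""

    def __init__(self, file_paths, load_fn):
        self.file_paths = [Path(p) for p in file_paths]
        self.load_fn = load_fn  # callable that turns a path into a tensor
        # Build a stable mapping from label name to integer class index.
        names = sorted({p.stem.split("_")[0] for p in self.file_paths})
        self.class_to_idx = {name: i for i, name in enumerate(names)}

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, index):
        path = self.file_paths[index]
        label = self.class_to_idx[path.stem.split("_")[0]]
        return self.load_fn(path), label


def fake_loader(path):
    # Placeholder so the sketch runs without image files on disk.
    return torch.zeros(3, 8, 8)


dataset = FilenameLabelDataset(["cat_001.png", "dog_001.png", "cat_002.png"], fake_loader)
image, label = dataset[1]
print(len(dataset), dataset.class_to_idx, label)
```

The CSV and folder-per-label cases follow the same pattern: only the way the label is looked up in `__init__` changes.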

Step 2.2 — Preprocess the Images

In this step, one may resize, flip, or change the color scale of an image using the transforms in torchvision, chained together with Compose. While resizing, be mindful of the original image size: shrinking an extremely large image may lose important information, and enlarging a very small image may distort features to the extent that they become meaningless.

This step is analogous to feature engineering in classical machine learning: we enhance relevant features so that the model learns better and quicker. What to do depends on the dataset. If, say, the color of the images does not matter, the three color channels can be reduced to a single grayscale channel, which gives the model less input to process. If the structures of interest in the images are very thin and the objective is to localize specific sections, one can use smudging, blurring, dilation, etc. to enhance relevant features.

These are just a few of the many techniques one may consider during preprocessing. The basic idea is to help the model learn better and faster.

To make the changes mentioned above, PyTorch offers torchvision.transforms, whose transforms can be chained with Compose to create an image transformation pipeline for the input dataset. All the available transformations are described in detail in the torchvision.transforms documentation.

Make sure to apply the transformations to both your training and test datasets. An example of a transformation pipeline follows.

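import torchvision
preprocess_pipe = torchvision.transforms.Compose([
    torchvision.transforms.Resize((256, 256)),
    torchvision.transforms.ColorJitter(brightness=0.3, contrast=0.5, saturation=0.4, hue=0.2),
    # Convert the PIL image to a tensor so it can be batched (assumed final step)
    torchvision.transforms.ToTensor(),
])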

Step 2.3 — Loading/Batching Data to Ingest in Pytorch Models

At the beginning of this section, we spoke briefly about PyTorch’s Dataset and DataLoader classes. Let’s understand them better.

The Dataset class stores the samples and their corresponding labels, while the DataLoader class wraps an iterable around the Dataset to enable easy access to the samples.

Dataset retrieves our data’s features and labels one sample at a time. While training a model, however, we typically want to pass samples in “mini-batches” and reshuffle the data at every epoch to reduce overfitting. DataLoader is an iterable that abstracts this complexity behind a simple API.

This can be seen in the code below.

import torchvision
from torch.utils.data import DataLoader

# The transform argument is assumed to be the pipeline from step 2.2
train_data = torchvision.datasets.ImageFolder(root=train_folder, transform=preprocess_pipe)
train_data_loader = DataLoader(train_data, batch_size=64, shuffle=True, num_workers=4)

test_data = torchvision.datasets.ImageFolder(root=test_folder, transform=preprocess_pipe)
# Shuffling is not needed for evaluation
test_data_loader = DataLoader(test_data, batch_size=64, shuffle=False, num_workers=4)

For better understanding, think of the Dataset as the full collection of samples, while the DataLoader hands out those samples in bunches according to parameters such as batch_size, sampler, batch_sampler, and shuffle.

Now that we have understood data loading, let’s move on to building a PyTorch model.
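The relationship between the two can be seen on a toy example. This is a minimal sketch using an in-memory `TensorDataset` of ten made-up samples instead of images, so it runs without any files on disk.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

features = torch.arange(10, dtype=torch.float32).unsqueeze(1)  # 10 samples, 1 feature each
labels = torch.arange(10)
dataset = TensorDataset(features, labels)  # the full collection of samples

loader = DataLoader(dataset, batch_size=4, shuffle=True)  # hands out shuffled mini-batches

batch_sizes = []
seen = []
for batch_features, batch_labels in loader:
    batch_sizes.append(len(batch_labels))
    seen.extend(batch_labels.tolist())

print(batch_sizes)   # the last batch is smaller because 10 is not a multiple of 4
print(sorted(seen))  # every sample appears exactly once per epoch
```

Passing `drop_last=True` to the DataLoader would discard the incomplete final batch instead.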

Step 3: Training

Step 3.1 — Define Layers

Here we first define layers and then arrange them to form a model. We can think of the layers defined in the constructor as Lego blocks and the forward method as building a Lego toy out of them.

In this step, we can either define an entire model from scratch or use an existing model as the backbone (transfer learning). This part can be thought of as the model recipe.

Let’s take a look at a simple fully connected model with two layers. One pass through the forward method is called forward propagation: the data travels from the input layer to the output layer once.

import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and
        assign them as member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and
        we must return a Tensor of output data. We can use Modules defined in
        the constructor as well as arbitrary operators on Tensors.
        """
        # clamp(min=0) applies the ReLU activation
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

# D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
D_in, H, D_out = 1000, 100, 10

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

Extending the Lego analogy to this model: the layers linear1 and linear2 are the Lego blocks, and the forward method, which uses linear1 with a ReLU activation function as the first layer and linear2 as the output layer, is the Lego toy made out of the blocks.

(Note that the names linear1 and linear2 are arbitrary, but the order in which forward applies the layers is not: each layer’s input size must match the output size of the layer before it, so swapping the two layers here would fail unless D_in, H, and D_out were all equal.)
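A quick way to check how the blocks fit together is to push a random batch through the model and look at the shapes. The sketch below assumes a batch size of 64 and rebuilds the same two-layer architecture with `torch.nn.Sequential` purely so the snippet is self-contained.

```python
import torch

D_in, H, D_out = 1000, 100, 10

# Same architecture as the two-layer model, written with nn.Sequential for brevity.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

x = torch.randn(64, D_in)  # a random batch of 64 samples
y_pred = model(x)
print(tuple(y_pred.shape))  # one row of D_out scores per sample

# Layer order matters: a layer built for H input features rejects D_in features.
try:
    torch.nn.Linear(H, D_out)(x)
    swapped_ok = True
except RuntimeError:
    swapped_ok = False
print(swapped_ok)
```

Shape checks like this are cheap and catch most wiring mistakes before any training starts.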

Step 3.2 — Define Optimizer and Loss Function

A deep learning algorithm learns iteratively from the given data. With each iteration, it compares the model output to the given labels, calculates the loss, and adjusts the model weights to improve results. Calculating the loss requires a loss function, and adjusting the weights requires an optimizer. We can define both as follows.

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

There are multiple options available from which one can choose an optimizer and loss function. Since this blog concentrates on putting together a skeletal structure, details about the various optimizers and loss functions aren’t covered in depth.
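As one example of swapping these pieces, a classification task would commonly pair `CrossEntropyLoss` with an optimizer such as `Adam`. This is a sketch only: the stand-in linear model, the feature and class counts, and the learning rate are arbitrary assumptions, not recommendations.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(20, 3)           # stand-in model: 20 features, 3 classes
criterion = torch.nn.CrossEntropyLoss()  # expects raw scores and integer class labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

inputs = torch.randn(8, 20)
labels = torch.randint(0, 3, (8,))

# One optimization step looks the same regardless of which loss or optimizer is chosen.
optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
print(loss.item())
```

Because the loss and optimizer are plain objects, the rest of the skeleton does not change when they are replaced.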

Step 3.3 — Train for N Epochs and Save Model

In steps 3.1 and 3.2 we put together the modules required to run one epoch. These modules are run n times for n epochs. Over each epoch the model adjusts its weights, aiming to improve.

epochs = 500

for epoch in range(epochs):
    for i, data in enumerate(train_data_loader, 0):
        # inputs and labels must match the shapes the model and criterion expect
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward pass: compute predictions by passing inputs to the model
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        # backward pass and weight update
        loss.backward()
        optimizer.step()
    # print the loss of the last batch at the end of each epoch
    print("epoch: {} loss: {}".format(epoch + 1, loss.item()))
print('Finished Training')

# saving model
torch.save(model.state_dict(), 'twolayermodel.pt')
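To see the whole loop run end to end without an image dataset, the sketch below trains a small network on synthetic regression data, then saves and reloads its weights. Everything specific here is an assumption for illustration: the data, the layer sizes, the learning rate, the 20 epochs, and the temporary file name.

```python
import os
import tempfile

import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X = torch.randn(256, 8)
y = X @ torch.randn(8, 1)  # synthetic linear target
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)


def make_model():
    return torch.nn.Sequential(
        torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1)
    )


model = make_model()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    running = 0.0
    for inputs, targets in loader:
        optimizer.zero_grad()                     # clear gradients from the last step
        loss = criterion(model(inputs), targets)  # forward pass
        loss.backward()                           # backward pass
        optimizer.step()                          # weight update
        running += loss.item()
    epoch_losses.append(running / len(loader))    # mean batch loss for this epoch
print(epoch_losses[0], epoch_losses[-1])

# Save the weights, then load them into a freshly constructed model.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")
    torch.save(model.state_dict(), path)
    restored = make_model()
    restored.load_state_dict(torch.load(path))

model.eval()
restored.eval()
with torch.no_grad():
    same = torch.equal(model(X), restored(X))
print(same)
```

Saving the state_dict rather than the whole model object means the architecture must be rebuilt in code before loading, which is why `make_model` is called again.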

Step 4: Evaluation

Now that we have saved our trained model, we can use it to predict values for unseen data or test sets. By default a model is in training mode, so before evaluating we need to switch it to evaluation mode with eval(). This is necessary because some layers, such as BatchNorm and Dropout, behave differently in training and evaluation.


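model.eval()  # switch layers such as BatchNorm and Dropout to evaluation behavior
with torch.no_grad():  # gradients are not needed for evaluation
    for i, data in enumerate(test_data_loader, 0):
        inputs, labels = data
        predictions = model(inputs)
        loss = criterion(predictions, labels)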
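The difference between training and evaluation mode is easy to observe on a single Dropout layer. This is a minimal sketch; the dropout probability of 0.5 and the all-ones input are arbitrary choices.

```python
import torch

torch.manual_seed(0)
layer = torch.nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()
train_out = layer(x)  # some entries are zeroed, the survivors are scaled by 1 / (1 - p)

layer.eval()
eval_out = layer(x)   # in evaluation mode dropout passes the input through unchanged

print((train_out == 0).sum().item(), torch.equal(eval_out, x))
```

Forgetting to call eval() therefore makes predictions noisy and non-repeatable for any model that contains such layers.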

And those are all the steps necessary to create a PyTorch model.


There are multiple ways of arranging the steps mentioned above. For instance, one can define both train and eval in the same module and use either of them based on specific conditions. That depends largely on preferences with regard to Python development; the basic skeleton of a PyTorch workflow remains the same. Ideally, one should keep it modular so that each module can be reused by changing parameters or by adding more complexity.

Understanding all the components involved in PyTorch model building is the first step towards creating a good model. Hope this blog helped in simplifying the framework. To get the best artificial intelligence solutions for your business, reach out to us at Clairvoyant.
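One possible arrangement of train and eval in the same module is a single function that trains when it is given an optimizer and evaluates otherwise. The function name `run_epoch`, its signature, and the synthetic data are hypothetical; this is a sketch of the idea, not a prescribed structure.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def run_epoch(model, loader, criterion, optimizer=None):
    """Run one pass over `loader`; train if an optimizer is given, otherwise evaluate."""
    training = optimizer is not None
    model.train(training)  # train(False) is equivalent to eval()
    total, count = 0.0, 0
    with torch.set_grad_enabled(training):
        for inputs, targets in loader:
            loss = criterion(model(inputs), targets)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total += loss.item() * len(inputs)
            count += len(inputs)
    return total / count  # mean loss per sample


torch.manual_seed(0)
X = torch.randn(64, 4)
y = X.sum(dim=1, keepdim=True)
loader = DataLoader(TensorDataset(X, y), batch_size=16)
model = torch.nn.Linear(4, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

before = run_epoch(model, loader, criterion)        # evaluation only
for _ in range(10):
    run_epoch(model, loader, criterion, optimizer)  # training
after = run_epoch(model, loader, criterion)         # evaluation only
print(before, after)
```

Keeping the mode switch inside one function makes it harder to forget eval() or to accidentally update weights during evaluation.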


Pytorch Documentation — https://pytorch.org/docs/stable/index.html



