PyTorch Hyperparameter Tuning

In this Python tutorial, we will learn about PyTorch hyperparameter tuning, the process that can make the difference between an average model and a highly accurate one. We will also cover different examples related to PyTorch hyperparameter tuning, under these topics:

  • PyTorch hyperparameter tuning
  • PyTorch lightning hyperparameter tuning
  • PyTorch geometric hyperparameter tuning

PyTorch hyperparameter tuning

In this section, we will learn about PyTorch hyperparameter tuning in Python.

PyTorch hyperparameter tuning is the process of choosing the settings (such as the learning rate, batch size, or number of layers) that can make the difference between an average model and a highly accurate one.
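For instance, the learning rate passed to an optimizer is a hyperparameter that we choose before training, while the weights that the optimizer updates are parameters that the model learns. A minimal sketch of the distinction (the tiny nn.Linear model here is purely illustrative):

import torch
from torch import nn

model = nn.Linear(10, 2)   # weights and bias: parameters learned during training
lr = 0.001                 # learning rate: a hyperparameter we choose before training
optimizer = torch.optim.Adam(model.parameters(), lr=lr)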

Code:

In the following code, we will import all the necessary libraries: import torch, import matplotlib.pyplot as plot, import numpy as np, import torch.nn.functional as func, from torch import nn, from torchvision import datasets, transforms, import requests, and from PIL import Image.

  • transform=transforms.Compose([transforms.Resize((32,32)),transforms.ToTensor(), transforms.Normalize((0.7, 0.7, 0.7), (0.7, 0.7, 0.7))]): The transforms.Compose() function lets us chain multiple transforms together; the list of transforms is passed as its first argument.
  • traindataset=datasets.CIFAR10(root='./data',train=True,download=True,transform=transform) is used to load the CIFAR10 training dataset.
  • trainloader=torch.utils.data.DataLoader(dataset=traindataset,batch_size=100,shuffle=True) is used to declare the training loader.
  • classes=('plane','car','bird','cat','deer','dog','frog','horse','ship','truck'): Here we declare the list of CIFAR10 classes.
  • dataiter=iter(trainloader) is used to get some random training images.
  • figure=plot.figure(figsize=(25,4)) is used to plot the figures.
  • class LeNet(nn.Module): is used to create a model class with the help of the __init__() and forward() methods.
  • model=LeNet().to(device) is used to create an instance of the model.
  • criterion=nn.CrossEntropyLoss() is used to define the loss function.
  • optimizer=torch.optim.Adam(model.parameters(),lr=0.001) is used to initialize the optimizer.
  • loss1=criterion(outputs,labels) is used to calculate the categorical cross-entropy loss.
  • _,preds=torch.max(outputs,1) is used to get the predicted class for each image, from which we compute the accuracy of our network.
  • loss+=loss1.item() is used to keep track of the loss at every epoch.
  • print('validation_loss:{:.4f}, validation_acc:{:.4f}'.format(valepoch_loss,valepoch_acc.item())) is used to print the validation loss and validation accuracy.
# Importing Libraries
import torch  
import matplotlib.pyplot as plot  
import numpy as np  
import torch.nn.functional as func  
from torch import nn  
from torchvision import datasets,transforms   
import requests  
from PIL import Image 

# Using the device
device=torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  
# Using Compose() method of transform
transform = transforms.Compose(
    [transforms.Resize((32,32)),transforms.ToTensor(),
     transforms.Normalize((0.7, 0.7, 0.7), (0.7, 0.7, 0.7))])
# Load the CIFAR10 train and validate dataset
traindataset=datasets.CIFAR10(root='./data',train=True,download=True,transform=transform)  
valdataset=datasets.CIFAR10(root='./data',train=False,download=True,transform=transform)  

# Declaring the train loaders and validate loaders
trainloader=torch.utils.data.DataLoader(dataset=traindataset,batch_size=100,shuffle=True)  
valloader=torch.utils.data.DataLoader(dataset=valdataset,batch_size=100,shuffle=False)

# Declare the list of classes
classes=('plane','car','bird','cat','deer','dog','frog','horse','ship','truck')  

# Function to convert a normalized tensor to a displayable image
def im_convert(tensor):  
    image=tensor.cpu().clone().detach().numpy()  
    image=image.transpose(1,2,0)  
    # Undo the normalization: image = image * std + mean
    image=image*np.array((0.7,0.7,0.7))+np.array((0.7,0.7,0.7))  
    image=image.clip(0,1)  
    return image 
# Get some random training images
dataiter=iter(trainloader)  
images,labels=next(dataiter)  
# Plot the figures
figure=plot.figure(figsize=(25,4)) 
for idx in np.arange(20):  
    axis=figure.add_subplot(2,10,idx+1)  
    plot.imshow(im_convert(images[idx]))  
    axis.set_title(classes[labels[idx].item()])   

# Create a model class
class LeNet(nn.Module):  
        def __init__(self):  
            super().__init__()  
            self.conv1=nn.Conv2d(3,16,3,1, padding=1)  
            self.conv2=nn.Conv2d(16,32,3,1, padding=1)  
            self.conv3=nn.Conv2d(32,64,3,1, padding=1)     
            self.fully1=nn.Linear(4*4*64,500)  
            self.dropout1=nn.Dropout(0.5)   
            self.fully2=nn.Linear(500,10)  
        def forward(self,y):  
            y=func.relu(self.conv1(y))  
            y=func.max_pool2d(y,2,2)  
            y=func.relu(self.conv2(y))  
            y=func.max_pool2d(y,2,2)  
            y=func.relu(self.conv3(y))  
            y=func.max_pool2d(y,2,2) 
            #Reshaping the output into desired shape   
            y=y.view(-1,4*4*64) 
            #Applying relu activation function to our first fully connected layer 
            y=func.relu(self.fully1(y))  
            y=self.dropout1(y)  
            # No activation here: nn.CrossEntropyLoss applies log-softmax to the raw logits internally  
            y=self.fully2(y)    
            return y      

# Creating an instance of the model
model=LeNet().to(device) 

# Define Loss Function
criterion=nn.CrossEntropyLoss()  

# Use the optimizer
optimizer=torch.optim.Adam(model.parameters(),lr=0.001)

# Specify the number of epochs
epochs=5  
losshistory=[]  
correcthistory=[]  
valloss_history=[]  
valcorrect_history=[]  

# Train and validate the model
for e in range(epochs):  
    loss=0.0  
    correct=0.0  
    valloss=0.0  
    valcorrect=0.0  
    # Loop over batches of inputs and labels
    for inputs,labels in trainloader:  
        # Move the batch of images to the device
        inputs=inputs.to(device)  
        labels=labels.to(device)  
        outputs=model(inputs)  
        # Calculate the categorical cross entropy loss
        loss1=criterion(outputs,labels)  
        optimizer.zero_grad()  
        loss1.backward()  
        optimizer.step()  
        # Find the accuracy of our network
        _,preds=torch.max(outputs,1) 
        # Keep track of the loss at every epoch 
        loss+=loss1.item()  
        correct+=torch.sum(preds==labels.data)  
    # The else clause runs once the training loop finishes without a break
    else:  
        with torch.no_grad():  
            # Loop over batches of validation inputs and labels
            for val_input,val_labels in valloader:  
                # Move the batch of images to the device
                val_input=val_input.to(device)  
                val_labels=val_labels.to(device)  
                val_outputs=model(val_input)  
                # Calculate the categorical cross entropy loss
                val_loss1=criterion(val_outputs,val_labels)   
                _,val_preds=torch.max(val_outputs,1)  
                # Calculate the validation loss and accuracy
                valloss+=val_loss1.item()  
                valcorrect+=torch.sum(val_preds==val_labels.data)  
        epoch_loss=loss/len(trainloader)  
        # Divide by the number of training samples to get a fractional accuracy
        epoch_acc=correct.float()/len(trainloader.dataset)  
        losshistory.append(epoch_loss)  
        correcthistory.append(epoch_acc)  
        # Calculate the validation epoch loss
        valepoch_loss=valloss/len(valloader)  
        valepoch_acc=valcorrect.float()/len(valloader.dataset)  
        valloss_history.append(valepoch_loss)  
        valcorrect_history.append(valepoch_acc)  
        # Print the training and validation loss and accuracy
        print('training_loss:{:.4f}, training_acc:{:.4f}'.format(epoch_loss,epoch_acc.item()))  
        print('validation_loss:{:.4f}, validation_acc:{:.4f}'.format(valepoch_loss,valepoch_acc.item())) 

Output:

After running the above code, we get the following output, in which the training loss, validation loss, and a grid of CIFAR10 sample images are printed on the screen.


So, with this, we understood PyTorch hyperparameter tuning.
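Note that the example above trains a single fixed configuration (lr=0.001, batch_size=100). To actually tune, we can loop over candidate values and keep the best one. Below is a minimal grid-search sketch, assuming the LeNet class, criterion, device, traindataset, and valdataset from the example above are already defined; the evaluate_config helper is illustrative shorthand, not part of the original example.

# Minimal grid search over learning rate and batch size
def evaluate_config(lr, batch_size, epochs=2):
    model = LeNet().to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loader = torch.utils.data.DataLoader(traindataset, batch_size=batch_size, shuffle=True)
    vloader = torch.utils.data.DataLoader(valdataset, batch_size=batch_size, shuffle=False)
    for _ in range(epochs):
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            criterion(model(inputs), labels).backward()
            optimizer.step()
    # Score this configuration by validation accuracy
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in vloader:
            inputs, labels = inputs.to(device), labels.to(device)
            preds = model(inputs).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

best_acc, best_config = 0.0, None
for lr in [0.01, 0.001]:
    for batch_size in [64, 128]:
        acc = evaluate_config(lr, batch_size)
        if acc > best_acc:
            best_acc, best_config = acc, {"lr": lr, "batch_size": batch_size}
print(best_config, best_acc)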

Read: PyTorch Linear Regression

PyTorch lightning hyperparameter tuning

In this section, we will learn about PyTorch Lightning hyperparameter tuning in Python.

PyTorch Lightning is a lightweight, high-performance framework that makes it fast to iterate on deep learning models.

PyTorch hyperparameter tuning can make the difference between an average model and a highly accurate one, and tools such as Ray Tune integrate directly with Lightning to automate the search.

Code:

In the following code, we will import all the necessary libraries: import torch, from torch.nn import functional as F, import pytorch_lightning as pl, from pl_bolts.datamodules import MNISTDataModule, import os, and the TuneReportCallback from Ray Tune.

  • class LightningClassifier(pl.LightningModule): Here we define a model class with the help of the __init__() and forward() methods.
  • def configure_optimizers(self): is used to configure the optimizer.
  • def training_step(self, trainingbatch, batch_index): is used to define the training step.
  • def validation_step(self, validationbatch, batch_index): is used to define the validation step.
  • MNISTDataModule(data_dir=data_dir).prepare_data() is used to download the data.
  • print(analysis.best_config) is used to print the best configuration found by the search.
# Importing Libraries
import torch
from torch.nn import functional as F
import pytorch_lightning as pl
from pl_bolts.datamodules import MNISTDataModule
import os
from ray.tune.integration.pytorch_lightning import TuneReportCallback

# Define a model class
class LightningClassifier(pl.LightningModule):
    def __init__(self, config, data_dir=None):
        super(LightningClassifier, self).__init__()
        self.data_dir = data_dir or os.getcwd()
        self.lr = config["lr"]
        layer, layer_1 = config["layer"], config["layer_1"]
        # mnist images are (1, 28, 28) (channels, width, height)
        self.layer = torch.nn.Linear(28 * 28, layer)
        self.layer_1 = torch.nn.Linear(layer, layer_1)
        self.layer_2 = torch.nn.Linear(layer_1, 10)  # 10 output classes for MNIST
        # Note: newer Lightning versions use torchmetrics.Accuracy() instead of pl.metrics
        self.accuracy = pl.metrics.Accuracy()

    def forward(self, m):
        batchsize, channels, width, height = m.size()
        m = m.view(batchsize, -1)
        m = self.layer(m)
        m = torch.relu(m)
        m = self.layer_1(m)
        m = torch.relu(m)
        m = self.layer_2(m)
        m = torch.log_softmax(m, dim=1)
        return m

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

    def training_step(self, trainingbatch, batch_index):
        m, n = trainingbatch
        logits = self.forward(m)
        loss = F.nll_loss(logits, n)
        acc = self.accuracy(logits, n)
        self.log("ptl/train_loss", loss)
        self.log("ptl/train_accuracy", acc)
        return loss

    def validation_step(self, validationbatch, batch_index):
        m, n = validationbatch
        logits = self.forward(m)
        loss = F.nll_loss(logits, n)
        acc = self.accuracy(logits, n)
        return {"val_loss": loss, "val_accuracy": acc}

    def validation_epoch_end(self, outputs):
        avg_loss = torch.stack(
            [m["val_loss"] for m in outputs]).mean()
        avg_acc = torch.stack(
            [m["val_accuracy"] for m in outputs]).mean()
        self.log("ptl/val_loss", avg_loss)
        self.log("ptl/val_accuracy", avg_acc)

def training_mnist(config, data_dir=None, num_epochs=10, num_gpus=0):
    model = LightningClassifier(config, data_dir)
    dl = MNISTDataModule(
        data_dir=data_dir, num_workers=1, batch_size=config["batch_size"])
    metrices = {"loss": "ptl/val_loss", "acc": "ptl/val_accuracy"}
    trainer = pl.Trainer(
        max_epochs=num_epochs,
        gpus=num_gpus,
        progress_bar_refresh_rate=0,
        callbacks=[TuneReportCallback(metrices, on="validation_end")])
    trainer.fit(model, dl)

import tempfile
from ray import tune

numsamples = 12
numepochs = 12
# Set this higher if using GPUs
gpus_per_trial = 0

data_dir = os.path.join(tempfile.gettempdir(), "mnist_data_")
# Download data
MNISTDataModule(data_dir=data_dir).prepare_data()

config = {
    "layer": tune.choice([34, 66, 130]),
    "layer_1": tune.choice([66, 130, 260]),
    "lr": tune.loguniform(1e-4, 1e-1),
    "batch_size": tune.choice([34, 66, 130]),
}

trainable = tune.with_parameters(
    training_mnist,
    data_dir=data_dir,
    num_epochs=numepochs,
    num_gpus=gpus_per_trial)

analysis = tune.run(
    trainable,
    resources_per_trial={
        "cpu": 1,
        "gpu": gpusper_trial
    },
    metric="loss",
    mode="min",
    config=config,
    num_samples=numsamples,
    name="tune_mnist")

print(analysis.best_config)

Output:

After running the above code, we get the following output, in which we can see the best values found for the batch_size, layer, layer_1, and learning rate hyperparameters printed on the screen.


This is how PyTorch Lightning hyperparameter tuning, together with Ray Tune, makes it fast to iterate on models.
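Ray Tune can also stop unpromising trials early with a scheduler such as ASHA. Here is a hedged sketch that reuses the trainable, config, numsamples, numepochs, and gpus_per_trial defined above:

# Optional: prune bad trials early with the ASHA scheduler
from ray.tune.schedulers import ASHAScheduler

scheduler = ASHAScheduler(
    max_t=numepochs,         # maximum training iterations per trial
    grace_period=1,          # minimum iterations before a trial can be stopped
    reduction_factor=2)      # how aggressively trials are halved

analysis = tune.run(
    trainable,
    resources_per_trial={"cpu": 1, "gpu": gpus_per_trial},
    metric="loss",
    mode="min",
    config=config,
    num_samples=numsamples,
    scheduler=scheduler,
    name="tune_mnist_asha")
print(analysis.best_config)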

Read: PyTorch Activation Function

PyTorch geometric hyperparameter tuning

In this section, we will learn about PyTorch geometric hyperparameter tuning in Python.

In PyTorch geometric (and PyTorch in general), hyperparameters are the values passed as arguments to the constructor of the estimator or model classes, as the short sketch below shows.
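Below is a minimal, purely illustrative sketch (the SmallNet class and its argument names are assumptions, not part of the example that follows):

import torch
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self, hidden_size=32, dropout=0.25):  # constructor arguments are the hyperparameters
        super().__init__()
        self.fc = nn.Linear(28 * 28, hidden_size)
        self.drop = nn.Dropout(dropout)
        self.out = nn.Linear(hidden_size, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)                      # flatten the image
        x = self.drop(torch.relu(self.fc(x)))
        return self.out(x)

# Each distinct (hidden_size, dropout) pair is one hyperparameter configuration
model = SmallNet(hidden_size=64, dropout=0.5)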

Code:

In the following code, we will import all the necessary libraries: import torch, import torchvision, and from torchvision import transforms.

  • transform = transforms.Compose([transforms.ToTensor()]): Here we use the Compose() method of transforms, which lets us chain multiple transforms together; in this case it only applies transforms.ToTensor().
  • trainset = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform) is used to load the dataset.
  • class NN(nn.Module): is used to define the model class by using the __init__() and forward() methods.
  • training_size, validation_size = 48000, 12000 is used to describe the training size and validation size.
  • self.conv = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1) takes 1 input channel and applies 32 filters, each with kernel size 3.
  • self.fc = nn.Linear(128*3*3,32) is used as a fully connected layer.
  • print('training_size: ', training_size) is used to print the training size.
  • print('validation_size: ', validation_size) is used to print the validation size.
  • training_loader = DataLoader(training, batch_size=100, shuffle=False) is used to create the training DataLoader.
  • criterion = nn.CrossEntropyLoss() is used to define the loss.
  • optimizer = optim.Adam(net.parameters(), lr=0.001) is used to initialize the optimizer.
  • plot.figure(figsize=(16,6)) is used to plot the figure.
  • plot.show() is used to display the figure on the screen.
  • print(f'Test Accuracy: {correct/total:.3f} \n' ) is used to print the test accuracy.
  • classes = ('T-Shirt/Top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot'): Here we declare the list of FashionMNIST classes in label order.
  • print('Accuracy of %5s : %2d %%' % ( classes[n], 100 * class_correct[n] / class_total[n])) is used to print the per-class accuracy using the print() function.
# Importing libraries
import torch
import torchvision
from torchvision import transforms

torch.manual_seed(0)

# Using compose() method
transform = transforms.Compose([transforms.ToTensor()])

# Load dataset
trainset = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)


from statistics import mean
import numpy as np
import matplotlib.pyplot as plot
import pandas as pd

import torch.nn as nn
import torch.nn.functional as fun
from torch.utils.data import DataLoader, TensorDataset

# Define the model class
class NN(nn.Module):
    def __init__(self):
        super(NN, self).__init__()

        # Input 1 channel with 32 filters each with kernel_size=3 
        self.conv = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.conv1 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)

        self.batch_norm = nn.BatchNorm2d(32)
        self.batch_norm1 = nn.BatchNorm2d(64)
        self.batch_norm2 = nn.BatchNorm2d(128)

        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

        self.dropout25 = nn.Dropout2d(p=0.25)

        # Fully connected layer
        self.fc = nn.Linear(128*3*3,32)
        self.fc1 = nn.Linear(32, 10)

    def forward(self, y):
        y = self.pool(fun.relu(self.batch_norm(self.conv(y))))
        y = self.dropout25(y)
        y = self.pool(fun.relu(self.batch_norm1(self.conv1(y))))
        y = self.dropout25(y)
        y = self.pool(fun.relu(self.batch_norm2(self.conv2(y))))
        y = self.dropout25(y)

        y = y.view(y.size(0),-1)
        y = self.fc(y)
        y = self.fc1(y)

        return y

# Describe the variables
training_size, validation_size = 48000, 12000

scale_tensor = trainset.data / 255.0
scale_tensor = scale_tensor.view(scale_tensor.size(0), 1, 28, 28)
scale_trainset = TensorDataset(scale_tensor, trainset.targets)

training, validation = torch.utils.data.random_split(scale_trainset, [training_size, validation_size])

print('training_size: ', training_size)
print('validation_size: ', validation_size)

training_loader = DataLoader(training, batch_size=100, shuffle=False)
validation_loader = DataLoader(validation, batch_size=100, shuffle=False)

net = NN()

# Use the GPU if available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'

net.to(device)

import torch.optim as optim

# Define Loss function
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

epochs = 30
training_losses = list()
validation_losses = list()
validation_accuracies = list()
batch_lst = list()

batches = 0
validation_loss = 0

# Train the NN: loop over the dataset multiple times
for epoch in range(1,epochs+1): 

    running_loss = 0.0
    for n, data in enumerate(training_loader):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        inputs = inputs.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward() 
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        # Evaluate on the validation set every 100 mini-batches
        if n % 100 == 99:    

            net.eval()
            
            correct = 0


            for m, val_data in enumerate(validation_loader):
                valid_X, valid_y = val_data

                valid_X = valid_X.to(device)
                valid_y = valid_y.to(device)

                outputs = net(valid_X)

                v_loss = criterion(outputs, valid_y)
                validation_loss += v_loss.item()

                preds = outputs.data.max(1, keepdim=True)[1]

                correct += preds.eq(valid_y.view_as(preds)).cpu().sum().item()

            log = f"epoch: {epoch} {n+1} " \
                  f"train_loss: {running_loss / 100:.3f} " \
                  f"val_loss: {validation_loss / 100:.3f} " \
                  f"Val Acc: {correct/len(validation_loader.dataset):.3f}"

            training_losses.append(running_loss / 100)
            validation_losses.append(validation_loss / 100)
            validation_accuracies.append(correct/len(validation_loader.dataset))
            batches += 100
            batch_lst.append(batches)

            # Reset the validation loss accumulator
            validation_loss = 0

            print(log)     

            running_loss = 0.0

            net.train()


print('Finished Training')

# Plot the figure
plot.figure(figsize=(16,6))
plot.plot(batch_lst, training_losses, '-o', label='Training loss')
plot.plot(batch_lst, validation_losses, '-o', label='Validation loss')
plot.legend()
plot.title('Learning curves')
plot.xlabel('Batches')
plot.ylabel('Loss')
plot.xticks(batch_lst,rotation = 90)
plot.tight_layout()

plot.savefig("result.png")

plot.show()

scale_test_tensor = testset.data / 255.0
scale_test_tensor = scale_test_tensor.view(scale_test_tensor.size(0), 1, 28, 28)
scale_testset = TensorDataset(scale_test_tensor, testset.targets)

testing_loader = DataLoader(scale_testset, batch_size=100, shuffle=False)

correct = 0
total = 0

net.eval()

for data in testing_loader:
    testing_X, testing_y = data
    
    testing_X = testing_X.to(device)
    testing_y = testing_y.to(device)

    outputs = net(testing_X.float())
    _, predicted = torch.max(outputs.data, 1)
    total += testing_y.size(0)
    correct += (predicted == testing_y).sum().item()

print(f'Test Accuracy: {correct/total:.3f} \n' )

classes = ('T-Shirt/Top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

class_correct = list(0. for n in range(10))
class_total = list(0. for n in range(10))
with torch.no_grad():
    for data in testing_loader:
        testing_X, testing_y = data

        testing_X = testing_X.to(device)
        testing_y = testing_y.to(device)

        # Recompute the outputs for this batch before taking the argmax
        outputs = net(testing_X.float())
        _, predicted = torch.max(outputs, 1)
        d = (predicted == testing_y).squeeze()
        for n in range(100):
            label = testing_y[n]
            class_correct[label] += d[n].item()
            class_total[label] += 1


for n in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[n], 100 * class_correct[n] / class_total[n]))

Output:

After running the above code, we get the following output, in which we can see that the PyTorch geometric hyperparameter tuning accuracy values are printed on the screen.


So, with this, we understood how PyTorch geometric hyperparameter tuning works.


So, in this tutorial, we discussed PyTorch hyperparameter tuning and covered different examples related to its implementation. Here is the list of topics we covered:

  • PyTorch hyperparameter tuning
  • PyTorch lightning hyperparameter tuning
  • PyTorch geometric hyperparameter tuning