PyTorch Early Stopping + Examples

In this Python tutorial, we will learn about early stopping in PyTorch and work through several examples of it. We will cover the following topics.

  • PyTorch early stopping
  • PyTorch early stopping example
  • PyTorch early stopping scheduler
  • PyTorch lightning early stopping
  • PyTorch ignite early stopping
  • PyTorch geometric early stopping
  • PyTorch lstm early stopping
  • PyTorch early stopping callback
  • PyTorch validation early stopping

PyTorch early stopping

In this section, we will learn about early stopping in PyTorch.

Early stopping is a technique that avoids overfitting on the training dataset by keeping track of the validation loss and ending training once that loss stops improving. Here we will discuss how to use early stopping with PyTorch.
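
The idea can be sketched in a few lines of plain Python. This is a minimal illustration rather than a library API: the function name stop_epoch and the loss values are assumptions made up for the example.

```python
def stop_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Hypothetical helper: training stops once the validation loss has
    failed to improve on its best value for `patience` epochs in a row.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss        # new best value: reset the counter
            bad_epochs = 0
        else:
            bad_epochs += 1    # no improvement in this epoch
            if bad_epochs >= patience:
                return epoch
    return None


# Illustrative (made-up) validation losses: improving, then getting worse
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.61, 0.70]
print(stop_epoch(losses, patience=3))  # 6
```

Here the best loss (0.60) is reached in epoch 3, and the three following epochs fail to beat it, so training would end after epoch 6.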

Syntax:

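Core PyTorch does not provide an early stopping class. The signature below is that of the EarlyStopping callback in PyTorch Lightning (pytorch_lightning.callbacks.EarlyStopping):

EarlyStopping(monitor=None, min_delta=0.0, patience=3, verbose=False, mode='min', strict=True, check_finite=True, stopping_threshold=None, divergence_threshold=None, check_on_train_epoch_end=None)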

Parameters:

  • monitor is the quantity to be monitored, for example the validation loss.
  • min_delta is the minimum change in the monitored quantity that qualifies as an improvement; smaller changes count as no improvement.
  • patience is the number of checks with no improvement after which training will be stopped.
  • verbose turns the verbosity mode on or off.
  • mode: there are two modes, min and max.
    • In min mode, training stops when the monitored quantity has stopped decreasing.
    • In max mode, training stops when the monitored quantity has stopped increasing.
  • strict decides whether to crash the training if monitor is not found in the validation metrics.
  • check_finite, when True, stops training if the monitored quantity becomes NaN or infinite.
  • stopping_threshold stops training immediately once the monitored quantity reaches this threshold.
  • divergence_threshold stops training as soon as the monitored quantity becomes worse than this threshold.
  • check_on_train_epoch_end decides whether to run early stopping at the end of the training epoch; if it is False, the check runs at the end of validation.
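
To make the interplay of mode, min_delta and the two thresholds concrete, here is a small plain-Python sketch. The function check and its return values are hypothetical names chosen for this illustration; it follows the parameter descriptions above and is not the library's implementation.

```python
import math
import operator


def check(value, best, mode="min", min_delta=0.0, check_finite=True,
          stopping_threshold=None, divergence_threshold=None):
    """Classify one monitored value as 'stop', 'improved' or 'no_improvement'."""
    # better(a, b) is True when a beats b in the chosen mode
    better = operator.lt if mode == "min" else operator.gt
    # In min mode the value must fall by more than min_delta,
    # in max mode it must rise by more than min_delta
    delta = -abs(min_delta) if mode == "min" else abs(min_delta)

    if check_finite and not math.isfinite(value):
        return "stop"            # NaN or infinite metric
    if stopping_threshold is not None and better(value, stopping_threshold):
        return "stop"            # goal reached
    if divergence_threshold is not None and better(divergence_threshold, value):
        return "stop"            # worse than the divergence limit
    if better(value - delta, best):
        return "improved"
    return "no_improvement"


print(check(0.50, best=0.60))                           # improved
print(check(0.595, best=0.60, min_delta=0.01))          # no_improvement
print(check(float("nan"), best=0.60))                   # stop
print(check(0.05, best=0.60, stopping_threshold=0.1))   # stop
print(check(5.0, best=0.60, divergence_threshold=2.0))  # stop
```

A result of 'no_improvement' is what the patience counter counts; training stops once it has been seen patience times in a row.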


PyTorch early stopping example

In this section, we will learn how to implement early stopping with the help of an example in Python.

Early stopping prevents the neural network from overfitting by ending training once the validation loss stops improving.

Code:

In the following code, we will import the required libraries, train a network on the MNIST data, and apply early stopping during training.

  • def traindata(device, model, epochs, optimizer, loss_function, train_loader, valid_loader): defines the training loop.
  • optimizer.zero_grad() resets the gradients to zero before each batch.
  • output = model(input.view(input.shape[0], -1)) flattens each image and gets the output of the model.
  • loss = loss_function(output, label) is used to calculate the loss.
  • print('[{}/{}, {}/{}] loss: {:.8}'.format(epoch, epochs, times, len(train_loader), loss.item())) prints the progress of the model.
  • current_loss = validation(model, device, valid_loader, loss_function) is used to calculate the current validation loss.
  • print('The Current Loss:', current_loss) is used to print the current loss.
  • print('Accuracy:', correct / total) is used to print the accuracy of the model.
  • loss_function = nn.NLLLoss() creates the negative log-likelihood loss function.
  • optimizer = optim.Adam(model.parameters(), lr=lr) creates the Adam optimizer for the model parameters.
  • transform = transforms.Compose() builds the transform that is applied to the data.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as data
from torchvision import datasets, transforms



# Model architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.main = nn.Sequential(
            # 28 * 28 = 784 pixels per flattened MNIST image
            nn.Linear(in_features=784, out_features=130),
            nn.ReLU(),
            nn.Linear(in_features=130, out_features=66),
            nn.ReLU(),
            # 10 outputs, one per digit class
            nn.Linear(in_features=66, out_features=10),
            nn.LogSoftmax(dim=1)
        )

    def forward(self, input):
        return self.main(input)


# Train
def traindata(device, model, epochs, optimizer, loss_function, train_loader, valid_loader):
    # Early stopping
    last_loss = 100
    patience = 2
    trigger_times = 0

    for epoch in range(1, epochs+1):
        model.train()

        for times, data in enumerate(train_loader, 1):
            input = data[0].to(device)
            label = data[1].to(device)

            # Zero the gradients
            optimizer.zero_grad()

            # Forward and backward propagation
            output = model(input.view(input.shape[0], -1))
            loss = loss_function(output, label)
            loss.backward()
            optimizer.step()

            # Show progress
            if times % 100 == 0 or times == len(train_loader):
                print('[{}/{}, {}/{}] loss: {:.8}'.format(epoch, epochs, times, len(train_loader), loss.item()))

        # Early stopping
        current_loss = validation(model, device, valid_loader, loss_function)
        print('The Current Loss:', current_loss)

        if current_loss > last_loss:
            trigger_times += 1
            print('Trigger Times:', trigger_times)

            if trigger_times >= patience:
                print('Early stopping!\nStart to test process.')
                return model

        else:
            print('trigger times: 0')
            trigger_times = 0

        last_loss = current_loss

    return model


def validation(model, device, valid_loader, loss_function):

    model.eval()
    loss_total = 0

    # Test validation data
    with torch.no_grad():
        for data in valid_loader:
            input = data[0].to(device)
            label = data[1].to(device)

            output = model(input.view(input.shape[0], -1))
            loss = loss_function(output, label)
            loss_total += loss.item()

    return loss_total / len(valid_loader)


def test(device, model, test_loader):

    model.eval()
    total = 0
    correct = 0

    with torch.no_grad():
        for data in test_loader:
            input = data[0].to(device)
            label = data[1].to(device)

            output = model(input.view(input.shape[0], -1))
            _, predicted = torch.max(output.data, 1)

            total += label.size(0)
            correct += (predicted == label).sum().item()

    print('Accuracy:', correct / total)


def main():
    # GPU device
    device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
    print('Device state:', device)

    epochs = 100
    batch_size = 66
    lr = 0.004
    loss_function = nn.NLLLoss()
    model = Net().to(device)
    optimizer = optim.Adam(model.parameters(), lr=lr)

    # Transform
    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.5,), (0.5,))]
    )

    # Data
    trainset = datasets.MNIST(root='MNIST', download=True, train=True, transform=transform)
    testset = datasets.MNIST(root='MNIST', download=True, train=False, transform=transform)
   
    trainset_size = int(len(trainset) * 0.8)
    validset_size = len(trainset) - trainset_size
    trainset, validset = data.random_split(trainset, [trainset_size, validset_size])

    trainloader = data.DataLoader(trainset, batch_size=batch_size, shuffle=True)
    testloader = data.DataLoader(testset, batch_size=batch_size, shuffle=False)
    validloader = data.DataLoader(validset, batch_size=batch_size, shuffle=True)

    # Train
    model = traindata(device, model, epochs, optimizer, loss_function, trainloader, validloader)

    # Test
    test(device, model, testloader)


if __name__ == '__main__':
    main()

Output:

After running the above code, we get the following output in which we can see that early stopping is applied during training.

PyTorch early stopping example
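
The training loop above compares each validation loss with the loss of the previous epoch. A common variant compares with the best loss seen so far instead. The sketch below contrasts the two rules on made-up numbers; both function names are hypothetical, and the second rule is an assumption added for comparison, not part of the example above.

```python
def stops_vs_last(losses, patience=2):
    """Rule used in the example above: compare with the previous epoch's loss."""
    last, triggers = 100, 0
    for epoch, loss in enumerate(losses, start=1):
        if loss > last:
            triggers += 1
            if triggers >= patience:
                return epoch
        else:
            triggers = 0
        last = loss
    return None


def stops_vs_best(losses, patience=2):
    """Variant: compare with the best (lowest) loss seen so far."""
    best, triggers = float("inf"), 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, triggers = loss, 0
        else:
            triggers += 1
            if triggers >= patience:
                return epoch
    return None


# Made-up validation losses that zigzag upwards after the second epoch
losses = [0.50, 0.40, 0.45, 0.44, 0.48, 0.47, 0.52, 0.51]
print(stops_vs_last(losses))  # None
print(stops_vs_best(losses))  # 4
```

With the first rule, every small dip resets the counter even though the loss keeps drifting away from its minimum, so training never stops on this sequence. The second rule stops at epoch 4.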


PyTorch early stopping scheduler

In this section, we will learn how the PyTorch early stopping scheduler works in Python.

  • Early stopping is used to prevent the neural network from overfitting while training on the data.
  • The early stopping scheduler keeps track of the validation loss; if the loss stops decreasing for several epochs, training stops.
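
One way to package this bookkeeping is a small object that is stepped once per epoch, in the same style as a learning rate scheduler. The class below is a minimal sketch under that assumption; EarlyStopper and its step method are hypothetical names, not part of PyTorch, and the loss values are made up.

```python
class EarlyStopper:
    """Hypothetical helper: call step(val_loss) once per epoch."""

    def __init__(self, patience=6, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.epochs_no_improve = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.epochs_no_improve = 0
        else:
            self.epochs_no_improve += 1
        return self.epochs_no_improve >= self.patience


stopper = EarlyStopper(patience=2)
history = []
for epoch, val_loss in enumerate([0.80, 0.60, 0.61, 0.59, 0.60, 0.62, 0.50]):
    history.append(epoch)  # stands in for one epoch of training
    if stopper.step(val_loss):
        print("Early stopping at epoch", epoch)  # Early stopping at epoch 5
        break
```

In a real training loop, the value passed to step would be the average validation loss of the epoch that has just finished.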

Code:

In the following code, we will import the required libraries and then train and validate a model using early stopping.

  • traindataset = pds.read_csv('train.csv',dtype = np.float32) is used to load the train dataset.
  • testdataset = pds.read_csv('test.csv',dtype = np.float32) is used to load the test dataset.
  • featuresnumpy = traindataset.loc[:,traindataset.columns != "label"].values/255 is used to normalize the data.
  • featurestrain, featurestest, targetstrain, targetstest = train_test_split(featuresnumpy, targetsnumpy,test_size = 0.2,random_state = 42) is used to split the data into a train set and a test set.
  • train_loader = torch.utils.data.DataLoader(dataset=train, batch_size=batch_size, shuffle=True) is used to load the train data in batches.
  • self.layers_dim = layers_dim stores the number of recurrent layers.
  • error = nn.CrossEntropyLoss() creates the cross entropy loss.
  • images = images.view(-1, seq_dim, inputs_dim) reshapes each image into a sequence of rows.
  • _, predicted = torch.max(outputs.data, 1) gets the predicted class, which is the index of the maximum output.
  • total += labels.size(0) accumulates the total number of labels.

import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
import argparse
import numpy as np
import pandas as pds
from sklearn.model_selection import train_test_split
from torch.autograd import Variable

traindataset = pds.read_csv('train.csv',dtype = np.float32)

testdataset = pds.read_csv('test.csv',dtype = np.float32)

targetsnumpy = traindataset.label.values
featuresnumpy = traindataset.loc[:,traindataset.columns != "label"].values/255 


featurestrain, featurestest, targetstrain, targetstest = train_test_split(featuresnumpy,
                                                                             targetsnumpy,
                                                                             test_size = 0.2,
                                                                             random_state = 42) 

# create feature and targets tensor for train set
featuresTrain = torch.from_numpy(featurestrain)
targetsTrain = torch.from_numpy(targetstrain).type(torch.LongTensor) 


# create feature and targets tensor for test set.
featuresTest = torch.from_numpy(featurestest)
targetsTest = torch.from_numpy(targetstest).type(torch.LongTensor) 

# batch_size, epoch and iteration
batch_size = 100
n_iters = 10000
num_epochs = n_iters / (len(featurestrain) / batch_size)
num_epochs = int(num_epochs)

# Pytorch train and test sets
train = torch.utils.data.TensorDataset(featuresTrain,targetsTrain)
test = torch.utils.data.TensorDataset(featuresTest,targetsTest)

# data loader
train_loader = torch.utils.data.DataLoader(dataset=train, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test, 
                                          batch_size=batch_size, 
                                          shuffle=False)

# Create RNN Model

class RnnModel(nn.Module):
    def __init__(self, inputs_dim, hidden_dim, layers_dim, outputs_dim):
        super(RnnModel, self).__init__()
        # Hidden dimensions
        self.hidden_dim = hidden_dim
        self.layers_dim = layers_dim

        self.rnn = nn.RNN(inputs_dim, hidden_dim, layers_dim, batch_first=True, nonlinearity='tanh')

        self.fc = nn.Linear(hidden_dim, outputs_dim)

    def forward(self, X):
        # Initial hidden state: (number of layers, batch size, hidden size)
        h = torch.zeros(self.layers_dim, X.size(0), self.hidden_dim)
        outs, hn = self.rnn(X, h)
        # Classify from the output of the last time step
        outs = self.fc(outs[:, -1, :])
        return outs

# batch_size, epoch and iteration
batchsize = 100
niters = 3000
num_epochs = niters / (len(featurestrain) / batchsize)
num_epochs = int(num_epochs)

# Create RNN
inputs_dim = 28  
hidden_dim = 100 
layers_dim = 5     
outputs_dim = 10   # one output per digit class

model = RnnModel(inputs_dim, hidden_dim, layers_dim, outputs_dim) 
error = nn.CrossEntropyLoss()

# SGD Optimizer
learning_rate = 0.07
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

seq_dim = 28
count = 0
min_val_loss = float('inf')
correct = 0
total = 0
iter = 0
iter_array = []
loss_array = []
accuracy_array = []
n_epochs_stop = 6
epochs_no_improve = 0
early_stop = False
for epoch in range(num_epochs):
    # Train for one epoch
    model.train()
    for images, labels in train_loader:
        images = images.view(-1, seq_dim, inputs_dim)

        # Clear gradients
        optimizer.zero_grad()

        # Forward propagation
        outputs = model(images)
        loss = error(outputs, labels)

        # Calculating gradients
        loss.backward()

        # Update parameters
        optimizer.step()
        iter += 1

    # Evaluate on the held-out split once per epoch
    model.eval()
    val_loss = 0.0
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.view(-1, seq_dim, inputs_dim)
            outputs = model(images)
            val_loss += error(outputs, labels).item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    # Average validation loss and accuracy for this epoch
    val_loss = val_loss / len(test_loader)
    accuracy = 100 * correct / total

    # Print loss
    iter_array.append(iter)
    loss_array.append(val_loss)
    accuracy_array.append(accuracy)
    print('Epoch: {}. Iteration: {}. loss: {}. accuracy: {}'.format(epoch, iter, val_loss, accuracy))

    # If the validation loss is at a minimum
    if val_loss < min_val_loss:
        epochs_no_improve = 0
        min_val_loss = val_loss
    else:
        epochs_no_improve += 1

    # Check early stopping condition
    if epoch > 5 and epochs_no_improve >= n_epochs_stop:
        print('Early stopping!')
        early_stop = True
        break

if early_stop:
    print("Stopped")

Output:

In the following output, we can see that the early stopping mechanism prevents the neural network from overfitting.

PyTorch early stopping scheduler


PyTorch lightning early stopping

In this section, we will learn how early stopping works in PyTorch Lightning.

PyTorch Lightning early stopping is used to stop training early in order to avoid overfitting on the training dataset.

Code:

In the following code, we will import the required libraries and stop training early to avoid overfitting.

  • torch.nn.Linear() creates a fully connected layer with the given input and output sizes.
  • torch.relu() is used as an activation function.
  • F.cross_entropy() computes the cross-entropy loss between the model outputs and the target labels.
  • traindataset = MNIST(PATH_DATASETS, train=True, download=True, transform=transforms.ToTensor()) is used to create the train dataset.
  • trainloader = DataLoader(traindataset, batch_size=BATCHSIZE) is used to load the train data.
  • trainer.fit() is used to fit the model on the train data.

import os

import torch
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.callbacks import EarlyStopping
from torch import nn
from torch.nn import functional as F
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST

PATH_DATASETS = os.environ.get("PATH_DATASETS", ".")
AVAILGPUS = min(1, torch.cuda.device_count())
BATCHSIZE = 250 if AVAILGPUS else 60


class MNISTModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(28 * 28, 10)

    def forward(self, X):
        return torch.relu(self.l1(X.view(X.size(0), -1)))

    def training_step(self, batch, batch_nb):
        X, y = batch
        loss = F.cross_entropy(self(X), y)
        return loss

    def validation_step(self, batch, batch_nb):
        X, y = batch
        # Log the metric that the EarlyStopping callback monitors
        self.log("val_loss", F.cross_entropy(self(X), y))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.02)


# Init our model
mnistmodel = MNISTModel()

# Init DataLoaders from the MNIST dataset.
# Assumption: 5,000 of the 60,000 training images are held out for validation.
dataset = MNIST(PATH_DATASETS, train=True, download=True, transform=transforms.ToTensor())
traindataset, valdataset = random_split(dataset, [55000, 5000])
trainloader = DataLoader(traindataset, batch_size=BATCHSIZE)
valloader = DataLoader(valdataset, batch_size=BATCHSIZE)

# Stop when val_loss has not improved for 3 validation checks in a row
earlystopping = EarlyStopping(monitor="val_loss", mode="min", patience=3)

# Initialize a trainer
trainer = Trainer(
    max_epochs=20,
    callbacks=[earlystopping],
)

# Train the model 
trainer.fit(mnistmodel, trainloader, valloader)

Output:

After running the above code, we get the following output in which we can see that training is stopped early to avoid overfitting on the training dataset.

PyTorch lightning early stopping


PyTorch ignite early stopping

In this section, we will learn about early stopping in PyTorch Ignite.

Ignite early stopping is a handler that stops the training after a given number of events in which no improvement is shown.

Syntax:

The syntax of Ignite early stopping is as follows:

ignite.handlers.early_stopping.EarlyStopping(patience, score_function, trainer, min_delta=0.0, cumulative_delta=False)

Parameters:

  • patience is the number of events to wait; if no improvement is shown in that many events, the training stops.
  • score_function is a function that takes a single argument, the engine the handler is attached to, and returns a float score. A higher score counts as an improvement.
  • trainer is the trainer engine whose run is stopped when no improvement is seen.
  • min_delta is the minimum increase in the score to qualify as an improvement.
  • cumulative_delta: if True, min_delta defines an increase since the last patience reset; otherwise it defines an increase after the last event. The default value is False.
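
The difference that cumulative_delta makes is easiest to see on a score that creeps up slowly. The following plain-Python sketch follows the parameter descriptions above; ScoreStopper and events_until_stop are hypothetical names, the scores are made up, and the class is an illustration of the described behaviour rather than Ignite's source code.

```python
class ScoreStopper:
    """Plain-Python sketch of patience, min_delta and cumulative_delta."""

    def __init__(self, patience, min_delta=0.0, cumulative_delta=False):
        self.patience = patience
        self.min_delta = min_delta
        self.cumulative_delta = cumulative_delta
        self.best_score = None
        self.counter = 0

    def update(self, score):
        """Feed one score (higher is better); return True when training should stop."""
        if self.best_score is None:
            self.best_score = score
        elif score <= self.best_score + self.min_delta:
            # Not enough of an increase to count as an improvement
            if not self.cumulative_delta and score > self.best_score:
                # The reference point follows the last event
                self.best_score = score
            self.counter += 1
            if self.counter >= self.patience:
                return True
        else:
            # Improvement: move the reference point and reset the patience counter
            self.best_score = score
            self.counter = 0
        return False


def events_until_stop(scores, **kwargs):
    """Return the 1-based event at which the stopper fires, or None."""
    stopper = ScoreStopper(**kwargs)
    for event, score in enumerate(scores, start=1):
        if stopper.update(score):
            return event
    return None


# Made-up scores that rise by 0.04 per event, which is less than min_delta = 0.1
scores = [0.50, 0.54, 0.58, 0.62, 0.66]
print(events_until_stop(scores, patience=3, min_delta=0.1))                         # 4
print(events_until_stop(scores, patience=3, min_delta=0.1, cumulative_delta=True))  # None
```

Without cumulative_delta, each step of 0.04 is too small on its own, so training stops at the fourth event. With cumulative_delta=True, the small steps add up to more than min_delta since the last reset, so the counter is reset and training continues. Because a higher score is better, a score function that monitors a loss would return the negative loss.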

PyTorch geometric early stopping

In this section, we will learn how early stopping works in the context of PyTorch Geometric.

  • Early stopping is a process that ends training before the last scheduled epoch, based on a monitored metric and an EarlyStopping callback.
  • The example below demonstrates the callback with the Keras API on the MNIST data rather than with a PyTorch Geometric model; the monitoring logic (metric, mode and patience) is the same idea in either setting.

Code:

In the following code, we will import the required libraries and use the early stopping callback to stop training early.

  • (trainimage, trainlabel), (testimage, testlabel) = mnist.load_data() is used to load the data.
  • trainimage = trainimage.astype('float32')/255 scales the pixel values of the train images down to the range 0 to 1.
  • ytrain = to_categorical(trainlabel) encodes the labels as one-hot class vectors.
  • earlystopping = callbacks.EarlyStopping(monitor="val_loss", mode="min", patience=7, restore_best_weights=True) is used to stop training early.
  • models.fit() is used to fit the model.

import keras
from keras.utils import to_categorical
from keras.datasets import mnist
  

(trainimage, trainlabel), (testimage, testlabel)= mnist.load_data()
  
# Reshaping data-Adding number of channels 
trainimage = trainimage.reshape((trainimage.shape[0], 
                                     trainimage.shape[1], 
                                     trainimage.shape[2], 1))
  
testimage = testimage.reshape((testimage.shape[0], 
                                   testimage.shape[1],
                                   testimage.shape[2], 1))
  

trainimage = trainimage.astype('float32')/255
testimage = testimage.astype('float32')/255
 
ytrain = to_categorical(trainlabel)
ytest = to_categorical(testlabel)

from keras import models
from keras import layers
  
models = models.Sequential()
models.add(layers.Conv2D(32, (3, 3), activation ="relu", 
                             input_shape =(28, 28, 1)))
models.add(layers.MaxPooling2D(2, 2))
models.add(layers.Conv2D(64, (3, 3), activation ="relu"))
models.add(layers.MaxPooling2D(2, 2))
models.add(layers.Flatten())
models.add(layers.Dense(64, activation ="relu"))
models.add(layers.Dense(10, activation ="softmax"))
models.compile(optimizer ="rmsprop", loss ="categorical_crossentropy",
                                             metrics =['accuracy'])
valimage = trainimage[:10000]
partialimage = trainimage[10000:]
vallabel = ytrain[:10000]
partiallabel = ytrain[10000:]

from keras import callbacks
earlystopping = callbacks.EarlyStopping(monitor ="val_loss", 
                                        mode ="min", patience = 7, 
                                        restore_best_weights = True)
  
history = models.fit(partialimage, partiallabel, batch_size = 130, 
                    epochs = 23, validation_data =(valimage, vallabel), 
                    callbacks =[earlystopping])

Output:

In the following output, we can see that training is stopped early with the help of early stopping.

PyTorch geometric early stopping
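
The restore_best_weights=True argument used above asks the callback to hand back the weights from the best epoch rather than from the last one. The framework-free sketch below illustrates that idea. The function fit and the dictionaries standing in for model weights are assumptions made for this example; with PyTorch, the snapshot could be a copy of model.state_dict().

```python
import copy


def fit(weights_per_epoch, val_losses, patience=2, restore_best_weights=True):
    """Toy training loop: returns (final_weights, epochs_run).

    `weights_per_epoch` stands in for the model state after each epoch.
    """
    best_loss = float("inf")
    best_weights = None
    weights = None
    wait = 0
    epochs_run = 0
    for weights, val_loss in zip(weights_per_epoch, val_losses):
        epochs_run += 1
        if val_loss < best_loss:
            best_loss = val_loss
            best_weights = copy.deepcopy(weights)  # snapshot of the best epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                break  # early stopping
    if restore_best_weights and best_weights is not None:
        weights = best_weights
    return weights, epochs_run


# Made-up states and validation losses: the best epoch is the second one
states = [{"w": 0.1}, {"w": 0.2}, {"w": 0.3}, {"w": 0.4}, {"w": 0.5}]
val_losses = [0.9, 0.5, 0.6, 0.7, 0.4]
print(fit(states, val_losses))                              # ({'w': 0.2}, 4)
print(fit(states, val_losses, restore_best_weights=False))  # ({'w': 0.4}, 4)
```

Training stops after the fourth epoch in both calls; only the first call returns the state from the epoch with the lowest validation loss.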

PyTorch lstm early stopping

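In this section, we will learn about early stopping when training an LSTM in PyTorch.

LSTM stands for long short-term memory; it is a recurrent neural network architecture that is used in the area of deep learning. The early stopping loop does not depend on the model architecture: the code below uses a small feed-forward network, and the same loop can wrap an LSTM.

Code:

In the following code, we will import the required libraries and apply early stopping during training.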

  • nn.Sequential() runs the given layers sequentially.
  • nn.Linear() creates a fully connected layer.
  • def train(device, model, epochs, optimizer, loss_function, train_loader, valid_loader): defines the training loop.
  • optimizer.zero_grad() resets the gradients to zero.
  • current_loss = validation(model, device, valid_loader, loss_function) is used to calculate the current validation loss.
  • print('The Current Loss:', current_loss) is used to print the current loss.
  • transformdata = transforms.Compose() builds the transform that is applied to the data.
  • trainloader = data.DataLoader(trainset, batch_size=batchsize, shuffle=True) is used to load the train data.
  • testloader = data.DataLoader(testset, batch_size=batchsize, shuffle=False) is used to load the test data.

import torch
import torch.nn as nn
import torch.optim as optimize
import torch.utils.data as data
from torchvision import datasets, transforms


class modelarc(nn.Module):
    def __init__(self):
        super(modelarc, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(in_features=784, out_features=128),
            nn.ReLU(),
            nn.Linear(in_features=128, out_features=64),
            nn.ReLU(),
            nn.Linear(in_features=64, out_features=10),
            nn.LogSoftmax(dim=1)
        )

    def forward(self, Input):
        return self.main(Input)


def train(device, model, epochs, optimizer, loss_function, train_loader, valid_loader):

    the_last_loss = 100
    patience = 2
    trigger_times = 0

    for epoch in range(1, epochs+1):
        model.train()

        for times, data in enumerate(train_loader, 1):
            inputs = data[0].to(device)
            labels = data[1].to(device)


            optimizer.zero_grad()


            outputs = model(inputs.view(inputs.shape[0], -1))
            loss = loss_function(outputs, labels)
            loss.backward()
            optimizer.step()

            if times % 100 == 0 or times == len(train_loader):
                print('[{}/{}, {}/{}] loss: {:.6}'.format(epoch, epochs, times, len(train_loader), loss.item()))

        current_loss = validation(model, device, valid_loader, loss_function)
        print('The Current Loss:', current_loss)

        if current_loss > the_last_loss:
            trigger_times += 1
            print('Trigger Times:', trigger_times)

            if trigger_times >= patience:
                print('Early Stopping!\nStart to test process.')
                return model

        else:
            print('Trigger Times: 0')
            trigger_times = 0

        the_last_loss = current_loss

    return model


def validation(model, device, valid_loader, loss_function):
    model.eval()
    totalloss = 0

    with torch.no_grad():
        for data in valid_loader:
            input = data[0].to(device)
            label = data[1].to(device)

            outputs = model(input.view(input.shape[0], -1))
            loss = loss_function(outputs, label)
            totalloss += loss.item()

    return totalloss / len(valid_loader)


def test(device, model, test_loader):

    model.eval()
    total = 0
    correct = 0

    with torch.no_grad():
        for data in test_loader:
            input = data[0].to(device)
            label = data[1].to(device)

            outputs = model(input.view(input.shape[0], -1))
            _, predicted = torch.max(outputs.data, 1)

            total += label.size(0)
            correct += (predicted == label).sum().item()

    print('ModelAccuracy:', correct / total)


def main():

    device = 'cpu'
    print('Device state:', device)

    epochs = 60
    batchsize = 44
    lr = 0.002
    loss_function = nn.NLLLoss()
    model = modelarc().to(device)
    optimizer = optimize.Adam(model.parameters(), lr=lr)

    transformdata = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.5,), (0.5,))]
    )


    trainset = datasets.MNIST(root='MNIST', download=True, train=True, transform=transformdata)
    testset = datasets.MNIST(root='MNIST', download=True, train=False, transform=transformdata)
   
    trainsetsize = int(len(trainset) * 0.6)
    validsetsize = len(trainset) - trainsetsize
    trainset, validset = data.random_split(trainset, [trainsetsize, validsetsize])

    trainloader = data.DataLoader(trainset, batch_size=batchsize, shuffle=True)
    testloader = data.DataLoader(testset, batch_size=batchsize, shuffle=False)
    validloader = data.DataLoader(validset, batch_size=batchsize, shuffle=True)


    model = train(device, model, epochs, optimizer, loss_function, trainloader, validloader)

    test(device, model, testloader)


if __name__ == '__main__':
    main()

Output:

After running the above code, we get the following output in which we can see that the early stopping is applied to avoid overfitting.

PyTorch lstm early stopping
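
For readers who want an actual LSTM in place of the feed-forward network, here is a minimal sketch of a model that could be passed to the train function above. The class name LSTMClassifier and its sizes are assumptions chosen for illustration; it reads each 28 x 28 MNIST image as a sequence of 28 rows with 28 features each.

```python
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    """Hypothetical model: treats an image as 28 time steps of 28 features."""

    def __init__(self, input_dim=28, hidden_dim=64, num_layers=1, num_classes=10):
        super().__init__()
        self.input_dim = input_dim
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        # Accept (batch, 1, 28, 28) or flattened (batch, 784) input
        x = x.view(x.size(0), -1, self.input_dim)  # (batch, 28, 28)
        out, _ = self.lstm(x)                      # (batch, seq_len, hidden_dim)
        logits = self.fc(out[:, -1, :])            # use the last time step
        return torch.log_softmax(logits, dim=1)    # log-probabilities for nn.NLLLoss


model = LSTMClassifier()
batch = torch.randn(4, 1, 28, 28)  # random stand-in for four MNIST images
print(model(batch).shape)          # torch.Size([4, 10])
```

Because the forward method also accepts flattened input, the model works with the call model(inputs.view(inputs.shape[0], -1)) used in the training loop, so the early stopping logic stays unchanged.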

PyTorch early stopping callback

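In this section, we will learn how the early stopping callback works in Python.

A callback is an object that the training loop calls at fixed points, such as the end of each epoch. An early stopping callback monitors a performance measure and ends the training when that measure stops improving.

Syntax:

The signature below, with its baseline and restore_best_weights arguments, is that of the Keras EarlyStopping callback; the PyTorch Lightning callback shown at the start of this tutorial takes similar monitor, min_delta, patience, verbose and mode arguments:

keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0, mode='auto', baseline=None, restore_best_weights=False)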

Parameters:

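  • monitor is the quantity to be monitored. It can be, for example, the validation loss or the validation accuracy.
  • mode sets the direction in which the monitored quantity should improve: min, max, or auto, which infers the direction from the name of the monitored quantity.
  • min_delta is the minimum change in the monitored quantity to qualify as an improvement.
  • patience is the number of epochs with no improvement after which training will be stopped.
  • verbose is an integer value that sets the verbosity mode.
  • baseline is a baseline value for the monitored quantity; training stops if the model shows no improvement over it.
  • restore_best_weights: if the value is True, the model weights from the epoch with the best value of the monitored quantity are restored. The default value is False.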
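
The callback mechanism itself can be sketched without any framework. In the example below, EarlyStoppingCallback and run_training are hypothetical names, the metrics are made up, and the stopping rule is a simplified version of the parameters listed above rather than any library's implementation.

```python
class EarlyStoppingCallback:
    """Hypothetical callback: the training loop calls on_epoch_end every epoch."""

    def __init__(self, monitor="val_loss", mode="min", min_delta=0.0, patience=2):
        self.monitor = monitor
        self.sign = 1 if mode == "min" else -1  # turn "max" into a minimisation
        self.min_delta = min_delta
        self.patience = patience
        self.best = float("inf")
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, logs):
        current = self.sign * logs[self.monitor]
        if current < self.best - self.min_delta:
            self.best = current
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stop_training = True


def run_training(metrics_per_epoch, callbacks):
    """Toy loop: one dict of metrics per epoch stands in for real training."""
    epochs_run = 0
    for epoch, logs in enumerate(metrics_per_epoch):
        epochs_run += 1
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs)
        if any(cb.stop_training for cb in callbacks):
            break
    return epochs_run


# Made-up validation accuracies, monitored with mode="max"
logs = [{"val_accuracy": a} for a in (0.80, 0.85, 0.84, 0.85, 0.83, 0.90)]
callback = EarlyStoppingCallback(monitor="val_accuracy", mode="max", patience=2)
print(run_training(logs, [callback]))  # 4
```

The training loop knows nothing about the stopping rule; it only calls the callback after every epoch and checks its stop_training flag. This separation is what lets the same loop be reused with different callbacks.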

PyTorch validation early stopping

In this section, we will learn about early stopping based on validation data.

Early stopping is a process that avoids overfitting on the training dataset by keeping track of the validation loss.

Code:

In the following code, we will import the required libraries and apply early stopping driven by the validation loss. As in the geometric section, the example uses the Keras API.

  • (trainimages, trainlabels), (testimages, testlabels) = mnist.load_data() is used to load the data.
  • trainimages = trainimages.reshape((trainimages.shape[0], trainimages.shape[1], trainimages.shape[2], 1)) reshapes the data by adding a channel dimension.
  • trainimages = trainimages.astype('float32')/255 scales down the pixel values of the train images.
  • testimages = testimages.astype('float32')/255 scales down the pixel values of the test images.
  • earlystopping = callbacks.EarlyStopping() creates the early stopping callback that is used to avoid overfitting.
  • data = models.fit() is used to fit the model.

import keras
from keras.utils import to_categorical
from keras.datasets import mnist
  

(trainimages, trainlabels), (testimages, testlabels)= mnist.load_data()

trainimages = trainimages.reshape((trainimages.shape[0], 
                                     trainimages.shape[1], 
                                     trainimages.shape[2], 1))
  
testimages = testimages.reshape((testimages.shape[0], 
                                   testimages.shape[1],
                                   testimages.shape[2], 1))
  

trainimages = trainimages.astype('float32')/255
testimages = testimages.astype('float32')/255
  

ytrains = to_categorical(trainlabels)
ytests = to_categorical(testlabels)

from keras import models
from keras import layers
  
models = models.Sequential()
models.add(layers.Conv2D(30, (3, 3), activation ="relu", 
                             input_shape =(28, 28, 1)))
models.add(layers.MaxPooling2D(2, 2))
models.add(layers.Conv2D(66, (3, 3), activation ="relu"))
models.add(layers.MaxPooling2D(2, 2))
models.add(layers.Flatten())
models.add(layers.Dense(66, activation ="relu"))
models.add(layers.Dense(10, activation ="softmax"))
models.compile(optimizer ="rmsprop", loss ="categorical_crossentropy",
                                             metrics =['accuracy'])
validationimage = trainimages[:999]
partialimage = trainimages[999:]
validationlabel = ytrains[:999]
partiallabel = ytrains[999:]

from keras import callbacks
earlystopping = callbacks.EarlyStopping(monitor ="val_loss", 
                                        mode ="min", patience = 7, 
                                        restore_best_weights = True)
  
data = models.fit(partialimage, partiallabel, batch_size = 100, 
                    epochs = 15, validation_data =(validationimage, validationlabel), 
                    callbacks =[earlystopping])

Output:

After running the above code, we get the following output in which we can see that early stopping avoids overfitting and keeps track of the validation loss.

PyTorch Validation early stopping

So, in this tutorial, we discussed PyTorch early stopping and covered different examples related to its implementation. Here is the list of examples that we have covered.

  • PyTorch early stopping
  • PyTorch early stopping example
  • PyTorch early stopping scheduler
  • PyTorch lightning early stopping
  • PyTorch ignite early stopping
  • PyTorch geometric early stopping
  • PyTorch lstm early stopping
  • PyTorch early stopping callback
  • PyTorch validation early stopping