PyTorch Save Model – Complete Guide

In this Python tutorial, we will learn how to save a PyTorch model in Python, and we will also cover different examples related to saving a model. Moreover, we will cover these topics.

  • PyTorch save model
  • PyTorch save model example
  • PyTorch save model checkpoint
  • PyTorch save model architecture
  • PyTorch save the model for inference
  • PyTorch save the model during training
  • PyTorch save the model to onnx

PyTorch save model

In this section, we will learn about how to save the PyTorch model in Python.

  • Saving a PyTorch model means serializing one or more components (such as the model's parameters) into a dictionary on disk with the help of the torch.save() function.
  • Saving also makes the model persistent, so it can be reloaded later to continue training or to run inference; a minimal sketch follows below.
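As a quick illustration, here is a minimal sketch (the file name components.pth is just a placeholder) that bundles a few components into a dictionary, saves it with torch.save(), and reads it back with torch.load().

import torch
import torch.nn as nn

# Bundle several components into one dictionary and serialize it to disk.
layer = nn.Linear(10, 2)
components = {
    'weights': layer.state_dict(),          # model parameters
    'note': 'any picklable Python object',  # arbitrary extra metadata
}
torch.save(components, 'components.pth')

# Later, deserialize the same dictionary and restore the parameters.
restored = torch.load('components.pth')
layer.load_state_dict(restored['weights'])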

Code:

Before using the PyTorch save model function, we need to install the torch module with the following command.

pip install torch

After installing the torch module, also install the torchvision module with the help of this command.

pip install torchvision
PyTorch save model torchvision module

After installing everything, our PyTorch save model code can run smoothly.

  • torchmodel = model.vgg16(pretrained=True) is used to build the model.
  • torch.save(torchmodel.state_dict(), 'torchmodel_weights.pth') is used to save the PyTorch model.
  • The state_dict() function returns a Python dictionary that maps each layer to its parameter tensors.
import torch
import torchvision.models as model
torchmodel = model.vgg16(pretrained=True)
torch.save(torchmodel.state_dict(), 'torchmodel_weights.pth')
PyTorch save model path
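To reload these weights later, the saved state_dict can be loaded back into a freshly built vgg16 (a minimal sketch, assuming the file saved above).

import torch
import torchvision.models as model

# Recreate the architecture, then copy the saved parameters into it.
torchmodel = model.vgg16()
torchmodel.load_state_dict(torch.load('torchmodel_weights.pth'))
torchmodel.eval()  # switch to evaluation mode before running inference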

Also, check: Machine Learning using Python

PyTorch save model example

In this section, we will learn how to save the PyTorch model and explain it with the help of an example in Python.

  • The PyTorch save function is used to save multiple components and arrange them all into one dictionary.
  • In the example below, torch.save() serializes the model's state_dict into a single file on disk.

Code:

In the following code, we will import some torch libraries, build a classifier model, and then save it.

  • model = TheModelClass() is used to initialize the model.
  • optimizer = optimize.SGD(model.parameters(), lr=0.001, momentum=0.8) is used to initialize the optimizer.
  • print("Model's state_dict:") is used to print the model's state_dict.
  • state_dict is a Python dictionary that maps each layer to its parameter tensors.
  • print("Optimizer's state_dict:") is used to print the optimizer's state_dict.
  • torch.save(model.state_dict(), 'model_weights.pth') is used to save the model.
import torch
import torch.nn as nn
import torch.optim as optimize
import torch.nn.functional as F
# Define model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv = nn.Conv2d(4, 7, 6)
        self.pool = nn.MaxPool2d(3, 3)
        self.conv1 = nn.Conv2d(7, 17, 6)
        self.fc1 = nn.Linear(17 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, X):
        X = self.pool(F.relu(self.conv(X)))
        X = self.pool(F.relu(self.conv1(X)))
        X = X.view(-1, 17 * 5 * 5)
        X = F.relu(self.fc1(X))
        X = F.relu(self.fc2(X))
        X = self.fc3(X)
        return X


model = TheModelClass()


optimizer = optimize.SGD(model.parameters(), lr=0.001, momentum=0.8)


print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())


print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])
torch.save(model.state_dict(), 'model_weights.pth')

Output:

After running the above code, we get the following output, in which we can see the layers of the classifier and the optimizer's state_dict printed before the model is saved.

PyTorch save model

Read: PyTorch nn linear + Examples

PyTorch save model checkpoint

In this section, we will learn about how to save the PyTorch model checkpoint in Python.

  • A PyTorch model checkpoint bundles multiple components (model weights, optimizer state, epoch, loss) into one dictionary and saves them with the help of the torch.save() function.
  • The torch.save() function can be called periodically during training to write such a checkpoint dictionary.

Code:

In the following code, we will import the torch module, with which we can save the model checkpoint.

  • nn.Conv2d() is used to define the convolutional layers that produce output from the input.
  • self.pool is a max-pooling layer that downsamples the feature maps.
  • optimizer = optimize.SGD(net1.parameters(), lr=0.001, momentum=0.9) is used to initialize the optimizer.
  • torch.save() is used to save the checkpoint dictionary.
import torch
import torch.nn as nn
import torch.optim as optimize
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Conv2d(5, 8, 7)
        self.pool = nn.MaxPool2d(4, 4)
        self.conv1 = nn.Conv2d(8, 18, 7)
        self.fc = nn.Linear(18 * 7 * 7, 140)
        self.fc1 = nn.Linear(140, 86)
        self.fc2 = nn.Linear(86, 12)

    def forward(self, X):
        X = self.pool(F.relu(self.conv(X)))
        X = self.pool(F.relu(self.conv1(X)))
        X = X.view(-1, 18 * 7 * 7)
        X = F.relu(self.fc(X))
        X = F.relu(self.fc1(X))
        X = self.fc2(X)
        return X

net1 = Net()
print(net1)
optimizer = optimize.SGD(net1.parameters(), lr=0.001, momentum=0.9)

EPOCH = 5
PATH = "model.pt"
LOSS = 0.4

torch.save({
            'epoch': EPOCH,
            'model_state_dict': net1.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'loss': LOSS,
            }, PATH)

Output:

After running the above code, we get the following output, in which we can see the network architecture printed on the screen, after which torch.save() writes the checkpoint dictionary to model.pt.

PyTorch save model checkpoint
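To resume training later, the checkpoint dictionary can be read back with torch.load() (a minimal sketch, reusing the Net class and PATH defined above).

net2 = Net()
optimizer = optimize.SGD(net2.parameters(), lr=0.001, momentum=0.9)

# Read the checkpoint dictionary back and restore each component from it.
checkpoint = torch.load(PATH)
net2.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']

net2.train()  # continue training, or call net2.eval() for inference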

Read: Cross Entropy Loss PyTorch

PyTorch save model architecture

In this section, we will learn about how we can save the PyTorch model architecture in Python.

The PyTorch model architecture is the structural design of the network, i.e. the layers and how they are connected, much like the blueprint of a building.

Code:

In the following code, we will import some libraries that help us run the code and save the model.

  • use_cuda = torch.cuda.is_available() is used to determine whether CUDA is supported on our system.
  • torch.save(state, fpath) is used to save the model checkpoint.
  • shutil.copyfile(fpath, bestfpath) is used to copy the checkpoint file to the best-model path when it has the minimum validation loss.
import matplotlib.pyplot as plot
import torch
import shutil
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms
import numpy as num
use_cuda = torch.cuda.is_available()
def savemodel(state, is_best, checkpointpath, bestmodel_path):
  
    fpath = checkpointpath
    torch.save(state, fpath)
    # if it is a best model, min validation loss
    if is_best:
        bestfpath = bestmodel_path

        shutil.copyfile(fpath, bestfpath)

After saving the model, we can load it back to pick up the best-performing checkpoint.

  • modelcheckpoint = torch.load(checkpointfpath) is used to load the checkpoint.
  • model.load_state_dict(modelcheckpoint['state_dict']) is used to initialize the model from the checkpoint's state_dict.
  • optimizer.load_state_dict(modelcheckpoint['optimizer']) is used to initialize the optimizer.
  • valid_loss_min = modelcheckpoint['valid_loss_min'] is used to initialize valid_loss_min from the checkpoint.
  • return model, optimizer, modelcheckpoint['epoch'], valid_loss_min.item() is used to return the model, optimizer, epoch, and minimum validation loss.
def load_model(checkpointfpath, model, optimizer):
   
    modelcheckpoint = torch.load(checkpointfpath)

    model.load_state_dict(modelcheckpoint['state_dict'])

    optimizer.load_state_dict(modelcheckpoint['optimizer'])
    valid_loss_min = modelcheckpoint['valid_loss_min']
 
    return model, optimizer, modelcheckpoint['epoch'], valid_loss_min.item()

After defining the load function, we import the data and create the data loaders.

  • transformdata = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]) is used to define a transform to normalize the single-channel FashionMNIST data.
  • train_set = datasets.FashionMNIST('F_MNIST_data/', download=True, train=True, transform=transformdata) is used to download and load the training data.
  • loaders = { 'train' : torch.utils.data.DataLoader(train_set, batch_size = 64, shuffle=True), 'test' : torch.utils.data.DataLoader(test_set, batch_size = 64, shuffle=True), } is used to build the data loaders.

transformdata = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])  # FashionMNIST images have a single channel
# Download and load the training data
train_set = datasets.FashionMNIST('F_MNIST_data/', download=True, train=True, transform=transformdata)

test_set = datasets.FashionMNIST('F_MNIST_data/', download=True, train=False, transform=transformdata)

loaders = {
    'train' : torch.utils.data.DataLoader(train_set, batch_size = 64, shuffle=True),
    'test'  : torch.utils.data.DataLoader(test_set, batch_size = 64, shuffle=True),
}

In the code below, we will define the network class and create the architecture of the model.

  • model = fashion_Classifier() is used to create the network and define the optimizer.
  • model.cuda() is used when CUDA is available to move the model to GPU.
# Define your network 
class fashion_Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        input_size = 784
        self.fc = nn.Linear(input_size, 514)
        self.fc1 = nn.Linear(514, 258)
        self.fc2 = nn.Linear(258, 130)
        self.fc3 = nn.Linear(130, 66)
        self.fc4 = nn.Linear(66,12)
        self.dropout = nn.Dropout(p=0.2)
        
    def forward(self, X):
        X = X.view(X.shape[0], -1)
        X = self.dropout(F.relu(self.fc(X)))
        X = self.dropout(F.relu(self.fc1(X)))
        X = self.dropout(F.relu(self.fc2(X)))
        X = self.dropout(F.relu(self.fc3(X)))
        X = F.log_softmax(self.fc4(X), dim=1)
        return X

model = fashion_Classifier()


if use_cuda:
    model = model.cuda()
    
print(model)
PyTorch save model architecture
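If the goal is to store the architecture together with the learned weights, one common option (a sketch with a placeholder file name, not shown in the screenshot above) is to save the whole module object instead of just its state_dict.

# Saving the entire module pickles the class reference together with the parameters.
torch.save(model, 'fashion_classifier_full.pth')

# Loading it back requires the fashion_Classifier class definition to be importable.
loaded_model = torch.load('fashion_classifier_full.pth')
loaded_model.eval()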

Read: Adam optimizer PyTorch with Examples

PyTorch save the model for inference

In this section, we will learn about saving the PyTorch model for inference in Python.

  • Saving a model for inference means saving its trained parameters so the model can later be reloaded purely to make predictions.
  • The torch.save() function is used to save the model and arrange its components into a dictionary.

Code:

In the following code, we will import some libraries with which we can save the model for inference.

  • optimizer = optimize.SGD(net.parameters(), lr=0.001, momentum=0.9) is used to initialize the optimizer.
  • PATH = "state_dict_model.pt" is used to specify the path.
  • torch.save(net.state_dict(), PATH) is used to save the model.
  • models.load_state_dict(torch.load(PATH)) is used to load the model.
  • PATH = "entire_model.pt" is used to specify the path for saving the entire model.
import torch
import torch.nn as nn
import torch.optim as optimize
import torch.nn.functional as f
class model(nn.Module):
    def __init__(self):
        super(model, self).__init__()
        self.conv = nn.Conv2d(5, 8, 7)
        self.pool = nn.MaxPool2d(4, 4)
        self.conv1 = nn.Conv2d(8, 18, 7)
        self.fc = nn.Linear(18 * 7 * 7, 140)
        self.fc1 = nn.Linear(140, 86)
        self.fc2 = nn.Linear(86, 12)

    def forward(self, X):
        X = self.pool(f.relu(self.conv(X)))
        X = self.pool(f.relu(self.conv1(X)))
        X = X.view(-1, 18 * 7 * 7)
        X = f.relu(self.fc(X))
        X = f.relu(self.fc1(X))
        X = self.fc2(X)
        return X

net = model()
print(net)
optimizer = optimize.SGD(net.parameters(), lr=0.001, momentum=0.9)
PATH = "state_dict_model.pt"
torch.save(net.state_dict(), PATH)

models = model()
models.load_state_dict(torch.load(PATH))
models.eval()
PATH = "entire_model.pt"
torch.save(net, PATH)
models = torch.load(PATH)
models.eval()

Output:

After running the above code, we get the following output, in which we can see the model loaded and set to evaluation mode for inference.

PyTorch save the model for inference
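One practical note (an added tip, not part of the original example): if the state_dict was saved on a GPU machine and is later loaded on a CPU-only machine, pass map_location to torch.load().

# Load GPU-saved weights onto the CPU; "state_dict_model.pt" is the file saved above.
models = model()
models.load_state_dict(torch.load("state_dict_model.pt", map_location=torch.device('cpu')))
models.eval()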

Read: PyTorch Load Model + Examples

PyTorch save the model during training

In this section, we will learn about how we can save the PyTorch model during training in Python.

The PyTorch model is saved during training with the help of the torch.save() function; after saving it, we can load the model again and continue training it.

Code:

In the following code, we will import some libraries for training the model, so that the model can be saved during training (a training-loop sketch that saves a checkpoint every epoch follows after the output below).

  • torch.save(model, PATH) is used to save the model (here the whole Cnn module object).
  • bs, _, _, _ = X.shape is used to get the batch size.
  • X = f.adaptive_avg_pool2d(X, 1).reshape(bs, -1) is used to reshape the feature maps before the fully connected layers.
  • traindata = datasets.CIFAR10() is used to load the training data.
  • valdata = datasets.CIFAR10() is used to load the validation data.
import torch
import torch.nn as nn
import torch.nn.functional as f
# model
class Cnn(nn.Module):
    def __init__(self):
        super(Cnn, self).__init__()
        self.conv = nn.Conv2d(in_channels=5, out_channels=66, kernel_size=7, padding=1)
        self.conv1 = nn.Conv2d(in_channels=66, out_channels=66, kernel_size=7, padding=1)
        self.conv2 = nn.Conv2d(in_channels=66, out_channels=130, kernel_size=7, padding=1)
        self.pool = nn.MaxPool2d(5, 4)
        self.dropout = nn.Dropout2d(0.7)
        self.fc = nn.Linear(in_features=130, out_features=1000)
        self.fc1 = nn.Linear(in_features=1000, out_features=10)
    def forward(self, X):
        X = f.relu(self.conv(X))
        X = self.dropout(X)
        X = self.pool(X)
        X = f.relu(self.conv1(X))
        X = self.pool(X)
        X = f.relu(self.conv2(X))
        X = self.pool(X)
        bs, _, _, _ = X.shape
        X = f.adaptive_avg_pool2d(X, 1).reshape(bs, -1)
        X = f.relu(self.fc(X))
        X = self.dropout(X)
        out = self.fc1(X)
        return out

model = Cnn()
PATH = "cnn_model.pt"  # any file path can be used here
torch.save(model, PATH)
from torchvision import datasets
from torchvision.transforms import transforms
# define transforms
transformtrain = transforms.Compose([
            transforms.RandomCrop(34, padding=6),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.4916, 0.4824, 0.4467), 
                                 (0.2025, 0.1996, 0.2012)),
        ])
transformval = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4916, 0.4824, 0.4467), 
                         (0.2025, 0.1996, 0.2012)),
])
# train and validation data
traindata = datasets.CIFAR10(
    root='../input/data',
    train=True,
    download=True,
    transform=transformtrain
)
valdata = datasets.CIFAR10(
    root='../input/data',
    train=False,
    download=True,
    transform=transformval
)

Output:

After running the above code, we get the following output, in which we can see that the training data is being downloaded.

PyTorch save model during training
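The sketch below shows how torch.save() is typically called inside the training loop so that a checkpoint is written after every epoch. It is a minimal illustration, not part of the original example: the SmallNet class, the epoch count, and the checkpoint file names are assumptions, and it reuses the traindata dataset defined above.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as f

# A tiny placeholder network that accepts the 3-channel CIFAR10 crops produced above.
class SmallNet(nn.Module):
    def __init__(self):
        super(SmallNet, self).__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.fc = nn.Linear(16 * 4 * 4, 10)

    def forward(self, X):
        X = f.relu(self.conv(X))
        X = f.adaptive_avg_pool2d(X, 4)  # 16 x 4 x 4 regardless of the crop size
        X = X.view(X.shape[0], -1)
        return self.fc(X)

net = SmallNet()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
trainloader = torch.utils.data.DataLoader(traindata, batch_size=64, shuffle=True)

for epoch in range(2):  # a small number of epochs for illustration
    for images, labels in trainloader:
        optimizer.zero_grad()
        loss = criterion(net(images), labels)
        loss.backward()
        optimizer.step()
    # save a checkpoint at the end of every epoch
    torch.save({'epoch': epoch,
                'model_state_dict': net.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'loss': loss.item()}, 'checkpoint_epoch_{}.pt'.format(epoch))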

Read: PyTorch Pretrained Model

PyTorch save the model to onnx

In this section, we will learn how to save the PyTorch model to ONNX in Python.

  • ONNX stands for Open Neural Network Exchange, an open container format for exchanging neural networks between frameworks.
  • Here we convert a model into the ONNX format and run it with ONNX Runtime (the extra packages needed are noted below).
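Running this example requires the onnx and onnxruntime packages in addition to torch; assuming the standard PyPI package names, they can be installed with the following command.

pip install onnx onnxruntime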

Code:

In the following code, we will import some libraries with which we can save the model to ONNX.

  • torch_model = SuperResolutionmodel(upscale_factor=3) is used to create the super-resolution model.
  • modelurl = 'https://s3.amazonaws.com/pytorch/test_data/export/superres_epoch100-44c6958e.pth' is the URL of the pretrained model weights.
  • torch_model.load_state_dict(modelzoo.load_url(modelurl, map_location=maplocation)) is used to initialize the model with the pretrained weights.
  • torch_model.eval() is used to set the model to evaluation mode.

import io
import numpy as num

from torch import nn
import torch.utils.model_zoo as modelzoo
import torch.onnx

import torch.nn as nn
import torch.nn.init as init


class SuperResolutionmodel(nn.Module):
    def __init__(self, upscale_factor, inplace=False):
        super(SuperResolutionmodel, self).__init__()

        self.relu = nn.ReLU(inplace=inplace)
        self.conv1 = nn.Conv2d(1, 64, (5, 5), (1, 1), (2, 2))
        self.conv2 = nn.Conv2d(64, 64, (3, 3), (1, 1), (1, 1))
        self.conv3 = nn.Conv2d(64, 32, (3, 3), (1, 1), (1, 1))
        self.conv4 = nn.Conv2d(32, upscale_factor ** 2, (3, 3), (1, 1), (1, 1))
        self.pixel_shuffle = nn.PixelShuffle(upscale_factor)

        self._initialize_weights()

    def forward(self, X):
        X = self.relu(self.conv1(X))
        X = self.relu(self.conv2(X))
        X = self.relu(self.conv3(X))
        X = self.pixel_shuffle(self.conv4(X))
        return X

    def _initialize_weights(self):
        init.orthogonal_(self.conv1.weight, init.calculate_gain('relu'))
        init.orthogonal_(self.conv2.weight, init.calculate_gain('relu'))
        init.orthogonal_(self.conv3.weight, init.calculate_gain('relu'))
        init.orthogonal_(self.conv4.weight)


torch_model = SuperResolutionmodel(upscale_factor=3)

modelurl = 'https://s3.amazonaws.com/pytorch/test_data/export/superres_epoch100-44c6958e.pth'
batch_size = 1   


maplocation = lambda storage, loc: storage
if torch.cuda.is_available():
    maplocation = None
torch_model.load_state_dict(modelzoo.load_url(modelurl, map_location=maplocation))


torch_model.eval()
PyTorch save the model to ONNX
  • X = torch.randn(batch_size, 1, 224, 224, requires_grad=True) is used as input to the model.
  • torch.onnx.export() is used to export the model.
  • ortinputs = {ortsession.get_inputs()[0].name: to_numpy(X)} builds the input feed, and ortsession.run(None, ortinputs) computes the ONNX Runtime output prediction.
  • print("Export model is tested with ONNXRuntime, and the result of the model looks good!") is used to print the result of the comparison.
X = torch.randn(batch_size, 1, 224, 224, requires_grad=True)
torch_out = torch_model(X)

torch.onnx.export(torch_model,               
                  X,                         
                  "super_resolution.onnx",  
                  export_params=True,    
                  opset_version=10,          
                  do_constant_folding=True,  
                  input_names = ['input'],  
                  output_names = ['output'], 
                  dynamic_axes={'input' : {0 : 'batch_size'},   
                                'output' : {0 : 'batch_size'}})
import onnx

onnxmodel = onnx.load("super_resolution.onnx")
onnx.checker.check_model(onnxmodel)
import onnxruntime

ortsession = onnxruntime.InferenceSession("super_resolution.onnx")

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

ortinputs = {ortsession.get_inputs()[0].name: to_numpy(X)}
ortoutputs = ortsession.run(None, ortinputs)

# compare ONNX Runtime and PyTorch results
num.testing.assert_allclose(to_numpy(torch_out), ortoutputs[0], rtol=1e-03, atol=1e-05)

print("Export model is tested with ONNXRuntime, and the result of the model looks good!")
PyTorch save model tested with ONNX runtime

So, in this tutorial, we discussed how to save a PyTorch model, and we have also covered different examples related to its implementation. Here is the list of topics that we have covered.

  • PyTorch save model
  • PyTorch save model example
  • PyTorch save model checkpoint
  • PyTorch save model architecture
  • PyTorch save the model for inference
  • PyTorch save the model during training
  • PyTorch save the model to onnx