PyTorch RNN – Detailed Guide

In this Python tutorial, we will learn about the PyTorch RNN module and walk through several examples that show how to use it. We will cover these topics:

  • PyTorch RNN
  • PyTorch RNN example
  • PyTorch RNN cell
  • PyTorch RNN activation function
  • PyTorch RNN binary classification
  • PyTorch RNN sentiment analysis
  • PyTorch RNN language model
  • PyTorch RNN Dataloader

Also, check the latest PyTorch tutorial: PyTorch Activation Function

PyTorch RNN

In this section, we will learn about the PyTorch RNN model in Python.

RNN stands for Recurrent Neural Network; it is a class of artificial neural networks that works on sequential or time-series data. It is mainly used for ordinal or temporal problems.

Syntax:

The syntax of PyTorch RNN:

torch.nn.RNN(input_size, hidden_size, num_layers=1, nonlinearity='tanh', bias=True, batch_first=False, dropout=0.0, bidirectional=False)

Parameters:

  • input_size: The number of expected features in the input x.
  • hidden_size: The number of features in the hidden state h.
  • num_layers: The number of stacked recurrent layers. The default value is 1.
  • nonlinearity: The recurrent nonlinearity to use, either 'tanh' or 'relu'. The default is 'tanh'.
  • bias: If False, the layer does not use bias weights. The default is True.
  • batch_first: If True, the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). The default value of batch_first is False.
  • dropout: If non-zero, adds a dropout layer on the outputs of each RNN layer except the last, with dropout probability equal to dropout. The default value of dropout is 0.
  • bidirectional: If True, the layer becomes a bidirectional RNN. The default value of bidirectional is False.
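The batch_first and bidirectional flags directly change the tensor shapes the module expects and returns. Here is a minimal sketch, with sizes chosen only for illustration:

import torch
import torch.nn as nn

# batch_first=True expects input of shape (batch, seq, feature)
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2,
             batch_first=True, bidirectional=True)
x = torch.randn(3, 5, 10)   # (batch=3, seq=5, feature=10)
outp, hn = rnn(x)
print(outp.shape)           # torch.Size([3, 5, 40]): last dim is 2*hidden_size (bidirectional)
print(hn.shape)             # torch.Size([4, 3, 20]): num_layers * 2 directions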

Read: PyTorch Tensor to Numpy

PyTorch RNN example

In this section, we will learn how to implement a PyTorch RNN example in Python.

A recurrent neural network is a kind of neural network where the output from the earlier step is fed as input to the current step.

The important feature of an RNN is the hidden state, which retains some information about the sequence.

Code:

In the following code, we will import the torch module and compute the output of an RNN.

  • recnn = nn.RNN(12, 22, 4) creates an RNN with 12 input features, a hidden size of 22, and 4 stacked layers.
  • inp = torch.randn(7, 5, 12) generates a random input of shape (seq_len, batch, input_size).
  • h = torch.randn(4, 5, 22) generates a random initial hidden state of shape (num_layers, batch, hidden_size).
  • outp, hn = recnn(inp, h) runs the RNN and returns the output and the final hidden state.
  • print(outp) is used to print the output on the screen.
import torch
import torch.nn as nn
recnn = nn.RNN(12, 22, 4)
inp = torch.randn(7, 5, 12)
h = torch.randn(4, 5, 22)
outp, hn = recnn(inp, h)
print(outp)

Output:

After running the above code, we get the following output in which we can see that the RNN output tensor of shape (7, 5, 22), i.e. (seq_len, batch, hidden_size), is printed on the screen.

[Image: PyTorch RNN example output]

Read: PyTorch Batch Normalization

PyTorch RNN Cell

In this section, we will learn about the PyTorch RNN cell in Python.

An RNN cell is the unit that holds a state and performs the computation for a single time step of a sequence of inputs.

RNN cells differ from ordinary feed-forward neurons in that they have a state and can remember information from previous steps.

Syntax:

torch.nn.RNNCell(input_size, hidden_size, bias=True, nonlinearity='tanh', device=None, dtype=None)

Parameters:

  • input_size: The number of expected features in the input x.
  • hidden_size: The number of features in the hidden state h.
  • bias: If False, the layer does not use bias weights. The default value of bias is True.
  • nonlinearity: The nonlinearity to use, either 'tanh' or 'relu'. The default is 'tanh'.
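Unlike nn.RNN, an RNNCell processes a single time step, so the loop over the sequence is written by hand. Here is a minimal sketch, with shapes chosen only for illustration:

import torch
import torch.nn as nn

rnncell = nn.RNNCell(10, 20)      # input_size=10, hidden_size=20
inp = torch.randn(6, 3, 10)       # (seq_len, batch, input_size)
hx = torch.zeros(3, 20)           # initial hidden state: (batch, hidden_size)
outputs = []
for t in range(inp.size(0)):      # step through the sequence one element at a time
    hx = rnncell(inp[t], hx)
    outputs.append(hx)
print(outputs[-1].shape)          # torch.Size([3, 20])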

Read: Keras Vs PyTorch – Key Differences

PyTorch RNN activation function

In this section, we will learn about the PyTorch RNN activation function in Python.

The activation function defines how the weighted sum of a node's inputs is transformed into its output; in an RNN it is applied at every time step and is selected with the nonlinearity argument.
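As a quick illustration, the two available nonlinearities bound the hidden state differently (the sizes below are arbitrary):

import torch
import torch.nn as nn

rnn_tanh = nn.RNN(4, 8)                        # default nonlinearity='tanh'
rnn_relu = nn.RNN(4, 8, nonlinearity='relu')   # relu keeps activations non-negative

x = torch.randn(10, 1, 4)
outp_tanh, _ = rnn_tanh(x)
outp_relu, _ = rnn_relu(x)
print(outp_tanh.min().item(), outp_tanh.max().item())  # bounded in [-1, 1]
print((outp_relu >= 0).all().item())                   # True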

Code:

In the following code, we will import the torch module and build an RNN model that uses the relu activation function.

  • traindt = dtsets.MNIST(root='./data', train=True, transform=transform.ToTensor(), download=True) loads the training dataset.
  • self.hidendim = hidendim stores the hidden dimension.
  • self.layerdim = layerdim stores the number of hidden layers.
  • self.rnn = nn.RNN(inpdim, hidendim, layerdim, batch_first=True, nonlinearity='relu') builds the RNN model.
  • self.fc = nn.Linear(hidendim, outpdim) is used as a readout layer.
  • h = torch.zeros(self.layerdim, y.size(0), self.hidendim).requires_grad_() initializes the hidden state with zeros.
  • outp = self.fc(outp[:, -1, :]) indexes the hidden state of the last time step.
  • optim = torch.optim.SGD(mdl.parameters(), lr=l_r) initializes the optimizer.
  • imgs = imgs.view(-1, seqdim, inpdim).requires_grad_() loads images as tensors with gradient tracking.
  • optim.zero_grad() clears the gradients with respect to the parameters.
  • loss = criter(outps, lbls) calculates the loss.
  • optim.step() updates the parameters.
  • outps = mdl(imgs) is a forward pass used only to get the outputs.
  • _, predicted = torch.max(outps.data, 1) gets the prediction from the maximum value.
  • ttl += lbls.size(0) accumulates the total number of labels.
  • crrct += (predicted == lbls).sum() accumulates the total correct predictions.
  • print('Iteration: {}. Loss: {}. Accuracy: {}'.format(itr, loss.item(), accu)) prints the accuracy on the screen.
import torch
import torch.nn as nn
import torchvision.transforms as transform
import torchvision.datasets as dtsets
traindt = dtsets.MNIST(root='./data', 
                            train=True, 
                            transform=transform.ToTensor(),
                            download=True)

testdt = dtsets.MNIST(root='./data', 
                           train=False, 
                           transform=transform.ToTensor())
batchsiz = 80
nitrs = 2800
numepoch = nitrs / (len(traindt) / batchsiz)
numepoch = int(numepoch)

trainldr = torch.utils.data.DataLoader(dataset=traindt, 
                                           batch_size=batchsiz, 
                                           shuffle=True)

testldr = torch.utils.data.DataLoader(dataset=testdt, 
                                          batch_size=batchsiz, 
                                          shuffle=False)
class rnn(nn.Module):
    def __init__(self, inpdim, hidendim, layerdim, outpdim):
        super(rnn, self).__init__()
        self.hidendim = hidendim

        self.layerdim = layerdim

        self.rnn = nn.RNN(inpdim, hidendim, layerdim, batch_first=True, nonlinearity='relu')
        self.fc = nn.Linear(hidendim, outpdim)

    def forward(self, y):
        h = torch.zeros(self.layerdim, y.size(0), self.hidendim).requires_grad_()
        outp, hx = self.rnn(y, h.detach())
        outp = self.fc(outp[:, -1, :]) 
        return outp
inpdim = 28
hidendim = 80
layerdim = 1
outpdim = 10
mdl = rnn(inpdim, hidendim, layerdim, outpdim)
criter= nn.CrossEntropyLoss()
l_r = 0.01

optim = torch.optim.SGD(mdl.parameters(), lr=l_r)  
list(mdl.parameters())[0].size()
seqdim = 28  

itr = 0
for epoch in range(numepoch):
    for x, (imgs, lbls) in enumerate(trainldr):
        mdl.train()
        imgs = imgs.view(-1, seqdim, inpdim).requires_grad_()
        optim.zero_grad()
        outps = mdl(imgs)
        loss = criter(outps, lbls)
        loss.backward()

        optim.step()

        itr += 1

        if itr % 500 == 0:
            mdl.eval()       
            crrct = 0
            ttl = 0
            for imgs, lbls in testldr:
                imgs = imgs.view(-1, seqdim, inpdim)

                outps = mdl(imgs)
                _, predicted = torch.max(outps.data, 1)

                ttl += lbls.size(0)

                crrct += (predicted == lbls).sum()

            accu = 100 * crrct / ttl

            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(itr, loss.item(), accu))

Output:

In the following output, we can see that the accuracy of the RNN model is printed on the screen.

[Image: PyTorch RNN activation function output]

Read: PyTorch Save Model – Complete Guide

PyTorch RNN binary classification

In this section, we will learn about PyTorch RNN binary classification in Python.

Binary classification predicts one of two classes, whereas multi-class classification predicts one of more than two classes.
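For a strictly binary problem, a common pattern is a single-logit readout trained with BCEWithLogitsLoss. Here is a minimal sketch with made-up shapes (note that the full example below uses CrossEntropyLoss over multiple classes instead):

import torch
import torch.nn as nn

class BinaryRNN(nn.Module):
    def __init__(self, inpsize, hidsize):
        super().__init__()
        self.rnn = nn.RNN(inpsize, hidsize, batch_first=True)
        self.fc = nn.Linear(hidsize, 1)    # one logit covers both classes

    def forward(self, x):
        outp, _ = self.rnn(x)
        return self.fc(outp[:, -1, :])     # logit from the last time step

modl = BinaryRNN(8, 16)
x = torch.randn(4, 10, 8)                  # (batch, seq, feature)
lbls = torch.randint(0, 2, (4,)).float()   # binary labels
loss = nn.BCEWithLogitsLoss()(modl(x).squeeze(1), lbls)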

Code:

In the following code, we will import the torch module and set up an RNN-based classifier.

  • device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') selects the device.
  • nn.Linear() creates the feed-forward readout layer.
  • modl = RNNModel(inpsize, hidsize, numlayrs, numclases).to(device) initializes the RNN model on the device.
  • optimizer = optim.Adam(modl.parameters(), lr=l_r) initializes the optimizer.
  • print(f"num_epochs: {numepchs}") prints the number of epochs.
import torch
import torch.nn as nn
from torchvision import datasets as dtsets
from torchvision.transforms import ToTensor
traindt = dtsets.MNIST(
    root = 'data',
    train = True,                         
    transform = ToTensor(), 
    download = True,            
)
testdt = dtsets.MNIST(
    root = 'data', 
    train = False, 
    transform = ToTensor()
)
from torch.utils.data import DataLoader
ldrs = {
    'train' : torch.utils.data.DataLoader(traindt, 
                                          batch_size=100, 
                                          shuffle=True, 
                                          num_workers=1),
    
    'test'  : torch.utils.data.DataLoader(testdt, 
                                          batch_size=100, 
                                          shuffle=True, 
                                          num_workers=1),
}
ldrs

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device
seqleng = 28
inpsize = 28
hidsize = 132
numlayrs = 6
numclases = 10
batchsiz = 100
numepchs = 6
l_r = 0.01
class RNNModel(nn.Module):
    
    def __init__(self, inpsize, hidsize, numlayrs, numclases):
        super(RNNModel, self).__init__()
        self.hidden_size = hidsize
        self.num_layers = numlayrs
        self.lstm = nn.LSTM(inpsize, hidsize, numlayrs, batch_first=True)
        self.fc = nn.Linear(hidsize, numclases)

    def forward(self, x):
        # initialize hidden and cell states with zeros on the input's device
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.lstm(x, (h0, c0))
        # classify from the hidden state of the last time step
        return self.fc(out[:, -1, :])


modl = RNNModel(inpsize, hidsize, numlayrs, numclases).to(device)
print(modl)
losfunc = nn.CrossEntropyLoss()
losfunc
from torch import optim
optimizer = optim.Adam(modl.parameters(), lr=l_r)   
optimizer
def train(numepchs, modl, ldrs):
    print(f"num_epochs: {numepchs}")
    print(f"model: {modl}")
    print(f"loaders['train']: {ldrs['train']}")
    
   
train(numepchs, modl, ldrs)
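Note that the train helper above only prints its arguments. A minimal sketch of an actual training loop for this model, reusing losfunc and optimizer from above, could look like this:

def train_loop(numepchs, modl, ldrs):
    modl.train()
    for epoch in range(numepchs):
        for imgs, lbls in ldrs['train']:
            # treat each 28x28 MNIST image as a sequence of 28 rows
            imgs = imgs.view(-1, seqleng, inpsize).to(device)
            lbls = lbls.to(device)
            outps = modl(imgs)
            loss = losfunc(outps, lbls)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print(f'epoch {epoch + 1}/{numepchs} loss: {loss.item():.4f}')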

Output:

In the following output, we can see that the PyTorch RNN binary classification model and its training setup are printed on the screen.

[Image: PyTorch RNN binary classification output]

Read: PyTorch fully connected layer

PyTorch RNN sentiment analysis

In this section, we will learn about PyTorch RNN sentiment analysis in Python.

Before moving forward, we should have some knowledge of sentiment analysis.

Sentiment analysis is a predictive modeling task where the model is trained to predict the polarity of textual data: positive, negative, or neutral.

Code:

  • characts = set(''.join(text)) joins all the sentences together and extracts the unique characters.
  • int2char = dict(enumerate(characts)) maps integers to characters.
  • inpseq.append(text[x][:-1]) removes the last character for the input sequence.
  • targseq.append(text[x][1:]) removes the first character for the target sequence.
  • features = np.zeros((batchsiz, seqleng, dicsiz), dtype=np.float32) creates a multidimensional array with the desired output shape.
  • hiden = self.init_hidden(batchsiz) initializes the hidden state.
  • modl = RNNModel(inpsize=dicsiz, outpsize=dicsiz, hidendim=12, nlayrs=1) instantiates the model with its hyperparameters.
  • optim = torch.optim.Adam(modl.parameters(), lr=l_r) initializes the optimizer.
  • print('Epochs: {}/{}.............'.format(epoch, nepchs), end=' ') prints the epochs.
import torch
from torch import nn

import numpy as np
text = ['hey Guides','How are you','Have a nice day']


characts = set(''.join(text))

int2char = dict(enumerate(characts))

char2int = {char: ind for ind, char in int2char.items()}

maxleng = len(max(text, key=len))

for x in range(len(text)):
  while len(text[x])<maxleng:
      text[x] += ' '

inpseq = []
targseq = []

for x in range(len(text)):

  inpseq.append(text[x][:-1])

  targseq.append(text[x][1:])
  print("Input Sequence: {}\nTarget Sequence: {}".format(inpseq[x], targseq[x]))
for i in range(len(text)):
    inpseq[i] = [char2int[character] for character in inpseq[i]]
    targseq[i] = [char2int[character] for character in targseq[i]]
dicsiz = len(char2int)
seqleng = maxleng - 1
batchsiz = len(text)

def one_hot_encode(sequen, dicsiz, seqleng, batchsiz):
    features = np.zeros((batchsiz, seqleng, dicsiz), dtype=np.float32)

    for x in range(batchsiz):
        for y in range(seqleng):
            features[x, y, sequen[x][y]] = 1
    return features

inpseq = one_hot_encode(inpseq, dicsiz, seqleng, batchsiz)
inpseq = torch.from_numpy(inpseq)
target_seq = torch.Tensor(targseq)

is_cuda = torch.cuda.is_available()

if is_cuda:
    device = torch.device("cuda")
    print("gpu is available")
else:
    device = torch.device("cpu")
    print("gpu is not available, CPU used")
class RNNModel(nn.Module):
    def __init__(self, inpsize, outpsize, hidendim, nlayrs):
        super(RNNModel, self).__init__()

        # Defining some parameters
        self.hidendim = hidendim
        self.nlayrs = nlayrs

        #Defining the layers
        self.rnn = nn.RNN(inpsize, hidendim, nlayrs, batch_first=True)   
        # Fully connected layer
        self.fc = nn.Linear(hidendim, outpsize)
    
    def forward(self, z):
        
        batchsiz = z.size(0)
        hiden = self.init_hidden(batchsiz)


        outp, hiden = self.rnn(z, hiden)

        outp = outp.contiguous().view(-1, self.hidendim)
        outp = self.fc(outp)
        
        return outp, hiden
    
    def init_hidden(self, batchsiz):
        
        hiden = torch.zeros(self.nlayrs, batchsiz, self.hidendim).to(device)
        return hiden

modl = RNNModel(inpsize=dicsiz, outpsize=dicsiz, hidendim=12, nlayrs=1)
modl.to(device)
nepchs = 100
l_r=0.01
criter = nn.CrossEntropyLoss()
optim = torch.optim.Adam(modl.parameters(), lr=l_r)
for epoch in range(1, nepchs + 1):
        optim.zero_grad()
        inpseq = inpseq.to(device)
        target_seq = target_seq.to(device)
        outp, hiden = modl(inpseq)
        loss = criter(outp, target_seq.view(-1).long())
        loss.backward() 
        optim.step() 

        if epoch%10 == 0:
            print('Epochs: {}/{}.............'.format(epoch, nepchs), end=' ')
            print("Loss: {:.4f}".format(loss.item()))

Output:

After running the above code, we get the following output in which we can see that the epochs and loss are printed on the screen.

[Image: PyTorch RNN sentiment analysis output]

Read: PyTorch MNIST Tutorial

PyTorch RNN language model

In this section, we will learn about the PyTorch RNN language model in Python.

  • An RNN language model is a neural network language model that uses a recurrent network to model text (a minimal sketch follows this list).
  • RNNs are well suited to modeling sequential data such as natural language.
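Note that the full example below classifies MNIST digits with an LSTM; for contrast, here is a minimal sketch of a character-level RNN language model with an embedding layer (the vocabulary size and layer widths are hypothetical):

import torch
import torch.nn as nn

vocab = 50                                  # hypothetical vocabulary size

class CharLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab, 32)
        self.rnn = nn.RNN(32, 64, batch_first=True)
        self.fc = nn.Linear(64, vocab)      # logits over the next character

    def forward(self, tokens):
        outp, _ = self.rnn(self.emb(tokens))
        return self.fc(outp)                # (batch, seq, vocab)

lm = CharLM()
tokens = torch.randint(0, vocab, (2, 16))   # (batch, seq) integer-encoded text
logits = lm(tokens)
# toy target: a real language model predicts the inputs shifted by one position
loss = nn.CrossEntropyLoss()(logits.view(-1, vocab), tokens.view(-1))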

Code:

In the following code, we will import the torch module and build the RNN language model.

  • traindt = dtsets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True) loads the dataset.
  • modl = RNNlM(inpsize, hidensize, numlayrs, numclasses, sequlen).to(device) initializes the RNN model.
  • optim = optim.Adam(modl.parameters(), lr=l_r) initializes the optimizer.
  • print(f'Got {numcrct}/{numsmples} with accuracy {float(numcrct)/float(numsmples)*100:.2f}') prints the accuracy of the model.
import torch
from tqdm import tqdm
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as dtsets
from torch.utils.data import DataLoader
from torchvision.transforms import transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Declaring Hyper-parameters
inpsize = 28
sequlen = 28
numlayrs = 2
hidensize = 254
numclasses = 10
l_r = 0.001
batchsiz = 62
numepchs = 2
class RNNlM(nn.Module):
   
   def __init__(self, inpsize, hidensize, numlayrs, numclasses, sequlen):
      super(RNNlM, self).__init__()
      self.hidensize = hidensize
      self.numlayrs = numlayrs
      self.lstm = nn.LSTM(inpsize, hidensize, numlayrs, batch_first=True)
      self.fc = nn.Linear(hidensize*sequlen, numclasses)
   
   def forward(self, data):
      h = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
      c = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
      
      outp, _ = self.lstm(data, (h, c))
      outp = outp.reshape(outp.shape[0], -1)
      outp = self.fc(outp)
      return outp

traindt = dtsets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True)
testdt = dtsets.MNIST(root='dataset/', train=False, transform=transforms.ToTensor(), download=True)

trainldr = DataLoader(dataset=traindt, batch_size=batchsiz, shuffle=True)
testldr = DataLoader(dataset=testdt, batch_size=batchsiz, shuffle=True)

modl = RNNlM(inpsize, hidensize, numlayrs, numclasses, sequlen).to(device)

criter = nn.CrossEntropyLoss()
optim = optim.Adam(modl.parameters(), lr=l_r)

# Training Loop
ep = 1
for epoch in tqdm(range(numepchs), desc=f'Training model for epoch {ep}/{numepchs}', total=numepchs):
   for batch_idx, (data, trgt) in enumerate(trainldr):
      data = data.to(device).squeeze(1)
      trgts = trgt.to(device)
      scores = modl(data)
      loss = criter(scores, trgts)
      optim.zero_grad()
      loss.backward()
      optim.step()
   print(f'epoch: {epoch + 1} step: {batch_idx + 1}/{len(trainldr)} loss: {loss}')
   ep += 1
   # Evaluating our RNN model
def check_accuracy(ldr, modlrnnlm):
   if ldr.dataset.train:
      print('Check accuracy on training data')
   else:
      print('Check accuracy on test data')
   
   numcrct = 0
   numsmples = 0
   modlrnnlm.eval()
   with torch.no_grad():
      for i,j in ldr:
         i = i.to(device).squeeze(1)
         j = j.to(device)
         score = modlrnnlm(i)
         _, predictions = score.max(1)
         numcrct += (predictions == j).sum()
         numsmples += predictions.size(0)
      
      print(f'Got {numcrct}/{numsmples} with accuracy {float(numcrct)/float(numsmples)*100:.2f}')
   modlrnnlm.train()
   

check_accuracy(trainldr, modl)
check_accuracy(testldr, modl)

Output:

In the following output, we can see that the accuracy on the training data and the test data is printed on the screen.

[Image: PyTorch RNN language model output]

Read: PyTorch Model Summary

PyTorch RNN Dataloader

In this section, we will learn about the PyTorch RNN DataLoader in Python.

A Dataset stores the training or test samples, while a DataLoader fetches samples from the Dataset and serves them in batches.
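This division of labor is easy to see with TensorDataset: the Dataset holds the samples, and the DataLoader batches and shuffles them (the tensors below are random stand-ins):

import torch
from torch.utils.data import DataLoader, TensorDataset

data = torch.randn(100, 30)                # 100 samples with 30 features each
lbls = torch.randint(0, 12, (100,))
dataset = TensorDataset(data, lbls)
loader = DataLoader(dataset, batch_size=10, shuffle=True)

for batch_data, batch_lbls in loader:
    print(batch_data.shape, batch_lbls.shape)  # torch.Size([10, 30]) torch.Size([10])
    break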

Code:

In the following code, we will import the torch module and load the MNIST dataset for an RNN.

  • class RNN(nn.Module): defines the RNN class.
  • traindt = datasets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True) is used as the dataset.
  • trainldr = DataLoader(dataset=traindt, batch_size=batchsiz, shuffle=True) loads the dataset in batches.
import torch
from tqdm import tqdm
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as datasets
from torch.utils.data import DataLoader
from torchvision.transforms import transforms
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
inpsize = 30
seqlen = 30
numlayrs = 4
hidensize = 258
numclasses = 12
lr = 0.001
batchsiz = 66
numepchs = 4
class RNN(nn.Module):
   
   def __init__(self, inpsize, hidensize, numlayrs, numclasses, seqlen):
      super(RNN, self).__init__()
      self.hidensize = hidensize
      self.numlayrs = numlayrs
      self.lstm = nn.LSTM(inpsize, hidensize, numlayrs, batch_first=True)
      self.fc = nn.Linear(hidensize*seqlen, numclasses)
   
   def forward(self, data):
      h1 = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
      c1 = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
      
      outp, _ = self.lstm(data, (h1, c1))
      outp = outp.reshape(outp.shape[0], -1)
      outp = self.fc(outp)
      return outp
     
traindt = datasets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True)
testdt = datasets.MNIST(root='dataset/', train=False, transform=transforms.ToTensor(), download=True)

trainldr = DataLoader(dataset=traindt, batch_size=batchsiz, shuffle=True)
testldr = DataLoader(dataset=testdt, batch_size=batchsiz, shuffle=True)

Output:

After running the above code, we get the following output in which we can see that the MNIST data is downloaded and loaded into the DataLoader.

[Image: PyTorch RNN DataLoader output]


So, in this tutorial, we discussed the PyTorch RNN and covered different examples related to its implementation. Here is the list of topics we have covered.

  • PyTorch RNN
  • PyTorch RNN example
  • PyTorch RNN cell
  • PyTorch RNN activation function
  • PyTorch RNN binary classification
  • PyTorch RNN sentiment analysis
  • PyTorch RNN language model
  • PyTorch RNN Dataloader