In this Python tutorial, we will learn about the PyTorch RNN, and we will also cover different examples related to PyTorch RNN. These are the topics we will cover:
- PyTorch RNN
- PyTorch RNN example
- PyTorch RNN cell
- PyTorch RNN activation function
- PyTorch RNN binary classification
- PyTorch RNN sentiment analysis
- PyTorch RNN language model
- PyTorch RNN Dataloader
Also, check out the latest PyTorch tutorial: PyTorch Activation Function
PyTorch RNN
In this section, we will learn about the PyTorch RNN model in python.
RNN stands for Recurrent Neural Network. It is a class of artificial neural networks that works on sequential or time-series data, and it is mainly used for ordinal or temporal problems.
Syntax:
The syntax of PyTorch RNN:
torch.nn.RNN(input_size, hidden_size, num_layers, nonlinearity='tanh', bias=True, batch_first=False, dropout=0, bidirectional=False)
Parameters:
- input_size: The number of expected features in the input x.
- hidden_size: The number of features in the hidden state h.
- num_layers: The number of recurrent layers; for example, num_layers=2 stacks two RNNs on top of each other.
- nonlinearity: The non-linearity to use, either 'tanh' or 'relu'. The default is 'tanh'.
- bias: If False, the layer does not use bias weights. The default value of bias is True.
- batch_first: If True, the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). The default value of batch_first is False (see the sketch after this list).
- dropout: If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. The default value of dropout is 0.
- bidirectional: If True, the layer becomes a bidirectional RNN. The default value of bidirectional is False.
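As a quick illustration of these parameters (the sizes here are our own, not from the tutorial), the following sketch contrasts the default (seq, batch, feature) layout with batch_first=True:
import torch
import torch.nn as nn

# default layout: input is (seq, batch, feature)
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2)
seq_first = torch.randn(5, 3, 10)
out, h = rnn(seq_first)
print(out.shape)    # torch.Size([5, 3, 20])

# batch_first=True: input is (batch, seq, feature)
rnn_bf = nn.RNN(input_size=10, hidden_size=20, num_layers=2, batch_first=True)
batch_first_inp = torch.randn(3, 5, 10)
out_bf, h_bf = rnn_bf(batch_first_inp)
print(out_bf.shape) # torch.Size([3, 5, 20])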
Read: PyTorch Tensor to Numpy
PyTorch RNN example
In this section, we will learn how to implement the PyTorch RNN example in python.
A Recurrent Neural Network is a kind of neural network where the output from the previous step is fed as input to the current step.
The important feature of an RNN is the hidden state, which retains some information about the sequence.
Code:
In the following code, we will import the torch module from which we can calculate the output of RNN.
- recnn = nn.RNN(12, 22, 4) creates an RNN for a sequence prediction problem, with input size 12, hidden size 22, and 4 recurrent layers.
- inp = torch.randn(7, 5, 12) generates a random input tensor of shape (seq_len, batch, input_size).
- h = torch.randn(4, 5, 22) generates a random initial hidden state of shape (num_layers, batch, hidden_size).
- outp, hn = recnn(inp, h) is used to get the output.
- print(outp) is used to print the output on the screen.
import torch
import torch.nn as nn
recnn = nn.RNN(12, 22, 4)
inp = torch.randn(7, 5, 12)
h = torch.randn(4, 5, 22)
outp, hn = recnn(inp, h)
print(outp)
Output:
After running the above code, we get the following output in which we can see that the PyTorch RNN output tensor is printed on the screen.
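Note that outp has shape (seq_len, batch, hidden_size) and hn has shape (num_layers, batch, hidden_size); printing the shapes is a quick way to confirm this:
print(outp.shape)   # torch.Size([7, 5, 22])
print(hn.shape)     # torch.Size([4, 5, 22])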
Read: PyTorch Batch Normalization
PyTorch RNN Cell
In this section, we will learn about the PyTorch RNN cell in python.
An RNN cell is a unit that maintains a state and executes operations that take a matrix of inputs.
RNN cells differ from regular neurons in that they have a state and can remember information from previous time steps.
Syntax:
torch.nn.RNNCell(input_size, hidden_size, bias = True, nonlinearity = 'tanh', device = None, dtype = None)
Parameters:
- input_size: The number of expected features in the input x.
- hidden_size: The number of features in the hidden state h.
- bias: If False, the layer does not use bias weights. The default value of bias is True.
- nonlinearity: The non-linearity to use, either 'tanh' or 'relu'. The default is 'tanh'.
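Unlike nn.RNN, which consumes a whole sequence at once, nn.RNNCell processes a single time step, so you loop over the sequence yourself. Here is a minimal sketch (the sizes are illustrative, not from the tutorial):
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=12, hidden_size=22)
inp = torch.randn(7, 5, 12)        # (seq_len, batch, input_size)
hx = torch.zeros(5, 22)            # (batch, hidden_size)
outputs = []
for t in range(inp.size(0)):
    hx = cell(inp[t], hx)          # one time step at a time
    outputs.append(hx)
print(outputs[-1].shape)           # torch.Size([5, 22])
This per-step control is what makes RNNCell useful when you need custom logic between time steps.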
Read: Keras Vs PyTorch – Key Differences
PyTorch RNN activation function
In this section, we will learn about the PyTorch RNN activation function in python.
The PyTorch RNN activation function defines how the weighted sum of the inputs is transformed into an output from a node or nodes in a layer of the network.
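In nn.RNN, this transformation is selected with the nonlinearity argument, which can be 'tanh' (the default) or 'relu'. A minimal sketch, with sizes of our own choosing:
import torch
import torch.nn as nn

rnn_tanh = nn.RNN(10, 20, 2)                       # nonlinearity='tanh' by default
rnn_relu = nn.RNN(10, 20, 2, nonlinearity='relu')  # relu applied at each step
inp = torch.randn(5, 3, 10)
out_t, _ = rnn_tanh(inp)
out_r, _ = rnn_relu(inp)
print(out_t.min().item() >= -1 and out_t.max().item() <= 1)  # True: tanh output is bounded
print(out_r.min().item() >= 0)                               # True: relu output is non-negative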
Code:
In the following code, we will import the torch module and train an RNN model that uses the relu activation function.
- traindt = dtsets.MNIST(root='./data', train=True, transform=transform.ToTensor(), download=True) is used to load the training dataset.
- self.hidendim = hidendim stores the hidden dimension.
- self.layerdim = layerdim stores the number of hidden layers.
- self.rnn = nn.RNN(inpdim, hidendim, layerdim, batch_first=True, nonlinearity='relu') builds the RNN model.
- self.fc = nn.Linear(hidendim, outpdim) is used as the readout layer.
- h = torch.zeros(self.layerdim, y.size(0), self.hidendim).requires_grad_() initializes the hidden state with zeros.
- outp = self.fc(outp[:, -1, :]) indexes the hidden state of the last time step.
- optim = torch.optim.SGD(mdl.parameters(), lr=l_r) initializes the optimizer.
- imgs = imgs.view(-1, seqdim, inpdim).requires_grad_() loads the images as tensors with gradient tracking.
- optim.zero_grad() clears the gradients with respect to the parameters.
- loss = criter(outps, lbls) calculates the loss.
- optim.step() updates the parameters.
- outps = mdl(imgs) is a forward pass used only to get the outputs.
- _, predicted = torch.max(outps.data, 1) gets the predictions from the maximum value.
- ttl += lbls.size(0) accumulates the total number of labels.
- crrct += (predicted == lbls).sum() accumulates the total correct predictions.
- print('Iteration: {}. Loss: {}. Accuracy: {}'.format(itr, loss.item(), accu)) prints the iteration, loss, and accuracy on the screen.
import torch
import torch.nn as nn
import torchvision.transforms as transform
import torchvision.datasets as dtsets
traindt = dtsets.MNIST(root='./data',
                       train=True,
                       transform=transform.ToTensor(),
                       download=True)
testdt = dtsets.MNIST(root='./data',
                      train=False,
                      transform=transform.ToTensor())
batchsiz = 80
nitrs = 2800
numepoch = nitrs / (len(traindt) / batchsiz)
numepoch = int(numepoch)
trainldr = torch.utils.data.DataLoader(dataset=traindt,
                                       batch_size=batchsiz,
                                       shuffle=True)
testldr = torch.utils.data.DataLoader(dataset=testdt,
                                      batch_size=batchsiz,
                                      shuffle=False)
class rnn(nn.Module):
    def __init__(self, inpdim, hidendim, layerdim, outpdim):
        super(rnn, self).__init__()
        self.hidendim = hidendim
        self.layerdim = layerdim
        self.rnn = nn.RNN(inpdim, hidendim, layerdim, batch_first=True, nonlinearity='relu')
        self.fc = nn.Linear(hidendim, outpdim)

    def forward(self, y):
        h = torch.zeros(self.layerdim, y.size(0), self.hidendim).requires_grad_()
        outp, hx = self.rnn(y, h.detach())
        outp = self.fc(outp[:, -1, :])
        return outp
inpdim = 28
hidendim = 80
layerdim = 1
outpdim = 10
mdl = rnn(inpdim, hidendim, layerdim, outpdim)
criter= nn.CrossEntropyLoss()
l_r = 0.01
optim = torch.optim.SGD(mdl.parameters(), lr=l_r)
list(mdl.parameters())[0].size()
seqdim = 28
itr = 0
for epoch in range(numepoch):
    for x, (imgs, lbls) in enumerate(trainldr):
        mdl.train()
        imgs = imgs.view(-1, seqdim, inpdim).requires_grad_()
        optim.zero_grad()
        outps = mdl(imgs)
        loss = criter(outps, lbls)
        loss.backward()
        optim.step()
        itr += 1
        if itr % 500 == 0:
            mdl.eval()
            crrct = 0
            ttl = 0
            for imgs, lbls in testldr:
                imgs = imgs.view(-1, seqdim, inpdim)
                outps = mdl(imgs)
                _, predicted = torch.max(outps.data, 1)
                ttl += lbls.size(0)
                crrct += (predicted == lbls).sum()
            accu = 100 * crrct / ttl
            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(itr, loss.item(), accu))
Output:
In the following output, we can see that the loss and accuracy of the RNN model are printed on the screen.
Read: PyTorch Save Model – Complete Guide
PyTorch RNN binary classification
In this section, we will learn about the PyTorch RNN binary classification in python.
Binary classification involves predicting one of two classes, while multi-class classification involves predicting one of more than two classes.
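As a side note, the code below actually uses CrossEntropyLoss with several output units, which also covers the multi-class case. For a strictly binary head, a common alternative (a hedged sketch with illustrative sizes, not the code this section uses) is a single logit trained with BCEWithLogitsLoss:
import torch
import torch.nn as nn

# a minimal sketch of a binary head on top of an RNN; all sizes are illustrative
class BinaryRNN(nn.Module):
    def __init__(self, inp, hid):
        super().__init__()
        self.rnn = nn.RNN(inp, hid, batch_first=True)
        self.fc = nn.Linear(hid, 1)      # one logit for two classes
    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])    # logit from the last time step

mdl = BinaryRNN(8, 16)
x = torch.randn(4, 10, 8)                # (batch, seq, feature)
lbl = torch.randint(0, 2, (4, 1)).float()
loss = nn.BCEWithLogitsLoss()(mdl(x), lbl)
print(loss.item())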
Code:
In the following code, we will import the torch module and set up the data, model, and optimizer for a classification model.
- device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') is used as the device configuration.
- nn.Linear() is used to create a feed-forward layer.
- modl = RNNModel(inpsize, hidsize, numlayrs, numclases).to(device) initializes the RNN model on the device.
- optim = optim.Adam(modl.parameters(), lr = 0.01) initializes the optimizer.
- print(f"num_epochs: {numepchs}") prints the number of epochs.
import torch
import torch.nn as nn
from torchvision import datasets as dtsets
from torchvision.transforms import ToTensor
traindt = dtsets.MNIST(
    root = 'data',
    train = True,
    transform = ToTensor(),
    download = True,
)
testdt = dtsets.MNIST(
    root = 'data',
    train = False,
    transform = ToTensor()
)
from torch.utils.data import DataLoader
ldrs = {
    'train' : torch.utils.data.DataLoader(traindt,
                                          batch_size=100,
                                          shuffle=True,
                                          num_workers=1),
    'test'  : torch.utils.data.DataLoader(testdt,
                                          batch_size=100,
                                          shuffle=True,
                                          num_workers=1),
}
ldrs
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device
seqleng = 32
inpsize = 32
hidsize = 132
numlayrs = 6
numclases = 14
batchsiz = 100
numepchs = 6
l_r = 0.01
class RNNModel(nn.Module):
    def __init__(self, inpsiz, hidsize, numlayrs, numclases):
        super(RNNModel, self).__init__()
        self.hidden_size = hidsize
        self.num_layers = numlayrs
        self.lstm = nn.LSTM(inpsiz, hidsize, numlayrs, batch_first=True)
        self.fc = nn.Linear(hidsize, numclases)
    def forward(self, x):
        # minimal forward pass (missing in the original): zero initial states,
        # run the LSTM, and classify from the last time step
        h = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.lstm(x, (h, c))
        return self.fc(out[:, -1, :])
modl = RNNModel(inpsize, hidsize, numlayrs, numclases).to(device)
print(modl)
losfunc = nn.CrossEntropyLoss()
losfunc
from torch import optim
optim = optim.Adam(modl.parameters(), lr = 0.01)
optim
def train(numepchs, modl, ldrs):
    print(f"num_epochs: {numepchs}")
    print(f"model: {modl}")
    print(f"loaders['train']: {ldrs['train']}")
train(numepchs, modl, ldrs)
Output:
In the following output, we can see that the configuration of the PyTorch RNN binary classification model is printed on the screen.
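Since the train() function above only prints the configuration, here is a hedged sketch of what an actual training loop for this model could look like (the train_loop name is ours). It assumes inpsize and seqleng are both set to 28 so that MNIST's 28-pixel rows match the LSTM input size (the values above are 32, which would not match):
# a hedged sketch of a real training loop for the model above;
# assumes inpsize = seqleng = 28 to match MNIST rows
def train_loop(numepchs, modl, ldrs):
    for epoch in range(numepchs):
        for imgs, lbls in ldrs['train']:
            imgs = imgs.view(-1, 28, 28).to(device)   # (batch, seq, feature)
            lbls = lbls.to(device)
            outs = modl(imgs)
            loss = losfunc(outs, lbls)
            optim.zero_grad()
            loss.backward()
            optim.step()
        print(f'epoch {epoch + 1}: loss {loss.item():.4f}')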
Read: PyTorch fully connected layer
PyTorch RNN sentiment analysis
In this section, we will learn about the PyTorch RNN sentiment analysis in python.
Before moving forward, we should have some knowledge of sentiment analysis.
Sentiment Analysis is a predictive modeling task where the model is trained to predict the polarity of textual data: positive, negative, or neutral.
Code:
- characts = set(''.join(text)) joins all the sentences together and extracts the unique characters.
- int2char = dict(enumerate(characts)) maps integers to characters.
- inpseq.append(text[x][:-1]) removes the last character for the input sequence.
- targseq.append(text[x][1:]) removes the first character for the target sequence.
- features = np.zeros((batchsiz, seqleng, dicsiz), dtype=np.float32) creates a multidimensional array with the desired output shape.
- hiden = self.init_hidden(batchsiz) initializes the hidden state.
- modl = RNNModel(inpsize=dicsiz, outpsize=dicsiz, hidendim=12, nlayrs=1) instantiates the model with its hyperparameters.
- optim = torch.optim.Adam(modl.parameters(), lr=l_r) initializes the optimizer.
- print('Epochs: {}/{}.............'.format(epoch, nepchs), end=' ') prints the epochs.
import torch
from torch import nn
import numpy as np
text = ['hey Guides','How are you','Have a nice day']
characts = set(''.join(text))
int2char = dict(enumerate(characts))
char2int = {char: ind for ind, char in int2char.items()}
maxleng = len(max(text, key=len))
for x in range(len(text)):
    while len(text[x]) < maxleng:
        text[x] += ' '
inpseq = []
targseq = []
for x in range(len(text)):
    inpseq.append(text[x][:-1])
    targseq.append(text[x][1:])
    print("Input Sequence: {}\nTarget Sequence: {}".format(inpseq[x], targseq[x]))
for i in range(len(text)):
    inpseq[i] = [char2int[character] for character in inpseq[i]]
    targseq[i] = [char2int[character] for character in targseq[i]]
dicsiz = len(char2int)
seqleng = maxleng - 1
batchsiz = len(text)
def one_hot_encode(sequen, dicsiz, seqleng, batchsiz):
    features = np.zeros((batchsiz, seqleng, dicsiz), dtype=np.float32)
    for x in range(batchsiz):
        for y in range(seqleng):
            features[x, y, sequen[x][y]] = 1
    return features
inpseq = one_hot_encode(inpseq, dicsiz, seqleng, batchsiz)
inpseq = torch.from_numpy(inpseq)
target_seq = torch.Tensor(targseq)
is_cuda = torch.cuda.is_available()
if is_cuda:
    device = torch.device("cuda")
    print("gpu is available")
else:
    device = torch.device("cpu")
    print("gpu is not available, CPU used")
class RNNModel(nn.Module):
    def __init__(self, inpsize, outpsize, hidendim, nlayrs):
        super(RNNModel, self).__init__()
        # Defining some parameters
        self.hidendim = hidendim
        self.nlayrs = nlayrs
        # Defining the layers
        self.rnn = nn.RNN(inpsize, hidendim, nlayrs, batch_first=True)
        # Fully connected layer
        self.fc = nn.Linear(hidendim, outpsize)

    def forward(self, z):
        batchsiz = z.size(0)
        hiden = self.init_hidden(batchsiz)
        outp, hiden = self.rnn(z, hiden)
        outp = outp.contiguous().view(-1, self.hidendim)
        outp = self.fc(outp)
        return outp

    def init_hidden(self, batchsiz):
        # create the initial hidden state on the same device as the model parameters
        hiden = torch.zeros(self.nlayrs, batchsiz, self.hidendim,
                            device=next(self.parameters()).device)
        return hiden
modl = RNNModel(inpsize=dicsiz, outpsize=dicsiz, hidendim=12, nlayrs=1)
modl.to(device)
nepchs = 100
l_r=0.01
criter = nn.CrossEntropyLoss()
optim = torch.optim.Adam(modl.parameters(), lr=l_r)
for epoch in range(1, nepchs + 1):
    optim.zero_grad()
    inpseq = inpseq.to(device)   # .to() returns a new tensor, so reassign
    outp, hiden = modl(inpseq)
    loss = criter(outp, target_seq.view(-1).long().to(device))
    loss.backward()
    optim.step()
    if epoch % 10 == 0:
        print('Epochs: {}/{}.............'.format(epoch, nepchs), end=' ')
        print("Loss: {:.4f}".format(loss.item()))
Output:
After running the above code, we get the following output in which we can see that the epochs and loss are printed on the screen.
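Once trained, this character-level model can also generate text one character at a time. Here is a hedged sketch of such a sampling helper (the predict and sample names are ours, not from the tutorial); it reuses the one_hot_encode function and the char2int/int2char mappings defined above:
def predict(modl, character):
    # one-hot encode the characters seen so far and take the most likely next character
    chars = [char2int[c] for c in character]
    inp = one_hot_encode([chars], dicsiz, len(chars), 1)
    inp = torch.from_numpy(inp).to(device)
    out, hiden = modl(inp)
    prob = nn.functional.softmax(out[-1], dim=0)
    return int2char[int(prob.argmax())]

def sample(modl, outlen, start='hey'):
    modl.eval()
    chars = list(start)
    for _ in range(outlen - len(chars)):
        chars.append(predict(modl, chars))
    return ''.join(chars)

print(sample(modl, 15, 'hey'))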
Read: PyTorch MNIST Tutorial
PyTorch RNN language model
In this section, we will learn about the PyTorch RNN language model in python.
- An RNN language model is a kind of neural network language model that uses an RNN inside the network.
- An RNN is well suited for modeling sequential data such as natural language.
Code:
In the following code, we will import the torch module and build the RNN language model.
- traindt = dtsets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True) is used to load the dataset.
- modl = RNNlM(inpsize, hidensize, numlayrs, numclasses, sequlen).to(device) initializes the RNN model.
- optim = optim.Adam(modl.parameters(), lr=l_r) initializes the optimizer.
- print(f'Got {numcrct}/{numsmples} with accuracy {float(numcrct)/float(numsmples)*100:.2f}') prints the accuracy of the model.
import torch
from tqdm import tqdm
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as dtsets
from torch.utils.data import DataLoader
from torchvision.transforms import transforms
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Declaring Hyper-parameters
inpsize = 28
sequlen = 28
numlayrs = 2
hidensize = 254
numclasses = 10
l_r = 0.001
batchsiz = 62
numepchs = 2
class RNNlM(nn.Module):
    def __init__(self, inpsize, hidensize, numlayrs, numclasses, sequlen):
        super(RNNlM, self).__init__()
        self.hidensize = hidensize
        self.numlayrs = numlayrs
        self.lstm = nn.LSTM(inpsize, hidensize, numlayrs, batch_first=True)
        self.fc = nn.Linear(hidensize*sequlen, numclasses)

    def forward(self, data):
        h = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
        c = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
        outp, _ = self.lstm(data, (h, c))
        outp = outp.reshape(outp.shape[0], -1)
        outp = self.fc(outp)
        return outp
traindt = dtsets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True)
testdt = dtsets.MNIST(root='dataset/', train=False, transform=transforms.ToTensor(), download=True)
trainldr = DataLoader(dataset=traindt, batch_size=batchsiz, shuffle=True)
testldr = DataLoader(dataset=testdt, batch_size=batchsiz, shuffle=True)
modl = RNNlM(inpsize, hidensize, numlayrs, numclasses, sequlen).to(device)
criter = nn.CrossEntropyLoss()
optim = optim.Adam(modl.parameters(), lr=l_r)
# Training Loop
ep = 1
for epoch in tqdm(range(numepchs), desc=f'Training model for epoch {ep}/{numepchs}', total=numepchs):
    for batch_idx, (data, trgt) in enumerate(trainldr):
        data = data.to(device).squeeze(1)
        trgts = trgt.to(device)
        scores = modl(data)
        loss = criter(scores, trgts)
        optim.zero_grad()
        loss.backward()
        optim.step()
    print(f'epoch: {epoch + 1} step: {batch_idx + 1}/{len(trainldr)} loss: {loss}')
    ep += 1
# Evaluating our RNN model
def check_accuracy(ldr, modlrnnlm):
    if ldr.dataset.train:
        print('Check accuracy on training data')
    else:
        print('Check accuracy on test data')
    numcrct = 0
    numsmples = 0
    modlrnnlm.eval()
    with torch.no_grad():
        for i, j in ldr:
            i = i.to(device).squeeze(1)
            j = j.to(device)
            score = modlrnnlm(i)
            _, predictions = score.max(1)
            numcrct += (predictions == j).sum()
            numsmples += predictions.size(0)
    print(f'Got {numcrct}/{numsmples} with accuracy {float(numcrct)/float(numsmples)*100:.2f}')
    modlrnnlm.train()

check_accuracy(trainldr, modl)
check_accuracy(testldr, modl)
Output:
In the following output, we can see that the accuracy of the train data and test data is printed on the screen.
Read: PyTorch Model Summary
PyTorch RNN Dataloader
In this section, we will learn about the PyTorch RNN dataloader in python.
A dataset loads the training or test data into memory, and a dataloader fetches the data from the dataset and serves it in batches.
Code:
In the following code, we will import the torch module and load the dataset for the RNN model.
- class RNN(nn.Module): defines the RNN class.
- traindt = datasets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True) is used as the dataset.
- trainldr = DataLoader(dataset=traindt, batch_size=batchsiz, shuffle=True) loads the dataset in batches.
import torch
from tqdm import tqdm
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as datasets
from torch.utils.data import DataLoader
from torchvision.transforms import transforms
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
inpsize = 30
seqlen = 30
numlayrs = 4
hidensize = 258
numclasses = 12
lr = 0.001
batchsiz = 66
numepchs = 4
class RNN(nn.Module):
    def __init__(self, inpsize, hidensize, numlayrs, numclasses, seqlen):
        super(RNN, self).__init__()
        self.hidensize = hidensize
        self.numlayrs = numlayrs
        self.lstm = nn.LSTM(inpsize, hidensize, numlayrs, batch_first=True)
        self.fc = nn.Linear(hidensize*seqlen, numclasses)

    def forward(self, data):
        h1 = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
        c1 = torch.zeros(self.numlayrs, data.size(0), self.hidensize).to(device)
        outp, _ = self.lstm(data, (h1, c1))
        outp = outp.reshape(outp.shape[0], -1)
        outp = self.fc(outp)
        return outp
traindt = datasets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True)
testdt = datasets.MNIST(root='dataset/', train=False, transform=transforms.ToTensor(), download=True)
trainldr = DataLoader(dataset=traindt, batch_size=batchsiz, shuffle=True)
testldr = DataLoader(dataset=testdt, batch_size=batchsiz, shuffle=True)
Output:
After running the above code, we get the following output in which we can see that the RNN model's data is loaded in batches.
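To see what the dataloader actually produces, here is a minimal sketch that pulls a single batch and inspects its shape. Note that the model above is configured with inpsize = 30 and seqlen = 30, so feeding it 28x28 MNIST rows directly would require changing those values to 28; the reshape here is illustrative:
# pull one batch from the loader and inspect its shape
imgs, lbls = next(iter(trainldr))
print(imgs.shape)    # torch.Size([66, 1, 28, 28])
print(lbls.shape)    # torch.Size([66])
# for the RNN above, each image is treated as a sequence of rows:
seq = imgs.squeeze(1)
print(seq.shape)     # torch.Size([66, 28, 28]) -> (batch, seq, feature)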
You may also like to read the following PyTorch tutorials.
- PyTorch Dataloader + Examples
- PyTorch Logistic Regression
- PyTorch Early Stopping + Examples
- PyTorch MSELoss – Detailed Guide
So, in this tutorial, we discussed the PyTorch RNN and covered different examples related to its implementation. Here is the list of topics that we have covered.
- PyTorch RNN
- PyTorch RNN example
- PyTorch RNN cell
- PyTorch RNN activation function
- PyTorch RNN binary classification
- PyTorch RNN sentiment analysis
- PyTorch RNN language model
- PyTorch RNN Dataloader
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, Scikit-Learn, etc., for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc. Check out my profile.