PyTorch Dataloader + Examples

In this Python tutorial, we will learn about the PyTorch DataLoader and work through several examples of using it. We will cover the following topics.

  • PyTorch dataloader
  • PyTorch dataloader example
  • PyTorch dataloader from the directory
  • PyTorch dataloader train test split
  • PyTorch dataloader for text
  • PyTorch dataloader Cuda
  • PyTorch dataloader num_workers
  • PyTorch dataloader add dimension
  • PyTorch dataloader access dataset
  • PyTorch dataloader batch size
  • PyTorch dataloader epoch
  • PyTorch dataloader enumerate
  • PyTorch dataloader dataframe
  • PyTorch dataloader batch sampler

PyTorch Dataloader

In this section, we will learn about how the PyTorch dataloader works in python.

The DataLoader is a class that wraps a dataset and provides an iterable over it. It takes care of batching and shuffling the samples and can optionally load them in parallel worker processes.

Syntax:

The following is the syntax of the DataLoader in PyTorch (the most commonly used parameters are shown):

DataLoader(dataset,batch_size=1,shuffle=False,sampler=None,batch_sampler=None,num_workers=0,collate_fn=None,pin_memory=False,drop_last=False,timeout=0,worker_init_fn=None)

Parameters:

The parameters used in the DataLoader syntax:

  • dataset: The dataset to load the samples from. It is the only required argument.
  • batch_size: The number of samples in every batch (default 1).
  • shuffle: If True, the data is reshuffled at the start of every epoch.
  • sampler: Defines the strategy used to draw sample indices from the dataset. If it is given, shuffle must not be set.
  • batch_sampler: Like sampler, but it yields the indices of a whole batch at a time. It cannot be combined with batch_size, shuffle, sampler, or drop_last.
  • num_workers: The number of subprocesses used for loading the data. 0 means that the data is loaded in the main process.
  • collate_fn: Merges a list of samples into a batch.
  • pin_memory: If True, the batches are copied into pinned (page-locked) host memory, which speeds up the transfer to the GPU.
  • drop_last: If True, the last batch is dropped when it contains fewer samples than batch_size.
  • timeout: The maximum time, in seconds, to wait while collecting a batch from the workers.
  • worker_init_fn: A function that is called in each worker process, with the worker id as its argument, before data loading starts.
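
As a minimal sketch of how a few of these parameters interact, the snippet below wraps a small synthetic TensorDataset in two dataloaders. The synthetic data and all variable names are assumptions made here so that nothing has to be downloaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic data (assumption): 10 samples with 3 features each, plus integer labels.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10)
toy_dataset = TensorDataset(features, labels)

# batch_size=4 gives batches of 4, 4 and 2 samples; drop_last=True discards the short one.
keep_all = DataLoader(toy_dataset, batch_size=4, shuffle=False, drop_last=False)
drop_short = DataLoader(toy_dataset, batch_size=4, shuffle=False, drop_last=True)

batch_sizes_keep = [len(y) for _, y in keep_all]
batch_sizes_drop = [len(y) for _, y in drop_short]
print(batch_sizes_keep)
print(batch_sizes_drop)
```

Calling len() on a dataloader returns the number of batches, so the two loaders above report different lengths for the same dataset.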

PyTorch Dataloader Example

In this section, we will learn about how to implement the dataloader in PyTorch with the help of examples in python.

In this example, we read the annotation data from a CSV file and inspect one record. This is typically the first step before the data is wrapped in a Dataset and handed to a DataLoader, which makes loading the data very easy.

Code:

In the following code, we will import some libraries from which we can load the data.

  • warnings.filterwarnings('ignore') is used to suppress the warnings.
  • plot.ion() is used to turn on the interactive mode.
  • landmarkFrame = pds.read_csv('face_landmarks.csv') is used to read the CSV file.
  • landmarkFrame.iloc[x, 0] uses integer location-based indexing to select the image name in row x.
  • num.asarray(landmark) is used to convert the input to an array.
  • print('Image Name: {}'.format(imagename)) is used to print the image name on the screen.
  • print('Landmark Shape: {}'.format(landmark.shape)) is used to print the landmark shape on the screen.
  • print('First Six Landmark: {}'.format(landmark[:6])) is used to print the first six landmarks on the screen.

from __future__ import print_function, division
import os
import torch
import pandas as pds
from skimage import io, transform
import numpy as num
import matplotlib.pyplot as plot
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils


import warnings
warnings.filterwarnings('ignore')

plot.ion() 
landmarkFrame = pds.read_csv('face_landmarks.csv')

x = 67
imagename = landmarkFrame.iloc[x, 0]
landmark = landmarkFrame.iloc[x, 1:]
landmark = num.asarray(landmark)
# Each landmark is an (x, y) pair, so reshape into one pair per row.
landmark = landmark.astype('float').reshape(-1, 2)

print('Image Name: {}'.format(imagename))
print('Landmark Shape: {}'.format(landmark.shape))
print('First Six Landmark: {}'.format(landmark[:6]))

Output:

After running the above code, we get the following output in which we can see that the ImageName, Landmark Shape, and First Six Landmark are printed on the screen.

PyTorch dataloader example

PyTorch dataloader from the directory

In this section, we will learn about the PyTorch dataloader from the directory in python.

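A dataset is usually stored in a directory on disk, that is, in a collection of files and subdirectories. In the example below, torchvision stores FashionMNIST in the directory given by root, and the DataLoader then reads batches from the dataset kept there.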
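
For files of your own, one possible approach is a small custom Dataset that lists the files of a directory. The sketch below is only an illustration under stated assumptions: the class name TextFileDataset, the temporary directory, and the file names are all hypothetical and are created on the fly so that the snippet is self-contained.

```python
import tempfile
from pathlib import Path

from torch.utils.data import DataLoader, Dataset


class TextFileDataset(Dataset):
    """Illustrative dataset: one sample per .txt file in a directory."""

    def __init__(self, directory):
        # Sort so that the sample order is deterministic.
        self.paths = sorted(Path(directory).glob("*.txt"))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        return path.name, path.read_text()


# Build a throwaway directory with three small files (assumption, for demonstration).
with tempfile.TemporaryDirectory() as tmp_dir:
    for i in range(3):
        (Path(tmp_dir) / f"sample_{i}.txt").write_text(f"contents of file {i}")

    directory_loader = DataLoader(TextFileDataset(tmp_dir), batch_size=2, shuffle=False)
    # Read every batch while the directory still exists.
    directory_batches = [(list(names), list(texts)) for names, texts in directory_loader]

print(directory_batches)
```

For image folders, torchvision offers ready-made dataset classes, so a hand-written class like this one is mainly useful for custom file layouts.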

Code:

In the following code, we will import some libraries from which we can load the data from the directory.

  • trainingdata = datasets.FashionMNIST() is used to create the training dataset.
  • testdata = datasets.FashionMNIST() is used to create the test dataset.
  • traindl = DataLoader(trainingdata, batch_size=60, shuffle=True) is used to load the training data.
  • testdl = DataLoader(testdata, batch_size=60, shuffle=True) is used to load the test data.
  • print(f"Feature Batch Shape: {trainfeature.size()}") is used to print the feature batch shape.
  • print(f"Label Batch Shape: {trainlabel.size()}") is used to print the label batch shape.
  • plot.imshow(imgdir, cmap="gray") is used to plot the image on the screen.
  • print(f"Labels: {labels}") is used to print the label on the screen.

import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plot


# Download FashionMNIST into the "data" directory (skipped if it is already there).
trainingdata = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

testdata = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

traindl = DataLoader(trainingdata, batch_size=60, shuffle=True)
testdl = DataLoader(testdata, batch_size=60, shuffle=True)

# Display image and label.
trainfeature, trainlabel = next(iter(traindl))
print(f"Feature Batch Shape: {trainfeature.size()}")
print(f"Label Batch Shape: {trainlabel.size()}")
imgdir = trainfeature[0].squeeze()
labels = trainlabel[0]
plot.imshow(imgdir, cmap="gray")
plot.show()
print(f"Labels: {labels}")

Output:

In the following output, we can see that the DataLoader loads the data stored in the directory and that the batch shapes, a sample image, and its label are displayed on the screen.

PyTorch dataloader from directory

PyTorch dataloader train test split

In this section, we will learn how to split the data into train and test sets and load each of them with a dataloader in Python.

A train test split divides the dataset into two parts: the model is trained on one part and evaluated on the other. This shows how well the model performs on data that it has not seen during training.
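
The idea can be sketched in a few lines with random_split. The snippet below uses a small synthetic dataset and an 80/20 split, both of which are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for a real dataset (assumption): 100 samples, 5 features.
full_dataset = TensorDataset(torch.randn(100, 5), torch.randint(0, 2, (100,)))

# A seeded generator makes the split reproducible between runs.
generator = torch.Generator().manual_seed(0)
train_subset, test_subset = random_split(full_dataset, [80, 20], generator=generator)

train_loader = DataLoader(train_subset, batch_size=16, shuffle=True)
test_loader = DataLoader(test_subset, batch_size=16, shuffle=False)

# The two subsets share no indices, so no sample leaks from train into test.
overlap = set(train_subset.indices) & set(test_subset.indices)
print(len(train_subset), len(test_subset), len(overlap))
```

Shuffling is enabled only for the training loader here; the order of the test samples does not affect the evaluation.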

Code:

In the following code, we will import some libraries from which the dataloader can split the data into train and test.

  • transforms.Compose() is used to declare the transform that converts the raw data to tensors.
  • traindata,testdata = random_split(traindata,[50000,10000]) is used to split the data into train and test sets.
  • train_loader = DataLoader(traindata, batch_size=32) is used to create a dataloader with a batch size of 32.
  • optimizers = torch.optim.SGD(models.parameters(), lr = 0.01) is used to initialize the optimizer.
  • print(f'Epoch {i+1} \t\t Training data: {trainloss / len(train_loader)} \t\t Test data: {testloss / len(test_loader)}') is used to print the average training and test loss of each epoch.
  • torch.save(models.state_dict(), 'saved_model.pth') is used to save the state dict.

import torch
from torch import nn
import torch.nn.functional as fun
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, random_split
import numpy as num
 

transform = transforms.Compose([
                                 transforms.ToTensor()
])
 

traindata = datasets.MNIST('', train = True, transform = transform, download = True)
traindata,testdata = random_split(traindata,[50000,10000])

train_loader = DataLoader(traindata, batch_size=32)
test_loader = DataLoader(testdata, batch_size=32)
 
# Building our model
class network(nn.Module):

    def __init__(self):
        super(network,self).__init__()
        self.fc = nn.Linear(28*28, 256)
        self.fc1 = nn.Linear(256, 128)
        self.fc2 = nn.Linear(128, 10)
 

    def forward(self, y):
        y = y.view(y.shape[0],-1)   
        y = fun.relu(self.fc(y))
        y = fun.relu(self.fc1(y))
        y = self.fc2(y)
        return y
 
models = network()
if torch.cuda.is_available():
    models = models.cuda()
 
criterions = nn.CrossEntropyLoss()
optimizers = torch.optim.SGD(models.parameters(), lr = 0.01)
 
epoch = 7
minvalid_loss = num.inf

for i in range(epoch):
    trainloss = 0.0
    models.train()     
    for data, label in train_loader:
        if torch.cuda.is_available():
            data, label = data.cuda(), label.cuda()
        
        optimizers.zero_grad()
        targets = models(data)
        loss = criterions(targets,label)
        loss.backward()
        optimizers.step()
        trainloss += loss.item()
    
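    testloss = 0.0
    models.eval()
    # Gradients are not needed during evaluation.
    with torch.no_grad():
        for data, label in test_loader:
            if torch.cuda.is_available():
                data, label = data.cuda(), label.cuda()

            targets = models(data)
            loss = criterions(targets,label)
            # Sum the per-batch losses so that the average over all batches can be printed.
            testloss += loss.item()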

    print(f'Epoch {i+1} \t\t Training data: {trainloss / len(train_loader)} \t\t Test data: {testloss / len(test_loader)}')
    if minvalid_loss > testloss:
        print(f'Test data Decreased({minvalid_loss:.6f}--->{testloss:.6f}) \t Saving The Model')
        minvalid_loss = testloss

        torch.save(models.state_dict(), 'saved_model.pth')

Output:

In the following output, we can see that the training and test loss of the split data are printed on the screen for each epoch.

PyTorch dataloader train test split

PyTorch dataloader for text

In this section, we will learn about how the PyTorch dataloader works for text in python.

The DataLoader wraps a dataset and provides an iterable over it. The dataset stores the text samples and their labels, and the DataLoader groups them into batches.
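
Text samples usually differ in length, so they cannot be stacked into one tensor directly. One common way around this, shown here as a hedged sketch, is a custom collate_fn that pads every sequence in a batch to the same length. The token ids, the labels, and the helper name pad_collate are made up for illustration.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Hypothetical, already-tokenized samples: (label, token ids) pairs of varying length.
samples = [
    (0, [4, 8, 15]),
    (1, [16, 23]),
    (0, [42]),
    (1, [7, 7, 7, 7]),
]


def pad_collate(batch):
    """Pad every sequence in the batch to the length of the longest one."""
    labels = torch.tensor([label for label, _ in batch])
    sequences = [torch.tensor(tokens) for _, tokens in batch]
    lengths = torch.tensor([len(seq) for seq in sequences])
    padded = pad_sequence(sequences, batch_first=True, padding_value=0)
    return labels, padded, lengths


# A plain list works as a map-style dataset because it supports indexing and len().
text_loader = DataLoader(samples, batch_size=2, shuffle=False, collate_fn=pad_collate)
batches = list(text_loader)
for labels, padded, lengths in batches:
    print(labels, padded.shape, lengths)
```

Returning the original lengths alongside the padded tensor lets a model ignore the padding positions later on.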

Code:

In the following code, we will import the torch module for loading the text from the dataloader.

  • trainiteration = AG_NEWS(split='train') is used to load the training split of the AG_NEWS dataset.
  • print(labels, lines) is used to print the label and the text of each sample.
  • dataloaders = DataLoader(trainiteration, batch_size=5, shuffle=False) is used to load the data in batches of five samples.

from torchtext.datasets import AG_NEWS
trainiteration = AG_NEWS(split='train')
# Iterate with for loop
for (labels, lines) in trainiteration:
       print(labels, lines)
# send to DataLoader
from torch.utils.data import DataLoader
trainiteration = AG_NEWS(split='train')
dataloaders = DataLoader(trainiteration, batch_size=5, shuffle=False)

Output:

After running the above code we get the following output in which we can see that the PyTorch dataloader for text data is printed on the screen.

PyTorch dataloader for text

PyTorch dataloader Cuda

In this section, we will learn how to use the PyTorch dataloader together with CUDA in Python.

Before moving forward, we should know what CUDA is. CUDA is NVIDIA's parallel computing platform and programming interface, which lets software run general-purpose computations on NVIDIA GPUs.
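
A common pattern, sketched below under stated assumptions, is to keep the dataset in host memory and move each batch to the device inside the loop. The synthetic data and the variable names are illustrative, and the code falls back to the CPU when no GPU is present.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic data (assumption) standing in for a real dataset.
gpu_demo_dataset = TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))

# Pinned host memory only matters when copying to a CUDA device.
gpu_demo_loader = DataLoader(
    gpu_demo_dataset,
    batch_size=8,
    pin_memory=torch.cuda.is_available(),
)

for inputs, targets in gpu_demo_loader:
    # non_blocking=True lets the copy overlap with computation when memory is pinned.
    inputs = inputs.to(device, non_blocking=True)
    targets = targets.to(device, non_blocking=True)

print(inputs.device, targets.device)
```

Moving one batch at a time keeps GPU memory usage bounded by the batch size rather than by the size of the whole dataset.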

Code:

In the following code, we will import a torch module from which we can load the data through the dataloader.

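  • traindata = datasets.MNIST() is used to create the training dataset.
  • trainloader = DataLoader() is used to load the training data.
  • traindata.data.to(device) is used to copy the whole image tensor to the GPU. Because .to() returns a new tensor, the result has to be assigned to a variable.

import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms
batchsize = 60
# Use a separate name so that the torchvision.transforms module is not shadowed.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1304,), (0.3080,))
])
# download=False expects the dataset to already exist under root.
traindata = datasets.MNIST(
    root='./dataset/minst/',
    train=True,
    download=False,
    transform=transform
)
trainloader = DataLoader(
    dataset=traindata,
    shuffle=True,
    batch_size=batchsize
)
# Fall back to the CPU when no GPU is available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# .to() does not move a tensor in place; it returns a copy on the target device.
traindata_gpu = traindata.data.to(device)
trainlabels_gpu = traindata.targets.to(device)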

Output:

After running the above code, the image and label tensors of the dataset are placed on the GPU. The dataloader itself still yields batches from host memory, so each batch has to be moved to the device inside the training loop.

PyTorch dataloader Cuda

PyTorch dataloader num_workers

In this section, we will learn about the PyTorch dataloader num_workers in python.

The num_workers parameter sets the number of subprocesses that load the data and assemble the batches in parallel. With the default value of 0, the data is loaded in the main process.
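
The following sketch compares loading in the main process with loading in two worker processes. The synthetic dataset and the helper name collect_labels are assumptions made for illustration; the main guard is included because platforms that start workers with "spawn" re-import the script in every worker.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def collect_labels(num_workers):
    """Iterate once over a small synthetic dataset and return the labels in order."""
    # Synthetic data (assumption): 20 samples whose label equals their index.
    dataset = TensorDataset(torch.randn(20, 4), torch.arange(20))
    loader = DataLoader(dataset, batch_size=5, shuffle=False, num_workers=num_workers)
    seen = []
    for _, labels in loader:
        seen.extend(labels.tolist())
    return seen


if __name__ == "__main__":
    # The guard matters on platforms that start workers with "spawn"
    # (Windows, macOS): each worker re-imports this module.
    in_main_process = collect_labels(num_workers=0)
    with_two_workers = collect_labels(num_workers=2)
    # Without shuffling, the batch order is the same however many workers load the data.
    print(in_main_process == with_two_workers)
```

More workers do not always mean faster loading; for small in-memory datasets like this one, the cost of starting the processes can outweigh the benefit.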

Code:

In the following code, we will import some modules and use the dataloader with num_workers to create batches.

  • transforms.Compose() is used to define the transform that normalizes the data.
  • train_loader = torch.utils.data.DataLoader(train_set, batch_size=60, shuffle=True) is used to load the training data.
  • datasets=SampleDataset(2,440) is used to create the sample dataset.
  • dloader = DataLoader(datasets,batch_size=10, shuffle=True, num_workers=4 ) is used to load the batches with four worker processes.
  • print(x, batch) is used to print the batch index and the batch.

import torch
import matplotlib.pyplot as plot
from torchvision import datasets, transforms

transforms = transforms.Compose([transforms.ToTensor(),
                              transforms.Normalize((0.5,), (0.3,)),
                              ])

train_set = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transforms)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=60, shuffle=True)
from torch.utils.data import Dataset
import random
 
class SampleDataset(Dataset):
  def __init__(self,r,r1):
    random_list=[]
    for x in range(2,999):
      m = random.randint(r,r1)
      random_list.append(m)
    self.samples=random_list
 
  def __len__(self):
      return len(self.samples)
 
  def __getitem__(self,idx):
      return(self.samples[idx])
 
datasets=SampleDataset(2,440)
datasets[90:100]
from torch.utils.data import DataLoader
dloader = DataLoader(datasets,batch_size=10, shuffle=True, num_workers=4 )
for x, batch in enumerate(dloader):
        print(x, batch)

Output:

After running the above code, we get the following output in which we can see that the batches loaded by the worker processes are printed on the screen.

PyTorch dataloader num_worker

PyTorch dataloader add dimension

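In this section, we will learn how the PyTorch dataloader adds a dimension in Python.

The dataloader stacks the individual samples into a batch, which adds a batch dimension in front of the sample dimensions. In the example below, a raw MNIST image has the shape [28, 28], the ToTensor() transform adds a channel dimension, and a batch of 150 images therefore has the shape [150, 1, 28, 28].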
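
The same effect can be reproduced without downloading anything. In this sketch, random tensors stand in for the images (an assumption), and unsqueeze(0) shows how the leading axis can be added to a single sample by hand.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic grayscale "images" (assumption): 6 samples of shape (1, 28, 28),
# the channel-first layout that transforms.ToTensor() produces for MNIST.
images = torch.rand(6, 1, 28, 28)
image_labels = torch.zeros(6, dtype=torch.long)
image_dataset = TensorDataset(images, image_labels)

single_image, _ = image_dataset[0]
image_batch, _ = next(iter(DataLoader(image_dataset, batch_size=3)))

print(single_image.shape)  # one sample: channel, height, width
print(image_batch.shape)   # the loader stacks samples along a new first axis

# unsqueeze(0) adds the same leading axis to a single sample by hand.
manual_batch = single_image.unsqueeze(0)
print(manual_batch.shape)
```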

Code:

In the following code, we will import the torch module from which we can add a dimension.

  • mnisttrain_data = MNIST(root='./data', train=True, download=True, transform=transforms.Compose([transforms.ToTensor()])) is used to load the data from the MNIST dataset.
  • print(y.shape) is used to print the shape of y.
  • trainloader_data = torch.utils.data.DataLoader(mnisttrain_data, batch_size=150) is used to load the train data.
  • batch_y, batch_z = next(iter(trainloader_data)) is used to get the first batch.
  • print(batch_y.shape) is used to print the shape of the batch.

import torch
from torchvision.datasets import MNIST
from torchvision import transforms

if __name__ == '__main__':
    mnisttrain_data = MNIST(root='./data', train=True, download=True, transform=transforms.Compose([transforms.ToTensor()]))
    y = mnisttrain_data.data[1]
    print(y.shape)  

    trainloader_data = torch.utils.data.DataLoader(mnisttrain_data, batch_size=150)
    batch_y, batch_z = next(iter(trainloader_data))  
    print(batch_y.shape)

Output:

After running the above code, we get the following output in which we can see that the dimension will be added.

PyTorch dataloader add dimension

PyTorch dataloader access dataset

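In this section, we will learn how to access a dataset with the help of a dataloader in Python.

A Dataset stores the samples and their labels, and the DataLoader wraps an iterable around the Dataset to give easy access to the samples. A single sample can also be read directly by indexing the Dataset, as the code below does.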
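
A dataloader also exposes the dataset it was built from through its dataset attribute. The short sketch below illustrates this with synthetic data; the variable names are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic data (assumption): 12 samples with 2 features each.
source_dataset = TensorDataset(torch.arange(24.0).reshape(12, 2), torch.arange(12))
access_loader = DataLoader(source_dataset, batch_size=4)

# The loader keeps a reference to the dataset it was built from.
same_object = access_loader.dataset is source_dataset

# Indexing the dataset directly returns one (features, label) sample, no batching.
sample_features, sample_label = access_loader.dataset[5]
print(same_object, len(access_loader.dataset), sample_features, sample_label)
```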

Code:

In the following code, we will import the torch module from which the dataloader can access the datasets.

  • figure_data = plot.figure(figsize=(9,9)) is used to create the figure.
  • sample_index = torch.randint(len(traindata), size=(1,)).item() is used to pick a random sample index.
  • figure_data.add_subplot(rows, columns, x) is used to add a subplot.
  • plot.title(labelmap_dt[labels]) is used to set the title of the subplot.
  • plot.imshow(image.squeeze(), cmap="gray") is used to show the image on the screen.

import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plot


traindata = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

testdata = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)
# Class names of FashionMNIST, indexed by label.
labelmap_dt = {
    0: "T-Shirt",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle Boot",
}
figure_data = plot.figure(figsize=(9,9))
columns, rows = 4, 4
for x in range(1, columns * rows + 1):
    sample_index = torch.randint(len(traindata), size=(1,)).item()
    image, labels = traindata[sample_index]
    figure_data.add_subplot(rows, columns, x)
    plot.title(labelmap_dt[labels])
    plot.axis("off")
    plot.imshow(image.squeeze(), cmap="gray")
plot.show()

Output:

The following output shows that the samples of the dataset can be accessed and that the images and their labels are plotted on the screen.

PyTorch dataloader access dataset

PyTorch dataloader batch size

In this section, we will learn how to set the batch size of a dataloader in Python.

The batch size is the number of samples processed before the model is updated. It can range from 1 up to the total number of samples in the training data.
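
The relation between the dataset size, the batch size, and the number of batches can be checked with a little arithmetic. The sketch below assumes a synthetic dataset of 1001 samples and a batch size of 15.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

num_samples, batch_size = 1001, 15
size_demo_dataset = TensorDataset(torch.randn(num_samples, 1))
size_demo_loader = DataLoader(size_demo_dataset, batch_size=batch_size)

# With drop_last=False (the default) the number of batches is ceil(N / batch_size).
expected_batches = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (expected_batches - 1) * batch_size

batch_lengths = [len(batch[0]) for batch in size_demo_loader]
print(len(size_demo_loader), expected_batches, batch_lengths[-1], last_batch_size)
```

Here 1001 = 66 * 15 + 11, so the loader yields 66 full batches followed by one batch of 11 samples.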

Code:

In the following code, we will import the torch module from which we can process the number of samples before the model is updated.

  • datasets = impdataset(1001) is used to create a dataset with 1001 samples.
  • dataloader = DataLoader(datasets,batch_size=15) is used to load the data with a batch size of 15.
  • len(dataloader) returns the number of batches.
  • print('batch index{}, batch length {}'.format(batchindex, len(data))) is used to print the batch index and the batch length on the screen.

import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
from torch.utils.data import DataLoader
class impdataset(Dataset):
    def __init__(self, siz):
        self.y = torch.randn(siz, 1)
    
    def __getitem__(self, idx):
        return self.y[idx]

    def __len__(self):
        return len(self.y)

datasets = impdataset(1001)

dataloader = DataLoader(datasets,batch_size=15)

# Number of batches: 67 (66 full batches of 15 samples and one batch of 11).
print(len(dataloader))

for batchindex, data in enumerate(dataloader):
    print('batch index{}, batch length {}'.format(batchindex, len(data)))

Output:

After running the above code, we get the following output in which we can see that the dataloader batch size is printed on the screen.

PyTorch dataloader batch size

PyTorch dataloader epoch

In this section, we will learn how the PyTorch dataloader is used across epochs in Python.

Before moving forward, we should know what an epoch is. An epoch is one complete pass over the entire training dataset; in each epoch the dataloader is iterated from its first batch to its last.
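
The sketch below makes this concrete with a tiny synthetic dataset (an assumption): every epoch visits each sample exactly once, and with shuffle=True the order is drawn anew for each epoch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic data (assumption): 8 samples identified by the integers 0..7.
epoch_dataset = TensorDataset(torch.arange(8))
epoch_loader = DataLoader(epoch_dataset, batch_size=3, shuffle=True)

num_epochs = 3
orders = []
for epoch in range(num_epochs):
    seen_this_epoch = []
    # One full pass over the loader is one epoch.
    for (ids,) in epoch_loader:
        seen_this_epoch.extend(ids.tolist())
    orders.append(seen_this_epoch)
    print(f"epoch {epoch + 1}: {seen_this_epoch}")
```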

Code:

In the following code, we will import the torch module and iterate over the dataloader once per epoch.

  • traindt = datasets.MNIST('', train = True, transform = transform, download = True) is used to load the training data.
  • train_loader = DataLoader(traindt, batch_size=30) is used as the train loader.
  • test_loader = DataLoader(testdt, batch_size=30) is used as the test loader.
  • optimizers = torch.optim.SGD(models.parameters(), lr = 0.01) is used to initialize the optimizer.
  • optimizers.zero_grad() is used to reset the gradients to zero before each backward pass.
  • print(f'Epoch {x+1} \t\t Training data: {trainloss / len(train_loader)} \t\t Test data: {testloss / len(test_loader)}') is used to print the average training and test loss on the screen.

import torch
from torch import nn
import torch.nn.functional as func
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, random_split
import numpy as num
 

transform = transforms.Compose([
                                 transforms.ToTensor()
])
 

traindt = datasets.MNIST('', train = True, transform = transform, download = True)
traindt,testdt = random_split(traindt,[50000,10000])

train_loader = DataLoader(traindt, batch_size=30)
test_loader = DataLoader(testdt, batch_size=30)
 

class network(nn.Module):

    def __init__(self):
        super(network,self).__init__()
        self.fc = nn.Linear(28*28, 254)
        self.fc1 = nn.Linear(254, 126)
        self.fc2 = nn.Linear(126, 10)
 

    def forward(self, y):
        y = y.view(y.shape[0],-1)   
        y = func.relu(self.fc(y))
        y = func.relu(self.fc1(y))
        y = self.fc2(y)
        return y
 
models = network()
if torch.cuda.is_available():
    models = models.cuda()
 
criterions = nn.CrossEntropyLoss()
optimizers = torch.optim.SGD(models.parameters(), lr = 0.01)
 
epoch = 5
minvalid_loss = num.inf

for x in range(epoch):
    trainloss = 0.0
    models.train()     
    for data, label in train_loader:
        if torch.cuda.is_available():
            data, label = data.cuda(), label.cuda()
        
        optimizers.zero_grad()
        targets = models(data)
        loss = criterions(targets,label)
        loss.backward()
        optimizers.step()
        trainloss += loss.item()
    
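    testloss = 0.0
    models.eval()
    # Gradients are not needed during evaluation.
    with torch.no_grad():
        for data, label in test_loader:
            if torch.cuda.is_available():
                data, label = data.cuda(), label.cuda()

            targets = models(data)
            loss = criterions(targets,label)
            # Sum the per-batch losses so that the average over all batches can be printed.
            testloss += loss.item()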

    print(f'Epoch {x+1} \t\t Training data: {trainloss / len(train_loader)} \t\t Test data: {testloss / len(test_loader)}')

Output:

In the following output, we can see that the training data and testing data with epoch are printed on the screen.

PyTorch dataloader epoch

PyTorch dataloader enumerate

In this section, we will learn about the PyTorch dataloader enumerate in python.

The DataLoader provides an iterable over the given dataset, and Python's built-in enumerate() pairs each batch it yields with a running index that starts at 0.

Code:

In the following code, we will import the torch module from which we can enumerate the data.

  • num = list(range(0, 90, 2)) is used to define the list.
  • data_loader = DataLoader(dataset, batch_size=12, shuffle=True) is used to wrap the dataset in a dataloader with a batch size of 12.
  • print(x, batch) is used to print the batch index and the batch, one by one.

import torch
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
  
# define the Dataset class
class dataset(Dataset):
    def __init__(self):
        num = list(range(0, 90, 2))
        self.data = num
  
    def __len__(self):
        return len(self.data)
  
    def __getitem__(self, idx):
        return self.data[idx]
  
  
dataset = dataset()
  

data_loader = DataLoader(dataset, batch_size=12, shuffle=True)
for x, batch in enumerate(data_loader):
    print(x, batch)

Output:

After running the above code, we get the following output in which we can see that the batches are printed one by one on the screen.

PyTorch dataloader enumerate

PyTorch dataloader dataframe

In this section, we will learn how to load tabular data, such as the rows and columns of a dataframe, with a dataloader in Python.

A dataframe is a two-dimensional, heterogeneous data structure with rows and columns. To feed such data to a DataLoader, it is wrapped in a Dataset that returns the features and the label of one row per index.
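
One possible way to do this with pandas is sketched below. The class name DataFrameDataset, the column names, and the values are hypothetical and only serve as an illustration; a real file would be read with pd.read_csv instead of being built in memory.

```python
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset


class DataFrameDataset(Dataset):
    """Map-style dataset backed by a pandas DataFrame (illustrative helper)."""

    def __init__(self, frame, feature_columns, label_column):
        # Convert once up front so that __getitem__ is a cheap tensor index.
        self.features = torch.tensor(frame[feature_columns].to_numpy(), dtype=torch.float32)
        self.labels = torch.tensor(frame[label_column].to_numpy(), dtype=torch.long)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


# Hypothetical columns and values, built in memory for the demonstration.
frame = pd.DataFrame({
    "age": [63, 37, 41, 56, 57],
    "chol": [233, 250, 204, 236, 354],
    "target": [1, 1, 1, 0, 0],
})

frame_dataset = DataFrameDataset(frame, ["age", "chol"], "target")
frame_loader = DataLoader(frame_dataset, batch_size=2, shuffle=False)
first_features, first_labels = next(iter(frame_loader))
print(first_features, first_labels)
```

Converting the columns to tensors in the constructor avoids repeated pandas indexing for every sample, which tends to be slow.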

Code:

In the following code, we will import some libraries from which we can load the data.

  • dt = num.loadtxt('heart.csv', delimiter=',', dtype=num.float32, skiprows=1) is used to load the CSV file from the path.
  • self.a = torch.from_numpy(dt[:, :11]) takes the first 11 columns as features, and self.b = torch.from_numpy(dt[:, [11]]) takes the column at index 11 as the class label.
  • seconddata = dataset[1] is used to get the second sample, which is then unpacked into features and label.

import torch
from torch.utils.data import Dataset, DataLoader
import numpy as num


class HeartDataset(Dataset):

    def __init__(self):
        # Load the CSV file, skipping the header row.
        dt = num.loadtxt('heart.csv', delimiter=',',
                           dtype=num.float32, skiprows=1)

        # The first 11 columns are the features; the column at index 11 is the label.
        self.a = torch.from_numpy(dt[:, :11])
        self.b = torch.from_numpy(dt[:, [11]])
        self.nsample = dt.shape[0]

    def __getitem__(self, idx):
        return self.a[idx], self.b[idx]

    # we can call len(dataset) to return the size
    def __len__(self):
        return self.nsample


dataset = HeartDataset()
seconddata = dataset[1]
feturs, lbs = seconddata
print(feturs, lbs)

# Wrap the dataset in a DataLoader to iterate over it in batches.
loader = DataLoader(dataset, batch_size=4, shuffle=True)

Output:

After running the above code, we get the following output in which we can see that the features and the label of the second sample are printed on the screen.

Pytorch dataloader dataframe

PyTorch dataloader batch sampler

In this section, we will learn about the PyTorch dataloader batch sampler in python.

The DataLoader uses a sampler internally to produce the indices of the samples. The batch sampler wraps that sampler and groups the indices into batches, yielding the list of indices of one batch at a time.
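
The two classes can also be used on their own. The following sketch builds a BatchSampler by hand and passes it to a dataloader; the list of values is an assumption made for illustration.

```python
from torch.utils.data import BatchSampler, DataLoader, SequentialSampler

# Any sized, indexable object works as a map-style dataset; a plain list is enough here.
values = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

# SequentialSampler yields 0, 1, 2, ...; BatchSampler groups those indices into lists.
index_batches = list(BatchSampler(SequentialSampler(values), batch_size=3, drop_last=False))
print(index_batches)

# Passing batch_sampler replaces batch_size, shuffle, sampler and drop_last.
custom_loader = DataLoader(
    values,
    batch_sampler=BatchSampler(SequentialSampler(values), batch_size=3, drop_last=False),
)
value_batches = [batch.tolist() for batch in custom_loader]
print(value_batches)
```

The batch sampler deals only in indices; the dataloader then looks up the samples at those indices and collates them into a batch.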

Code:

In the following code, we will import the torch module from which we can get the indices of each batch.

  • data_set = batchsamplerdataset(xdata, ydata) is used to define the dataset.
  • batches_sampler = DataLoader(data_set).sampler is used to get the default (sequential) sampler of the dataloader.
  • batchsize = 2 is used to set the number of indices in each batch.
  • defaultbatch_sampler = DataLoader(data_set, batch_size=batchsize).batch_sampler is used to get the batch sampler of the dataloader.
  • print(f'Batch #{i} indices: ', batch_indices) is used to print the batch number and the batch indices.

import torch
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
xdata = list(range(12))
# ydata must have the same length as xdata so that every index is valid.
ydata = list(range(12,24))
class batchsamplerdataset(Dataset):
    def __init__(self, xdata, ydata):
        self.xdata = xdata
        self.ydata = ydata
    
    def __getitem__(self, n):
        return self.xdata[n], self.ydata[n]
    
    def __len__(self):
        return len(self.xdata)
data_set = batchsamplerdataset(xdata, ydata)
print(data_set[2])
# Iterating over the default SequentialSampler yields the indices 0, 1, 2, ...
batches_sampler = DataLoader(data_set).sampler
for x in batches_sampler:
    print(x)

batchsize = 2
defaultbatch_sampler = DataLoader(data_set, batch_size=batchsize).batch_sampler
for i, batch_indices in enumerate(defaultbatch_sampler):
    print(f'Batch #{i} indices: ', batch_indices)

Output:

In the following output, we can see that the PyTorch dataloader batch sampler is printed on the screen.

Pytorch dataloader batch sampler

So, in this tutorial, we discussed PyTorch dataloader and we have also covered different examples related to its implementation. Here is the list of examples that we have covered.

  • PyTorch dataloader
  • PyTorch dataloader example
  • PyTorch dataloader from the directory
  • PyTorch dataloader train test split
  • PyTorch dataloader for text
  • PyTorch dataloader Cuda
  • PyTorch dataloader num_workers
  • PyTorch dataloader add dimension
  • PyTorch dataloader access dataset
  • PyTorch dataloader batch size
  • PyTorch dataloader epoch
  • PyTorch dataloader enumerate
  • PyTorch dataloader dataframe
  • PyTorch dataloader batch sampler