How to Subset a DataFrame in Python

In this Python tutorial, we will learn how to subset a pandas DataFrame in Python. We will look at several approaches: the loc and iloc indexers, the indexing operator, and filtering rows with a condition.

While working on a Python project, I needed to subset a DataFrame, which prompted this tutorial.

Here we will see:

  • How to Subset a DataFrame in Python using loc()
  • How to Subset a DataFrame in Python using iloc()
  • How to Subset a DataFrame in Python using an indexing operator
  • Select rows where the student_age is equal to or greater than 15

How to Subset a DataFrame in Python

In pandas, three approaches are commonly used to subset a DataFrame: the loc indexer, the iloc indexer, and the indexing operator. A final section shows how to select rows that satisfy a condition.

How to Subset a DataFrame in Python using loc()

  • In this section, we will discuss how to subset a DataFrame in pandas using loc.
  • Subsetting is the process of selecting a desired set of rows and columns from a DataFrame.
  • With loc, we can create a subset of a DataFrame based on particular rows, particular columns, or both.
  • loc selects by label, so we must pass it the labels of the rows or columns we want. Although it is often written as loc(), it is an indexer used with square brackets, as in df.loc[...], rather than a function that is called.
  • In this example, we will first create a DataFrame using the pd.DataFrame() constructor.

Note: We must first create a DataFrame before we can create subsets of it, so let’s do that first.

import pandas as pd

# Sample student records used throughout this tutorial
Student_info = {'Student_id': [672, 345, 678, 123, 783],
                'Student_name': ['John', 'James', 'Potter', 'George', 'Micheal'],
                'Student_age': [17, 15, 14, 12, 11]}

df = pd.DataFrame(Student_info, columns=['Student_id', 'Student_name', 'Student_age'])
print(df)

Here is a screenshot of the output of the above code.

How to create a student dataframe in Python Pandas

In this case, the DataFrame was created with the pandas DataFrame() constructor.

By providing the row label and the column labels, loc can also be used to change the values stored in a row for the chosen columns.

Syntax:

Here is the syntax for assigning a value with loc in pandas

dataframe.loc[row_label, ['column_name']] = value
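As a minimal sketch of the assignment syntax above, the snippet below rebuilds the student DataFrame so it runs on its own and then overwrites one value. The row label 1 and the new age 16 are arbitrary choices made for illustration only.

```python
import pandas as pd

# Same sample data as in this tutorial, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    'Student_id': [672, 345, 678, 123, 783],
    'Student_name': ['John', 'James', 'Potter', 'George', 'Micheal'],
    'Student_age': [17, 15, 14, 12, 11],
})

# Overwrite Student_age in the row labelled 1 (16 is an illustrative value)
df.loc[1, ['Student_age']] = 16

# Only the targeted cell changes; the other rows keep their ages
print(df.loc[1])
```

The assignment modifies df in place, so it is worth working on a copy (df.copy()) if the original values are still needed.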

Example:

Let’s take an example that uses loc to select the rows labelled 0, 1, and 3.

Source Code:

# Select the rows whose index labels are 0, 1 and 3 (all columns)
result = df.loc[[0, 1, 3]]
print(result)

You can refer to the screenshot below.

How to Subset a DataFrame in Python using loc

This is how to Subset a DataFrame in Python using loc().
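Since loc can take row labels and column labels together, a short sketch of that combined form may help. It assumes the same student data as above (rebuilt here so the snippet runs on its own); the variable names subset and sliced are only illustrative.

```python
import pandas as pd

# Same sample data as in this tutorial
df = pd.DataFrame({
    'Student_id': [672, 345, 678, 123, 783],
    'Student_name': ['John', 'James', 'Potter', 'George', 'Micheal'],
    'Student_age': [17, 15, 14, 12, 11],
})

# Rows labelled 0, 1 and 3, restricted to two columns
subset = df.loc[[0, 1, 3], ['Student_name', 'Student_age']]
print(subset)

# Label slices in loc include both endpoints: 1:3 returns rows 1, 2 and 3
sliced = df.loc[1:3, 'Student_name']
print(sliced)
```

Note that the label slice 1:3 returns three rows, which differs from ordinary Python slicing where the end is excluded.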


How to Subset a DataFrame in Python using iloc()

  • Now let us understand how to subset a DataFrame in pandas using iloc.
  • iloc lets us build subsets by selecting rows and columns according to their integer positions.
  • In other words, iloc works with zero-based positions, whereas loc works with labels. By passing the position numbers of the rows and columns, we can select a subset of a DataFrame. Like loc, iloc is an indexer used with square brackets rather than a function that is called.
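The difference between labels and positions is easiest to see when the index is not the default 0, 1, 2, and so on. The sketch below assumes the student data from earlier and, purely for illustration, uses Student_id as the index; the names indexed, by_label, by_position, and first_two are hypothetical.

```python
import pandas as pd

# Same sample data as in this tutorial
df = pd.DataFrame({
    'Student_id': [672, 345, 678, 123, 783],
    'Student_name': ['John', 'James', 'Potter', 'George', 'Micheal'],
    'Student_age': [17, 15, 14, 12, 11],
})

# Use Student_id as the row labels so labels and positions differ
indexed = df.set_index('Student_id')

by_label = indexed.loc[672]      # the row whose label is 672
by_position = indexed.iloc[0]    # the row at position 0 (the first row)

# Both expressions return the same row here
print(by_label.equals(by_position))

# Position slices in iloc exclude the end: 0:2 returns positions 0 and 1
first_two = indexed.iloc[0:2]
print(first_two)
```

With this index, indexed.loc[0] would raise a KeyError because no row carries the label 0, while indexed.iloc[0] still returns the first row.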

Example:

Let’s take an example that uses iloc to select the rows at positions 0, 1, and 3 and the columns at positions 0 and 2 (Student_id and Student_age).

Source Code:

# Rows at positions 0, 1, 3 and columns at positions 0, 2
result = df.iloc[[0, 1, 3], [0, 2]]
print(result)

Here is the output of the above code.

How to Subset a DataFrame in Python using iloc

As the screenshot shows, this is how to subset a DataFrame in pandas using iloc.


How to Subset a DataFrame in Python using Indexing operator

  • In this section, we will discuss how to subset a DataFrame in pandas using the indexing operator.
  • We can quickly build a subset of the data by using the indexing operator, that is, square brackets.
  • In Python generally, indexing means accessing specific elements of a sequence by their position. On a pandas DataFrame, the indexing operator selects columns by name: a single column name returns a Series, while a list of column names returns a DataFrame containing only those columns.
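To make the behaviour of the square brackets concrete, here is a small sketch using the same student data (rebuilt so the snippet runs on its own). The variable names are only illustrative.

```python
import pandas as pd

# Same sample data as in this tutorial
df = pd.DataFrame({
    'Student_id': [672, 345, 678, 123, 783],
    'Student_name': ['John', 'James', 'Potter', 'George', 'Micheal'],
    'Student_age': [17, 15, 14, 12, 11],
})

# A single column name returns a Series
names = df['Student_name']
print(type(names))

# A list of column names returns a DataFrame, even with one name in the list
names_frame = df[['Student_name']]
print(type(names_frame))

# A slice inside the brackets selects rows by position instead of columns
first_rows = df[0:2]
print(first_rows)
```

Because a slice selects rows while a name selects columns, loc and iloc are usually the clearer choice whenever rows and columns are both involved.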

Example:

Here we will take an example that uses the indexing operator to select the Student_id and Student_name columns.

Source Code:

# Keep only the Student_id and Student_name columns
print(df[['Student_id', 'Student_name']])

You can refer to the screenshot below.

How to Subset a DataFrame in Python using Indexing operator

In this example, we have seen how to subset a DataFrame in Python using the indexing operator.


Select rows where the student_age is equal to or greater than 15

  • In this section, we will discuss how to select rows where the student age, stored in the Student_age column, is equal to or greater than 15.
  • To get all the rows where Student_age is equal to or greater than 15, we will pass a condition to loc. The comparison df['Student_age'] >= 15 produces a Series of True and False values, one per row, and loc keeps only the rows where the value is True.

Example:

Let’s take an example and check how to select rows where the student_age is equal to or greater than 15.

Source Code:

# Keep only the rows where Student_age is 15 or more
new_result = df.loc[df['Student_age'] >= 15]
print(new_result)

Here is the output of the above code.

Select rows where the student_age is equal or greater than 15

This is how to select rows where the student_age is equal to or greater than 15.
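It can help to look at the intermediate True/False values that drive this kind of selection. The sketch below assumes the same student data and also shows, as an illustrative extension, how two conditions can be combined; the 12-to-14 age range and the variable names are arbitrary choices for the example.

```python
import pandas as pd

# Same sample data as in this tutorial
df = pd.DataFrame({
    'Student_id': [672, 345, 678, 123, 783],
    'Student_name': ['John', 'James', 'Potter', 'George', 'Micheal'],
    'Student_age': [17, 15, 14, 12, 11],
})

# The comparison yields one True/False value per row
mask = df['Student_age'] >= 15
print(mask.tolist())  # [True, True, False, False, False]

# loc keeps the rows where the mask is True
older = df.loc[mask]
print(older)

# Combine conditions with & (and) or | (or); each condition needs parentheses
younger = df.loc[(df['Student_age'] >= 12) & (df['Student_age'] < 15)]
print(younger)
```

The parentheses matter because & and | bind more tightly than the comparison operators in Python.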


In this article, we have discussed how to subset a DataFrame in Python. We covered the following topics:

  • How to Subset a DataFrame in Python using loc()
  • How to Subset a DataFrame in Python using iloc()
  • How to Subset a DataFrame in Python using an indexing operator
  • Select rows where the student_age is equal to or greater than 15