Python Pandas Drop Rows Example

When working with a dataset in Python, engineers clean it according to the requirements of the project. The drop() function is often used to remove rows and columns that are not useful for the project. The dataset used in this tutorial was downloaded from Kaggle. In this tutorial, we will learn how to drop rows in Python pandas. We will cover these topics:

  • Python Pandas Drop Function
  • Python pandas drop rows by index
  • Python pandas drop rows by condition
  • Python pandas drop rows with nan in specific column
  • Python pandas drop rows with nan
  • Python pandas drop rows based on column value
  • Python pandas drop rows containing string

Python Pandas Drop Function

drop() is a pandas function used to remove rows or columns from a DataFrame. It is often used in data cleaning. axis=0 refers to rows and axis=1 refers to columns.

Syntax:

Here is the syntax of the pandas drop() function:

DataFrame.drop(
    labels=None, 
    axis=0, 
    index=None, 
    columns=None, 
    level=None, 
    inplace=False, 
    errors='raise'
)
Options and their explanations:

  • labels: single label or list-like. The index or column labels to drop.
  • axis: the axis to drop from, either 0 or 1. axis=0 refers to rows (the index) and axis=1 refers to columns. The default is axis=0.
  • index: single label or list-like. Row labels to drop; passing index=labels is equivalent to passing labels with axis=0.
  • columns: single label or list-like. Column labels to drop; passing columns=labels is equivalent to passing labels with axis=1.
  • level: int or level name, optional. For a MultiIndex, the level from which the labels will be removed.
  • inplace: bool (True or False), default False. If True, the DataFrame is modified directly, so the result does not need to be assigned to a variable.
  • errors: either 'ignore' or 'raise', default 'raise'. With 'ignore', the error is suppressed and only existing labels are dropped. With 'raise', an error is raised if a label is not found and no data is dropped.
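
The options above can be sketched on a small, made-up DataFrame. The column names and values here are illustrative assumptions, not the Kaggle dataset used in this tutorial.

```python
import pandas as pd

# Hypothetical sample data; not the Kaggle dataset from the tutorial.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara"],
        "age": [28, 35, 41],
        "city": ["Oslo", "Pune", "Lima"],
    }
)

# Drop a row: labels with axis=0 is equivalent to index=...
rows_a = df.drop(labels=1, axis=0)
rows_b = df.drop(index=1)

# Drop a column: labels with axis=1 is equivalent to columns=...
cols_a = df.drop(labels="city", axis=1)
cols_b = df.drop(columns="city")

# errors='ignore' skips labels that do not exist instead of raising KeyError.
safe = df.drop(index=[1, 99], errors="ignore")

# inplace=True modifies the DataFrame directly and returns None.
df_copy = df.copy()
result = df_copy.drop(index=0, inplace=True)
```

Without inplace=True, drop() returns a new DataFrame and leaves df unchanged, so the result has to be assigned to a variable to be kept.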

Also read, How to use Pandas drop() function in Python

Python pandas drop rows by index

  • In this section, we will learn how to drop rows by index in Python pandas. To remove rows by index, pass the index label, or a list of index labels to drop several rows at once.
  • To drop rows by index, use df.drop(index). Here df is the DataFrame you are working on, and index is the index number or name of the row to remove.
  • Here is the implementation in a Jupyter notebook. Read the comments and markdown for a step-by-step explanation.
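
A minimal sketch of dropping rows by index, assuming a small made-up DataFrame (the product and price columns are hypothetical and only serve the illustration):

```python
import pandas as pd

# Hypothetical sample data with the default integer index 0..3.
df = pd.DataFrame(
    {"product": ["pen", "ink", "pad", "clip"], "price": [1.5, 4.0, 2.25, 0.5]}
)

# Drop a single row by its index label.
one_dropped = df.drop(2)

# Drop several rows by passing a list of index labels.
many_dropped = df.drop([0, 3])

# Drop by position: look up the label at that position first.
last_dropped = df.drop(df.index[-1])

# With a named index, pass the name instead of a number.
named = df.set_index("product")
named_dropped = named.drop("ink")
```

Note that drop() works with index labels, not positions. The two coincide only while the DataFrame still has its default 0, 1, 2, ... index.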

Python pandas drop rows by condition

In this section, we will learn how to drop rows by condition in Python pandas. Any number of conditions can be applied, depending on the project.

Here is the implementation of dropping rows by condition in a Jupyter notebook. Read the comments and markdown to understand it better.
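
One way to drop rows by condition is sketched below, assuming a hypothetical DataFrame with age and score columns. It shows two equivalent styles: keeping the rows where the condition is false, and passing the index of the matching rows to drop().

```python
import pandas as pd

# Hypothetical sample data; not the Kaggle dataset from the tutorial.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara", "Dev"],
        "age": [17, 35, 41, 15],
        "score": [88, 42, 95, 67],
    }
)

# Style 1: boolean mask, keep every row where the condition is False.
adults = df[~(df["age"] < 18)]

# Style 2: find the index of the matching rows and pass it to drop().
adults_via_drop = df.drop(df[df["age"] < 18].index)

# Several conditions: combine with & (and) or | (or),
# wrapping each condition in parentheses.
low_scoring_adults = (df["age"] >= 18) & (df["score"] < 50)
filtered = df.drop(df[low_scoring_adults].index)
```

Both styles return a new DataFrame. The mask style is usually shorter, while the drop() style reads closer to the wording "drop rows where ...".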

Read, How to Drop Duplicates using drop_duplicates() function in Python Pandas

Python pandas drop rows with nan in specific column

  • In this section, we will learn how to drop rows with NaN (missing values) in a specific column in Python pandas.
  • The dropna() function removes missing values from the dataset. To consider only specific columns, pass them through the subset parameter of dropna().

Syntax:

Here is the syntax to remove rows with missing values (NaN) in specific column(s):

# remove rows with missing values in a single column
df.dropna(subset=['column_name'])

# remove rows with missing values in any of several columns
df.dropna(subset=['column1', 'column2', 'column3'])

Here is the implementation in a Jupyter notebook. Refer to the comments and markdown for a step-by-step explanation.
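
A minimal sketch of dropna() with subset, assuming a small made-up DataFrame in which each column has one missing value (the names and values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value per column.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", None, "Dev"],
        "age": [28, np.nan, 41, 19],
        "city": ["Oslo", "Pune", "Lima", None],
    }
)

# Drop rows only when 'age' is missing; gaps in other columns are kept.
age_known = df.dropna(subset=["age"])

# Drop rows when 'name' or 'city' (or both) is missing.
name_city_known = df.dropna(subset=["name", "city"])
```

Rows with missing values in columns outside the subset are left in place, which is the difference from calling dropna() with no arguments.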

Python pandas drop rows with nan

  • In this section, we will learn how to drop rows with NaN. NaN stands for 'not a number' and is how pandas represents missing values in the dataset.
  • The dropna() function drops every row that contains a missing value.

Here is the implementation of dropping rows with NaN in a Jupyter notebook. Read the comments and markdown for a step-by-step explanation.
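
A short sketch of dropna() on a made-up DataFrame follows. The how and thresh parameters are real dropna() options that the text above does not cover; they are included here only to show how the default behaviour can be relaxed.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 is entirely missing,
# row 2 has a single non-missing value.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, 4.0],
        "b": [2.0, np.nan, 6.0, 8.0],
        "c": [3.0, np.nan, np.nan, 12.0],
    }
)

# Count missing values per column before dropping anything.
missing_per_column = df.isna().sum()

# Default (how='any'): drop rows with at least one missing value.
complete_rows = df.dropna()

# how='all': drop rows only when every value is missing.
not_empty_rows = df.dropna(how="all")

# thresh=2: keep rows that have at least two non-missing values.
mostly_filled = df.dropna(thresh=2)
```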

Python pandas drop rows based on column value

In this section, we will learn how to drop rows based on column value in Python Pandas. Here we can filter and remove the rows that do not match the criteria.

Here is the implementation of dropping rows based on column value in a Jupyter notebook.
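
Dropping rows based on a column value might look like the sketch below. The city and sales columns are hypothetical, chosen only to illustrate matching on a single value and on a list of values.

```python
import pandas as pd

# Hypothetical sample data; not the Kaggle dataset from the tutorial.
df = pd.DataFrame(
    {"city": ["Oslo", "Pune", "Lima", "Pune"], "sales": [120, 80, 95, 60]}
)

# Drop every row whose 'city' equals "Pune" by keeping the others.
not_pune = df[df["city"] != "Pune"]

# Same result using drop() with the index of the matching rows.
not_pune_via_drop = df.drop(df[df["city"] == "Pune"].index)

# Drop rows whose 'city' is any of several values.
without_listed = df[~df["city"].isin(["Oslo", "Lima"])]
```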

Python pandas drop rows containing string

  • In this section, we will learn how to drop rows containing strings in Python pandas. The approach shown here removes every column whose datatype is string (non-numeric), so that only numeric data remains.
  • All we have to do is select the non-numeric columns and delete them, keeping the numeric ones.
  • df.select_dtypes(exclude='number') returns all the columns whose datatype is not numeric (this covers every number type, not only int). Passing the labels of those columns to drop() removes every column except the numeric ones. This is how string data can be dropped from the DataFrame.

Here is the implementation in a Jupyter notebook. Refer to the comments and markdown for a step-by-step explanation.
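
A minimal sketch of the select_dtypes() approach follows, on a made-up DataFrame. Because that approach removes columns rather than rows, the sketch also shows a different technique, str.contains(), which is not part of the description above but drops the rows whose text contains a given substring. All column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data mixing text and numeric columns.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara"],
        "age": [28, 35, 41],
        "city": ["Oslo", "Pune", "Lima"],
        "score": [88.5, 42.0, 95.0],
    }
)

# Labels of all columns whose dtype is not numeric.
non_numeric_cols = df.select_dtypes(exclude="number").columns

# drop() needs labels, so pass the column labels, not the DataFrame itself.
numeric_only = df.drop(columns=non_numeric_cols)

# Alternative (not from the text above): drop the rows whose 'city'
# contains the substring "un". na=False treats missing values as no match.
matches = df["city"].str.contains("un", regex=False, na=False)
rows_without_match = df[~matches]
```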


In this tutorial, we learned how to drop rows in Python pandas. We covered these topics:

  • python pandas drop rows by index
  • python pandas drop rows by condition
  • python pandas drop rows with nan in specific column
  • python pandas drop rows with nan
  • python pandas drop rows based on column value
  • python pandas drop rows containing string