In this Python Pandas tutorial, we will discuss what the Pandas drop() function is and look at its syntax. We will also see how to use the drop() function with a few examples.
Pandas drop()
The Pandas drop() function removes specified labels from rows or columns. It is widely used in data science and machine learning to clean datasets.
The rows or columns to remove can be identified either by passing labels together with an axis, or by passing index or column names directly. When using a MultiIndex, labels on different levels can be removed by specifying the level.
Overall, the drop() function is used to remove specified rows or columns from a pandas DataFrame or Series.
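As a minimal sketch of the MultiIndex case mentioned above, the following example drops rows by a label on one level. The DataFrame, its level names (region, city), and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical two-level row index: (region, city)
index = pd.MultiIndex.from_tuples(
    [("east", "Boston"), ("east", "Miami"), ("west", "Denver"), ("west", "Seattle")],
    names=["region", "city"],
)
sales = pd.DataFrame({"units": [10, 20, 30, 40]}, index=index)

# Drop every row whose "city" level equals "Miami"
without_miami = sales.drop("Miami", level="city")

# Drop every row whose "region" level equals "west"
east_only = sales.drop("west", level="region")

print(without_miami)
print(east_only)
```

Without the level argument, a label passed for a MultiIndex is matched against the outermost level, so naming the level is the clearer way to target an inner one.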
Syntax of Python Pandas drop()
Here is the syntax for the Pandas drop() function.
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Parameter | Explanation |
---|---|
labels | Single label or list-like. Index or column labels to drop. |
axis | Whether to drop labels from the index (0 or 'index', i.e., rows) or from the columns (1 or 'columns'). Default is 0. |
index | Single label or list-like. Row labels to drop; an alternative to specifying labels with axis=0. |
columns | Single label or list-like. Column labels to drop; an alternative to specifying labels with axis=1. |
level | int or level name, optional. For a MultiIndex, the level from which the labels will be removed. |
inplace | bool, default False. If True, the object is modified in place and None is returned, so the result does not need to be assigned to a variable. |
errors | 'ignore' or 'raise', default 'raise'. With 'raise', a label that does not exist raises a KeyError and nothing is dropped. With 'ignore', the error is suppressed and only the existing labels are dropped. |
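To illustrate the errors parameter, here is a small sketch. The DataFrame and the missing column name salary are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})

# Default errors='raise': asking for a label that does not exist raises KeyError
try:
    df.drop(columns=["salary"])
    raised = False
except KeyError:
    raised = True

# errors='ignore': the missing label is skipped, the existing one is dropped
safe = df.drop(columns=["age", "salary"], errors="ignore")

print(raised)
print(safe)
```

Using errors='ignore' can be convenient in cleaning scripts that run against files whose columns vary, at the cost of silently hiding typos in label names.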
Examples of Pandas drop()
Let’s look at some examples to better understand how to use the drop() function in Pandas.
Dropping Rows
Suppose we have the following DataFrame in Python:
import pandas as pd
data = {
'name': ['Alice', 'Bob', 'Charlie', 'David'],
'age': [25, 30, 35, 40],
'gender': ['F', 'M', 'M', 'M']
}
df = pd.DataFrame(data)
print(df)
Output:
name age gender
0 Alice 25 F
1 Bob 30 M
2 Charlie 35 M
3 David 40 M
To drop the row with index 1 (i.e., the row with ‘Bob’), we can use the following code:
df = df.drop(1)
print(df)
Output:
name age gender
0 Alice 25 F
2 Charlie 35 M
3 David 40 M
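We can also drop multiple rows at once by passing a list of index labels. Because df was reassigned in the previous step and now holds only rows 0, 2, and 3, dropping labels 0 and 2 leaves a single row: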
df = df.drop([0, 2])
print(df)
Output:
name age gender
3 David 40 M
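Note that drop() matches index labels, not row positions. In the examples above the two coincide because the index is the default 0, 1, 2, ... range. The sketch below uses a hypothetical string index to show the difference, and one way to drop by position by looking the labels up through df.index.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Alice", "Bob", "Charlie", "David"], "age": [25, 30, 35, 40]},
    index=["a", "b", "c", "d"],  # hypothetical string labels
)

# Drop by label: removes the row labeled "b"
by_label = df.drop("b")

# Drop by position: translate positions 0 and 2 into labels "a" and "c" first
by_position = df.drop(df.index[[0, 2]])

print(by_label)
print(by_position)
```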
Dropping Columns
To drop a column in Python Pandas, we can set axis=1. Starting again from the original four-row DataFrame:
df = df.drop('gender', axis=1)
print(df)
Output:
name age
0 Alice 25
1 Bob 30
2 Charlie 35
3 David 40
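Again, we can drop multiple columns at once by specifying a list of column names. Since df no longer has the gender column at this point, dropping name and age leaves a DataFrame with no columns: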
df = df.drop(['name', 'age'], axis=1)
print(df)
Output:
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3]
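As listed in the syntax table, the columns parameter is an alternative to passing labels with axis=1. A brief sketch using the same sample data shows that both forms give the same result:

```python
import pandas as pd

data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
    'age': [25, 30, 35, 40],
    'gender': ['F', 'M', 'M', 'M']
}
df = pd.DataFrame(data)

# Two equivalent ways to drop the 'gender' column
with_axis = df.drop('gender', axis=1)
with_keyword = df.drop(columns='gender')

# Both calls return new DataFrames; df itself still has all three columns
print(with_axis.equals(with_keyword))
print(list(df.columns))
```

The keyword form is often easier to read because the intent (dropping columns) is visible without remembering what axis=1 means.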
Modifying the DataFrame In-Place
By default, the drop() function does not modify the original Pandas DataFrame. Instead, it returns a new DataFrame with the specified rows or columns dropped. If we want to modify the original DataFrame in place, we can set inplace=True:
Example:
import pandas as pd
data = {
'name': ['Alice', 'Bob', 'Charlie', 'David'],
'age': [25, 30, 35, 40],
'gender': ['F', 'M', 'M', 'M']
}
df = pd.DataFrame(data)
print("Original DataFrame:\n", df)
# drop row with index 1 in-place
df.drop(1, inplace=True)
print("\nModified DataFrame:\n", df)
Output:
Original DataFrame:
name age gender
0 Alice 25 F
1 Bob 30 M
2 Charlie 35 M
3 David 40 M
Modified DataFrame:
name age gender
0 Alice 25 F
2 Charlie 35 M
3 David 40 M
As you can see, the row with index 1 (‘Bob’) has been dropped from the original DataFrame in place. It’s important to note that when using inplace=True, the function returns None and does not create a new DataFrame object.
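Because the return value is None, assigning the result of an in-place drop back to a variable is a common mistake. The following sketch, which reuses a shortened version of the sample data, makes the return value visible:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]})

# With inplace=True the DataFrame is changed and the call returns None
result = df.drop(1, inplace=True)

print(result)          # None
print(list(df.index))  # [0, 2]

# Pitfall: writing `df = df.drop(1, inplace=True)` would replace df with None
```

In short, use either `df = df.drop(...)` or `df.drop(..., inplace=True)`, but not both together.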
Conclusion
The drop() function in the Python pandas library is a very useful tool for removing specified rows or columns from a DataFrame or Series. The function takes several parameters, including the labels to drop, the axis (i.e., rows or columns), and whether or not to modify the original DataFrame in place.
With the drop() function, we can easily manipulate the structure of our data by removing unnecessary rows or columns. Because each call returns a new DataFrame by default, we can also chain multiple drop() calls together to remove rows and columns in a single expression.
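A minimal sketch of chaining, using the sample data from the examples above. It also shows that the same result can be obtained from a single call with the index and columns parameters:

```python
import pandas as pd

data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
    'age': [25, 30, 35, 40],
    'gender': ['F', 'M', 'M', 'M']
}
df = pd.DataFrame(data)

# Chained calls: first remove a column, then remove two rows
chained = df.drop(columns=['gender']).drop(index=[1, 3])

# Equivalent single call
single = df.drop(index=[1, 3], columns=['gender'])

print(chained)
print(chained.equals(single))
```

Chaining only works with the default inplace=False, since an in-place call returns None and there would be nothing to call the next drop() on.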
It’s important to note that when using the drop() function with inplace=True, the function modifies the original DataFrame in place and returns None instead of a new DataFrame object. This can be convenient when we do not need to keep the original data, although it does not necessarily save memory, since pandas may still create a copy internally.
Overall, the drop() function is a powerful tool that can help us clean and manipulate our data in Python pandas.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn, working for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc.