While working with data in Python, the pandas library is an indispensable tool that I’ve relied on for years. One of the most common operations in data cleaning and preparation is removing unwanted rows or columns from your dataset.
Whether you’re dealing with missing values, redundant features, or a dataset that simply needs reshaping, the DataFrame drop() function is the tool for the job.
In this article, I’ll walk you through everything you need to know about the pandas DataFrame drop() function.
DataFrame drop() Function
The drop() function is a pandas method that removes rows or columns from a DataFrame. It’s like having a precise scalpel that lets you surgically remove the exact parts of your data that you don’t need.
The beauty of drop() is its flexibility; you can remove single or multiple rows/columns, use labels or indices, and even specify how the operation should be performed.
Basic Syntax of the drop() Function
Before getting into specific examples, let’s understand the basic syntax:
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

The key parameters include:
- labels: index labels to drop
- axis: 0 for rows, 1 for columns
- index/columns: an alternative to specifying labels and axis
- inplace: whether to modify the DataFrame directly or return a copy
- errors: 'raise' to throw an error if labels don’t exist, 'ignore' to skip them silently
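To see how these parameters relate to each other, here is a minimal sketch on made-up data (not from the article’s examples): index=/columns= are shorthand for labels= plus an axis, and errors='ignore' skips labels that don’t exist.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

# columns= is equivalent to labels= with axis=1
by_axis = df.drop('b', axis=1)
by_keyword = df.drop(columns='b')
print(by_axis.equals(by_keyword))  # True

# errors='ignore' silently skips labels that don't exist
safe = df.drop(columns=['b', 'missing'], errors='ignore')
print(safe.columns.tolist())  # ['a', 'c']
```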
Method 1 – Drop Columns from a DataFrame
One of the most common uses of Python drop() is removing unnecessary columns from your dataset. Let me show you how this works with a practical example:
import pandas as pd
# Sample sales data
sales_data = pd.DataFrame({
'Date': ['2023-01-15', '2023-01-16', '2023-01-17', '2023-01-18', '2023-01-19'],
'Store': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
'Product': ['Laptop', 'Phone', 'Tablet', 'Laptop', 'Phone'],
'Units_Sold': [12, 25, 18, 30, 15],
'Revenue': [14400, 20000, 9000, 36000, 12000],
'Notes': [None, 'Promotion', None, 'Holiday Sale', None]
})
# Drop single column
df_no_notes = sales_data.drop('Notes', axis=1)
print("After dropping 'Notes':", df_no_notes.columns.tolist())
# Drop multiple columns
df_simplified = sales_data.drop(['Notes', 'Store'], axis=1)
print("After dropping 'Notes' and 'Store':", df_simplified.columns.tolist())
# Drop using 'columns' parameter
df_sales_only = sales_data.drop(columns=['Notes', 'Store', 'Product'])
print("After keeping only sales data:", df_sales_only.columns.tolist())

Output:
After dropping 'Notes': ['Date', 'Store', 'Product', 'Units_Sold', 'Revenue']
After dropping 'Notes' and 'Store': ['Date', 'Product', 'Units_Sold', 'Revenue']
After keeping only sales data: ['Date', 'Units_Sold', 'Revenue']

When working with real datasets, I often need to drop columns that contain redundant information or aren’t relevant to my analysis. Using the axis=1 parameter (or the columns parameter) makes it clear that we’re operating on columns rather than rows.
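When column names aren’t known up front, you can also drop by position by passing a slice of df.columns. This is a small sketch on a reduced version of the sales data above, not part of the original example:

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Date': ['2023-01-15', '2023-01-16'],
    'Store': ['New York', 'Chicago'],
    'Units_Sold': [12, 18],
    'Revenue': [14400, 9000],
    'Notes': [None, 'Promotion'],
})

# Drop the last two columns by position instead of by name
trimmed = sales_data.drop(columns=sales_data.columns[-2:])
print(trimmed.columns.tolist())  # ['Date', 'Store', 'Units_Sold']
```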
Read Convert a DataFrame to JSON in Python
Method 2 – Drop Rows from a DataFrame
Removing rows is just as common as removing columns. Here’s how to do it:
# Drop a single row by index
df_without_row = sales_data.drop(0, axis=0)
print("After dropping row at index 0:", df_without_row.index.tolist())
# Drop multiple rows by index
df_without_multiple = sales_data.drop([1, 3], axis=0)
print("After dropping rows at index 1 and 3:", df_without_multiple.index.tolist())
# Drop a row using custom index (by date)
sales_data_indexed = sales_data.set_index('Date')
df_by_date = sales_data_indexed.drop('2023-01-17')
print("After dropping date '2023-01-17':", df_by_date.index.tolist())

Output:
After dropping row at index 0: [1, 2, 3, 4]
After dropping rows at index 1 and 3: [0, 2, 4]
After dropping date '2023-01-17': ['2023-01-15', '2023-01-16', '2023-01-18', '2023-01-19']

In my experience, dropping rows is particularly useful when dealing with outliers or specific records that could skew your analysis. The default axis value is 0, so you can omit it when dropping rows, but I recommend including it for code clarity.
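The same idea extends to dropping a whole range of rows: pass a slice of df.index instead of listing labels one by one. A minimal sketch on made-up data:

```python
import pandas as pd

df = pd.DataFrame({'value': range(10, 20)})  # default integer index 0..9

# Drop the first three rows by passing a slice of the index
tail = df.drop(df.index[:3])
print(tail.index.tolist())  # [3, 4, 5, 6, 7, 8, 9]
```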
Method 3 – Use inplace Parameter for Direct Modification
Sometimes you want to modify your original DataFrame directly instead of creating a new one:
import pandas as pd
# Sample sales data
sales_data = pd.DataFrame({
'Date': ['2023-01-15', '2023-01-16', '2023-01-17'],
'Product': ['Laptop', 'Phone', 'Tablet'],
'Revenue': [14400, 20000, 9000]
})
# Without inplace - original DataFrame remains unchanged
df_copy = sales_data.drop(0) # drops row at index 0 from a copy
print("Original DataFrame (unchanged):")
print(sales_data)
# With inplace=True - original DataFrame is modified
sales_data.drop(0, inplace=True) # modifies sales_data by dropping row at index 0
print("\nModified DataFrame (row 0 dropped):")
print(sales_data)

Output:
Original DataFrame (unchanged):
Date Product Revenue
0 2023-01-15 Laptop 14400
1 2023-01-16 Phone 20000
2 2023-01-17 Tablet 9000
Modified DataFrame (row 0 dropped):
Date Product Revenue
1 2023-01-16 Phone 20000
2 2023-01-17 Tablet 9000

I’ve found that using inplace=True can be memory-efficient when working with large datasets, but be careful: this operation can’t be undone. I typically make a backup of my DataFrame before using inplace operations.
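The backup pattern mentioned above is simple: take a .copy() before the inplace drop, so you can recover if you drop the wrong thing. A quick sketch:

```python
import pandas as pd

df = pd.DataFrame({'Revenue': [14400, 20000, 9000]})

# Keep a backup before an irreversible inplace drop
backup = df.copy()
df.drop(0, inplace=True)

print(0 in df.index)      # False - original was modified
print(0 in backup.index)  # True - backup is untouched
```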
Check out Convert a DataFrame to JSON Array in Python
Method 4 – Conditional Dropping with Boolean Indexing
While not directly using the drop() function, you can achieve similar results with boolean indexing:
# Creating a new DataFrame with US temperature data
weather_data = pd.DataFrame({
'City': ['New York', 'Los Angeles', 'Chicago', 'Miami', 'Denver'],
'Temperature': [45, 75, 32, 85, 50],
'Humidity': [65, 45, 70, 80, 30],
'Precipitation': [0.1, 0.0, 0.2, 0.5, 0.0]
})
# Keep only rows where Temperature is above 40
warm_cities = weather_data[weather_data['Temperature'] > 40]
# This is equivalent to dropping rows where Temperature <= 40
# warm_cities = weather_data.drop(weather_data[weather_data['Temperature'] <= 40].index)

This approach is incredibly useful when you need to filter data based on specific conditions. I use this method frequently when cleaning datasets to remove records that don’t meet certain criteria.
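You can verify that the two approaches really are equivalent: filtering with a boolean mask and dropping the index labels of the rows that fail the condition produce identical frames. A self-contained check on a trimmed version of the weather data:

```python
import pandas as pd

weather = pd.DataFrame({
    'City': ['New York', 'Chicago', 'Miami'],
    'Temperature': [45, 32, 85],
})

# Filter with boolean indexing
warm = weather[weather['Temperature'] > 40]

# Equivalent: drop the index labels of rows that fail the condition
warm_via_drop = weather.drop(weather[weather['Temperature'] <= 40].index)

print(warm.equals(warm_via_drop))  # True
print(warm['City'].tolist())       # ['New York', 'Miami']
```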
Method 5 – Drop Rows with Missing Values
Python’s drop() function works well with pandas’ built-in methods for handling missing values:
# Sample DataFrame with missing values in US stock data
stock_data = pd.DataFrame({
'Date': ['2023-01-15', '2023-01-16', '2023-01-17', '2023-01-18', '2023-01-19'],
'Symbol': ['AAPL', 'GOOGL', 'MSFT', 'AMZN', 'TSLA'],
'Open': [170.33, 142.65, 239.05, 96.43, None],
'Close': [171.22, None, 240.22, 97.25, 122.40],
'Volume': [70521698, 31425612, None, 64633575, 177207916]
})
# Drop rows with any missing values
clean_data = stock_data.dropna()
# This is equivalent to using drop() on rows with NaN values
# clean_data = stock_data.drop(stock_data[stock_data.isna().any(axis=1)].index)

When preparing data for machine learning models or statistical analysis, I often need to decide how to handle missing values. The dropna() method is a convenient shortcut, but understanding how it relates to drop() helps you gain more control over your data cleaning process.
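For finer control, dropna() also accepts subset= (only consider certain columns) and thresh= (keep rows with at least that many non-null values). A short sketch on a reduced stock frame:

```python
import pandas as pd

stock = pd.DataFrame({
    'Symbol': ['AAPL', 'GOOGL', 'MSFT'],
    'Open': [170.33, None, 239.05],
    'Close': [171.22, 142.65, None],
})

# Drop only rows where 'Open' is missing
by_open = stock.dropna(subset=['Open'])
print(by_open['Symbol'].tolist())  # ['AAPL', 'MSFT']

# Keep rows with at least 3 non-null values
mostly_complete = stock.dropna(thresh=3)
print(mostly_complete['Symbol'].tolist())  # ['AAPL']
```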
Read pd.crosstab Function in Python
Method 6 – Drop Duplicates
Similar to handling missing values, pandas provides a method to drop duplicate rows:
# Sample DataFrame with duplicate entries in US customer data
customer_data = pd.DataFrame({
'ID': [101, 102, 103, 102, 104],
'Name': ['John Smith', 'Mary Johnson', 'Robert Brown', 'Mary Johnson', 'Susan Davis'],
'State': ['NY', 'CA', 'TX', 'CA', 'FL']
})
# Drop duplicate rows (keeps first occurrence)
unique_customers = customer_data.drop_duplicates()
# Drop duplicates based on specific columns
unique_by_state = customer_data.drop_duplicates(subset=['State'])

This is another specialized use case related to the drop() function. When analyzing customer data or transaction records, removing duplicates is often a critical preprocessing step.
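drop_duplicates() also takes a keep= parameter that controls which occurrence survives: 'first' (the default), 'last', or False to discard every duplicated row. A minimal sketch on made-up customer data:

```python
import pandas as pd

customers = pd.DataFrame({
    'ID': [101, 102, 102, 104],
    'Name': ['John', 'Mary', 'Mary', 'Susan'],
})

# keep='last' retains the final occurrence instead of the first
last_seen = customers.drop_duplicates(keep='last')
print(last_seen.index.tolist())  # [0, 2, 3]

# keep=False removes every duplicated row entirely
no_dupes = customers.drop_duplicates(keep=False)
print(no_dupes['ID'].tolist())  # [101, 104]
```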
Check out np.where in Pandas Python
Best Practices for Using drop()
Over my years of working with pandas, I’ve developed some best practices for using the drop() function effectively:
- Always verify your data before and after dropping: use .head(), .shape, or .info() to confirm you’ve removed the intended elements.
- Be cautious with inplace=True: it modifies your original data and can’t be undone. Consider creating a copy first.
- Use explicit axis naming: even though axis=0 is the default for rows, explicitly stating the axis makes your code more readable.
- Consider using the errors='ignore' parameter: this prevents your code from crashing if you try to drop labels that don’t exist.
- Chain methods efficiently: instead of dropping elements in multiple steps, you can often chain operations for cleaner code.
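The chaining practice from the last point can be sketched as a single readable pipeline (example data is hypothetical): each method returns a new DataFrame, so drop(), dropna(), and drop_duplicates() compose naturally.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, None],
    'b': [2, 2, 3],
    'junk': ['x', 'y', 'z'],
})

# One pipeline: drop a column, remove rows with NaN, then deduplicate
clean = (
    df.drop(columns='junk')
      .dropna()
      .drop_duplicates()
)
print(clean.shape)  # (1, 2)
```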
The drop() function in pandas is an essential tool for data manipulation. Whether you’re cleaning messy datasets, focusing your analysis on specific variables, or preparing data for visualization, mastering drop() will significantly improve your data workflow.
While there are specialized methods like dropna() and drop_duplicates() for specific scenarios, understanding the core drop() function gives you more flexibility and control over your data processing pipeline.
I hope this guide helps you use the pandas drop() function more effectively in your Python projects.
Pandas-related tutorials:
- Pandas GroupBy Without Aggregation Function in Python
- Pandas Merge Fill NAN with 0 in Python
- Python Pandas Write to Excel

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time I gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, Scikit-Learn, and more, for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries. Check out my profile.