As a developer working on a data analysis project in Python for one of my clients, I often encounter datasets that contain a mix of numeric and non-numeric columns. Sometimes, I need to perform calculations that only work with numeric data, and in these situations, I need to filter out all non-numeric columns from my DataFrame.
In this article, I’ll share three effective methods to drop non-numeric columns from a Pandas DataFrame. These techniques have saved me countless hours in my data analysis projects, and I’m confident they’ll help you too.
Let’s dive in!
Method 1: Use select_dtypes() to Keep Only Numeric Columns
The simplest approach to drop non-numeric columns is to use the pandas select_dtypes() method. This method allows you to filter columns based on their data types.
Let’s start with a simple example:
import pandas as pd
import numpy as np
# Creating a sample DataFrame with mixed column types
data = {
'Name': ['John Smith', 'Sarah Johnson', 'Mike Williams', 'Emily Davis'],
'Age': [32, 28, 45, 36],
'Salary': [75000, 82000, 95000, 67000],
'Department': ['Marketing', 'IT', 'Finance', 'HR'],
'Performance_Score': [4.2, 3.8, 4.5, 4.0],
'Date_Joined': ['2019-05-12', '2020-02-15', '2017-11-01', '2021-08-23']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df.dtypes)
print(df.head())
# Keep only numeric columns
numeric_df = df.select_dtypes(include=['number'])
print("\nDataFrame with only numeric columns:")
print(numeric_df.dtypes)
print(numeric_df.head())
In this example, the select_dtypes(include=['number']) method returns a new DataFrame containing only the numeric columns. The output would look like:
Original DataFrame:
Name object
Age int64
Salary int64
Department object
Performance_Score float64
Date_Joined object
dtype: object
DataFrame with only numeric columns:
Age int64
Salary int64
Performance_Score float64
dtype: object
This method is clean and efficient, perfect for quick data preprocessing.
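One detail worth checking before relying on 'number': boolean and datetime columns are not treated as numeric by select_dtypes(). The sketch below uses a small made-up DataFrame (the column names are illustrative, not from a real dataset) to show how to widen the selection and how to list what would be dropped.

```python
import pandas as pd

# Hypothetical sample data with bool and datetime columns added
df = pd.DataFrame({
    "Age": [32, 28, 45],
    "Score": [4.2, 3.8, 4.5],
    "Is_Manager": [True, False, True],
    "Date_Joined": pd.to_datetime(["2019-05-12", "2020-02-15", "2017-11-01"]),
    "Name": ["John", "Sarah", "Mike"],
})

# 'number' covers integer and float columns, but not bool or datetime
numeric_only = df.select_dtypes(include="number")
print(list(numeric_only.columns))

# Add 'bool' explicitly if flag columns should survive the filter
numeric_and_bool = df.select_dtypes(include=["number", "bool"])
print(list(numeric_and_bool.columns))

# exclude= gives the complement: the columns that would be dropped
dropped = list(df.select_dtypes(exclude="number").columns)
print(dropped)
```

Listing the dropped columns first is a cheap way to confirm that nothing useful is about to disappear.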
Method 2: Use pd.to_numeric() with errors='coerce'
Another approach I frequently use involves the pd.to_numeric() function in Python combined with some DataFrame manipulation. This method is particularly useful when you want to try converting columns to a numeric format before deciding whether to drop them.
Here’s how to implement it:
import pandas as pd
# Sample DataFrame
data = {
'A': [1, 2, 3, 4],
'B': ['5', '6', '7', '8'],
'C': ['a', 'b', 'c', 'd'],
'D': [10.5, 11.2, 12.8, 9.7],
'E': ['10.5', '11.2', 'twelve', '9.7']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df.dtypes)
# Function to check if a column can be converted to numeric
def is_numeric(column):
    # Try to convert to numeric, with errors='coerce'
    numeric_column = pd.to_numeric(column, errors='coerce')
    # Check if the column has any non-NaN values after conversion
    return not numeric_column.isna().all()
# Filter columns
numeric_columns = [col for col in df.columns if is_numeric(df[col])]
numeric_df = df[numeric_columns]
print("\nDataFrame with only numeric columns:")
print(numeric_df.dtypes)
The output would show:
Original DataFrame:
A int64
B object
C object
D float64
E object
dtype: object
DataFrame with only numeric columns:
A int64
B object
D float64
E object
dtype: object

In this case, columns A, B, D, and E are kept, while column C is dropped because none of its values can be converted. Column E survives even though it contains 'twelve', because is_numeric() only requires at least one value to convert successfully.
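If keeping a column on the strength of a partial match is too lenient for your data, a stricter check is possible. The following is a minimal sketch, not part of the method above: is_fully_numeric is a hypothetical helper name, and it accepts a column only when every non-missing value converts.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": ["5", "6", "7", "8"],
    "C": ["a", "b", "c", "d"],
    "E": ["10.5", "11.2", "twelve", "9.7"],
})

def is_fully_numeric(column):
    """Return True only if every non-missing value converts to a number."""
    converted = pd.to_numeric(column, errors="coerce")
    # A value failed if it was present before conversion but is NaN afterwards
    failed = converted.isna() & column.notna()
    return not failed.any()

strict_columns = [col for col in df.columns if is_fully_numeric(df[col])]
print(strict_columns)
```

With this check, column E is rejected because 'twelve' fails to convert, while B passes because all of its strings are valid numbers.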
Note that columns B and E are still object types in numeric_df, even though they contain values that can be converted to numbers. If you want to convert them, you can add the following step (any value that cannot be converted, such as 'twelve', becomes NaN):
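# Convert all remaining columns to numeric types
# (.copy() avoids assigning into a slice of the original DataFrame)
numeric_df = numeric_df.copy()
for col in numeric_df.columns:
    numeric_df[col] = pd.to_numeric(numeric_df[col], errors='coerce')
Method 3: Use DataFrame.drop() with Custom Function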
The third method involves creating a custom function to identify non-numeric columns and then using the drop() method to remove them:
import pandas as pd
import numpy as np
# Creating a sample DataFrame
data = {
'Product': ['Laptop', 'Smartphone', 'Tablet', 'Monitor'],
'Price': [1200, 800, 350, 250],
'Stock': [45, 120, 75, 30],
'Category': ['Electronics', 'Electronics', 'Electronics', 'Accessories'],
'Rating': [4.5, 4.8, 4.2, 4.0],
'Last_Updated': ['2023-01-15', '2023-02-10', '2023-01-28', '2023-02-05']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df.dtypes)
print(df.head())
# Function to check if a column is non-numeric
def is_non_numeric_column(df, column):
    return not np.issubdtype(df[column].dtype, np.number)
# Get list of non-numeric columns
non_numeric_cols = [col for col in df.columns if is_non_numeric_column(df, col)]
# Drop non-numeric columns
numeric_df = df.drop(columns=non_numeric_cols)
print("\nDataFrame after dropping non-numeric columns:")
print(numeric_df.dtypes)
print(numeric_df.head())
The output:
Original DataFrame:
Product object
Price int64
Stock int64
Category object
Rating float64
Last_Updated object
dtype: object
DataFrame after dropping non-numeric columns:
Price int64
Stock int64
Rating float64
dtype: object

This method gives you more control over the filtering process and can be customized based on specific requirements.
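One possible customization: np.issubdtype() is designed for NumPy dtypes and, as far as I know, can raise a TypeError when a column uses a pandas extension dtype such as nullable Int64 or category. The sketch below swaps in pandas.api.types.is_numeric_dtype() for the check. The sample data is made up for illustration, and note that is_numeric_dtype() also treats boolean columns as numeric.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical data with a nullable integer column and a categorical column
df = pd.DataFrame({
    "Product": ["Laptop", "Smartphone", "Tablet"],
    "Price": [1200.0, 800.0, 350.0],
    "Stock": pd.array([45, None, 75], dtype="Int64"),
    "Category": pd.Categorical(["Electronics", "Electronics", "Accessories"]),
})

# is_numeric_dtype accepts a Series and understands extension dtypes
non_numeric_cols = [col for col in df.columns if not is_numeric_dtype(df[col])]
numeric_df = df.drop(columns=non_numeric_cols)

print(non_numeric_cols)
print(list(numeric_df.columns))
```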
Bonus Method: Use DataFrame.describe() to Identify Numeric Columns
Here’s a bonus method that leverages the fact that describe() by default only includes numeric columns:
import pandas as pd
# Sample DataFrame
data = {
'Employee': ['John Smith', 'Sarah Johnson', 'Robert Brown'],
'Department': ['Sales', 'Marketing', 'IT'],
'Years_Employed': [5, 3, 7],
'Salary': [65000, 58000, 72000],
'Performance': [0.85, 0.92, 0.78]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df.head())
# Get numeric columns using describe()
numeric_columns = df.describe().columns
numeric_df = df[numeric_columns]
print("\nDataFrame with only numeric columns:")
print(numeric_df.head())
This method is quick, although describe() computes summary statistics that are then discarded, and it offers less flexibility than the previous approaches.
Real-World Application: Data Preprocessing for Machine Learning
Let’s apply what we’ve learned to a more practical scenario. Suppose we’re preparing a dataset for a machine learning model that predicts housing prices in California:
import pandas as pd
import numpy as np
# Sample California housing dataset
data = {
'Address': ['123 Main St, San Francisco', '456 Oak Ave, Los Angeles', '789 Pine Rd, San Diego'],
'Zip_Code': ['94102', '90001', '92101'],
'Price': [1250000, 950000, 875000],
'Bedrooms': [3, 4, 3],
'Bathrooms': [2.5, 3.0, 2.0],
'Square_Feet': [1850, 2200, 1650],
'Year_Built': [1985, 2002, 1992],
'Neighborhood': ['Downtown', 'Hollywood', 'Gaslamp'],
'School_Rating': [8.5, 7.2, 8.9]
}
housing_df = pd.DataFrame(data)
print("Original Housing DataFrame:")
print(housing_df.head())
# Machine learning models typically need numeric features
# Method 1: Using select_dtypes()
numeric_housing_df = housing_df.select_dtypes(include=['number'])
print("\nNumeric Housing Data for ML Model:")
print(numeric_housing_df.head())
# We might want to keep the Zip_Code as it could be relevant
# Let's try to convert it to numeric
housing_df['Zip_Code'] = pd.to_numeric(housing_df['Zip_Code'], errors='coerce')
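# Now get all numeric columns (is_numeric_dtype checks for numeric types directly,
# rather than treating every non-object column as numeric)
numeric_columns = [col for col in housing_df.columns if pd.api.types.is_numeric_dtype(housing_df[col])]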
final_housing_df = housing_df[numeric_columns]
print("\nFinal Housing Data for ML Model (including Zip Code):")
print(final_housing_df.head())
In this example, we’ve filtered out non-numeric columns from a housing dataset, which is a common preprocessing step before training models in machine learning.
Handling Special Cases
Sometimes, you might encounter columns that look numeric but are stored as strings, or columns with mixed numeric and non-numeric values. Here’s how to handle these special cases:
import pandas as pd
# DataFrame with mixed types
data = {
'A': ['1', '2', '3', '4'], # Strings that look like numbers
'B': ['1.5', '2.5', 'three', '4.5'], # Mixed numeric and non-numeric
'C': [1, 2, 3, 4], # Pure numeric
'D': ['a', 'b', 'c', 'd'] # Pure non-numeric
}
df = pd.DataFrame(data)
print("Original DataFrame with mixed types:")
print(df.dtypes)
print(df.head())
# Try to convert all columns to numeric
for col in df.columns:
    try:
        df[col] = pd.to_numeric(df[col])
    except ValueError:
        # If column contains any non-numeric values, try coercing
        numeric_values = pd.to_numeric(df[col], errors='coerce')
        # If at least 75% of values can be converted to numeric, keep the column
        if numeric_values.count() / len(numeric_values) >= 0.75:
            df[col] = numeric_values
        else:
            # Mark for dropping
            df[col] = None
# Drop columns that are now all None
df = df.dropna(axis=1, how='all')
print("\nDataFrame after handling special cases:")
print(df.dtypes)
print(df.head())
This approach allows you to handle columns that are mostly numeric but might contain a few non-numeric values.
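The same idea can be wrapped in a reusable function that leaves the original DataFrame untouched. This is a sketch under my own assumptions: keep_mostly_numeric and its min_ratio parameter are hypothetical names, and the ratio is measured over all rows, so values that were already missing count against a column.

```python
import pandas as pd

def keep_mostly_numeric(df, min_ratio=0.75):
    """Return a new DataFrame holding only columns that are at least
    min_ratio numeric, with unconvertible values replaced by NaN."""
    kept = {}
    for col in df.columns:
        converted = pd.to_numeric(df[col], errors="coerce")
        # notna().mean() is the share of rows that hold a valid number
        if len(converted) and converted.notna().mean() >= min_ratio:
            kept[col] = converted
    return pd.DataFrame(kept, index=df.index)

df = pd.DataFrame({
    "A": ["1", "2", "3", "4"],           # strings that look like numbers
    "B": ["1.5", "2.5", "three", "4.5"], # mostly numeric
    "C": [1, 2, 3, 4],                   # pure numeric
    "D": ["a", "b", "c", "d"],           # pure non-numeric
})

result = keep_mostly_numeric(df)
print(list(result.columns))
print(result["B"].isna().sum())
```

Returning a new DataFrame instead of overwriting columns in place makes it easier to compare the result against the raw data while tuning the threshold.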
I’ve found these methods invaluable in my data preprocessing workflows. The right choice depends on your specific needs:
- Use select_dtypes() for quick and clean filtering
- Use pd.to_numeric() with error handling for more flexible conversion attempts
- Use custom functions with drop() when you need more control over the filtering criteria
Remember, dropping non-numeric columns is just one step in the data preprocessing pipeline. Always ensure that you’re not discarding important information that could be encoded differently (like one-hot encoding categorical variables) before training your models.
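As a rough illustration of that last point, here is one way a categorical column could be one-hot encoded before the numeric filter is applied, so its information is kept rather than dropped. The data is a trimmed, made-up version of the housing example, and passing dtype=int to pd.get_dummies() is my choice so that the indicator columns count as numeric.

```python
import pandas as pd

housing_df = pd.DataFrame({
    "Price": [1250000, 950000, 875000],
    "Bedrooms": [3, 4, 3],
    "Neighborhood": ["Downtown", "Hollywood", "Gaslamp"],
})

# Replace Neighborhood with one 0/1 indicator column per distinct value
encoded_df = pd.get_dummies(housing_df, columns=["Neighborhood"], dtype=int)

# The indicator columns are integers, so they survive the numeric filter
numeric_df = encoded_df.select_dtypes(include="number")
print(list(numeric_df.columns))
```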

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time, I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn, working for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc.