When working with data in Python, I often encounter datasets with missing values. These values, represented as NaN (Not a Number) in pandas, can sometimes interfere with my analysis, especially when I need to identify unique values in a column.
Recently, I was analyzing a customer dataset for a US retail chain where I needed to extract unique store locations without including the NaN values that were cluttering my results.
In this article, I’ll show you several practical methods to get unique values from a pandas DataFrame column while excluding any missing (NaN) values.
Get Pandas Unique Values in Column Without NaN in Python
Now, I will explain several ways to get the unique values of a pandas column without NaN in Python.
Method 1: Use dropna() with unique()
The easiest approach to getting unique values without NaN is to chain two pandas methods: dropna() and unique().
import pandas as pd
import numpy as np
# Creating a sample DataFrame with NaN values
data = {
    'State': ['California', 'New York', np.nan, 'Texas', 'California', np.nan, 'Florida']
}
df = pd.DataFrame(data)
# Getting unique values without NaN
unique_states = df['State'].dropna().unique()
print(unique_states)
Output:
['California' 'New York' 'Texas' 'Florida']
In this example, I first removed all NaN values from the ‘State’ column using dropna() and then applied unique() to get only the distinct values. This simple two-step process gives you exactly what you need: unique values with no NaN entries.
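To see why the dropna() step matters, here is a quick sketch (rebuilding the same sample DataFrame) of what unique() returns with and without it:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'State': ['California', 'New York', np.nan, 'Texas', 'California', np.nan, 'Florida']
})

# Without dropna(), NaN shows up among the unique values
print(df['State'].unique())
# ['California' 'New York' nan 'Texas' 'Florida']

# Dropping NaN first gives only the real values
print(df['State'].dropna().unique())
# ['California' 'New York' 'Texas' 'Florida']
```

Note that unique() preserves the order of first appearance, so dropna().unique() keeps the original ordering minus the missing entries.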
Check out Python Pandas Write to Excel
Method 2: Use pandas Series.value_counts()
Another effective approach is the value_counts() method, which excludes NaN values by default.
# Using value_counts()
unique_states_with_counts = df['State'].value_counts()
print(unique_states_with_counts)
Output:
State
California 2
New York 1
Texas 1
Florida 1
Name: count, dtype: int64
The value_counts() method gives us an added advantage: not only does it exclude NaN values, but it also provides the frequency of each value. This is particularly useful when I need to know both the unique values and their counts.
To get just the unique values from this result, I can access the index:
unique_states = df['State'].value_counts().index.tolist()
print(unique_states)
Output:
['California', 'New York', 'Texas', 'Florida']
Method 3: Use notnull() with unique()
A third approach combines the notnull() and unique() methods to filter out NaN values:
# Using notnull with unique
unique_states = df.loc[df['State'].notnull(), 'State'].unique()
print(unique_states)
Output:
['California' 'New York' 'Texas' 'Florida']
This method first creates a boolean mask using notnull() to identify non-NaN values, then applies unique() to get the distinct values. I find this approach particularly readable and explicit in its intention.
Check out Create Plots Using Pandas crosstab() in Python
Method 4: Use Python’s set() function
If you prefer using Python’s built-in functions, you can convert the non-NaN values to a set:
# Using Python's set() (first dropping NaN values)
unique_states = set(df['State'].dropna())
print(unique_states)
Output:
{'California', 'Florida', 'New York', 'Texas'}
Note that when using set(), the order of elements is not preserved. If the order matters for your analysis, you should stick with one of the previous methods.
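If you like the brevity of set() but need a deterministic order, one simple option (a sketch, not the only way) is to sort the result alphabetically:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'State': ['California', 'New York', np.nan, 'Texas', 'California', np.nan, 'Florida']
})

# Sorting the set gives a stable, alphabetical ordering
unique_states = sorted(set(df['State'].dropna()))
print(unique_states)
# ['California', 'Florida', 'New York', 'Texas']
```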
Read Drop the Header Row of Pandas DataFrame
Method 5: Use nunique() to count unique non-NaN values
If you only need the count of unique non-NaN values (rather than the values themselves), pandas provides the convenient nunique() method in Python:
# Count unique non-NaN values
unique_count = df['State'].nunique()
print(f"Number of unique states: {unique_count}")
Output:
Number of unique states: 4
This is extremely useful for quick data profiling when you just need to know how many unique values exist in a column.
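nunique() takes a dropna parameter (True by default), so you can also count NaN as a value of its own if you ever need to. A small sketch using the same sample data:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'State': ['California', 'New York', np.nan, 'Texas', 'California', np.nan, 'Florida']
})

print(df['State'].nunique())              # 4 (NaN excluded, the default)
print(df['State'].nunique(dropna=False))  # 5 (NaN counted as one extra value)
```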
Real-World Example: Analyze US Sales Data
Let’s look at a more comprehensive example using a sales dataset:
# Creating a more realistic dataset
sales_data = {
    'State': ['California', 'New York', np.nan, 'Texas', 'California', np.nan, 'Florida'],
    'Sales': [5000, 4200, 3800, 4500, 5600, 3200, 4100],
    'Category': ['Electronics', 'Furniture', np.nan, 'Electronics', 'Clothing', 'Furniture', np.nan]
}
sales_df = pd.DataFrame(sales_data)
# Get unique product categories without NaN
unique_categories = sales_df['Category'].dropna().unique()
# Get states with sales > 4000 (excluding states with NaN)
high_sales_states = sales_df.loc[(sales_df['Sales'] > 4000) & (sales_df['State'].notnull()), 'State'].unique()
print("Unique Product Categories:")
print(unique_categories)
print("\nStates with High Sales:")
print(high_sales_states)
Output:
Unique Product Categories:
['Electronics' 'Furniture' 'Clothing']

States with High Sales:
['California' 'New York' 'Texas' 'Florida']
In this example, I’m analyzing a retail dataset where both ‘State’ and ‘Category’ columns contain NaN values. I’ve extracted unique product categories excluding NaNs, and identified the states with sales above 4000 (Florida qualifies at 4100) while filtering out any NaN state values.
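The same NaN-aware habits carry over to aggregations. As a sketch (rebuilding the State and Sales columns from above), groupby() drops NaN group keys by default, so rows with a missing state are excluded from the totals automatically:

```python
import pandas as pd
import numpy as np

sales_df = pd.DataFrame({
    'State': ['California', 'New York', np.nan, 'Texas', 'California', np.nan, 'Florida'],
    'Sales': [5000, 4200, 3800, 4500, 5600, 3200, 4100],
})

# groupby() drops NaN group keys by default (dropna=True),
# so the two rows with a missing State never reach the totals
totals = sales_df.groupby('State')['Sales'].sum()
print(totals)
```

If you ever want the rows with a missing state collected under their own group, pass dropna=False to groupby() instead.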
Check out Pandas Get Index of Row in Python
Performance Considerations
When working with large datasets, performance becomes crucial. Here’s a quick comparison of the methods we’ve discussed:
- dropna().unique() – Simple and readable, good for most use cases
- value_counts().index – Efficient when you need counts too
- loc[notnull(), ...].unique() – Very explicit but slightly more verbose
- set(dropna()) – Can be faster for very large datasets due to the set’s O(1) lookup time
For most everyday use cases, the first method (dropna().unique()) offers the best balance of readability and performance.
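If you want to check this on your own data, here is a minimal timing sketch using Python’s timeit module on a synthetic column (the exact numbers will vary by machine and dataset):

```python
import timeit

import numpy as np
import pandas as pd

# A larger synthetic column with roughly 20% missing values
rng = np.random.default_rng(0)
s = pd.Series(rng.choice(['CA', 'NY', 'TX', 'FL', None], size=100_000))

# Time each approach over 20 runs
for label, fn in [
    ('dropna().unique()', lambda: s.dropna().unique()),
    ('value_counts().index', lambda: s.value_counts().index),
    ('set(dropna())', lambda: set(s.dropna())),
]:
    print(f'{label}: {timeit.timeit(fn, number=20):.4f}s')
```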
Read Create Pandas Crosstab Percentage in Python
Conclusion
Handling NaN values properly is an important skill when working with pandas DataFrames. The methods I’ve shared in this article have served me well in countless real-world data analysis projects.
To summarize:
- Use df['column'].dropna().unique() for an easy approach
- Consider df['column'].value_counts() when you need frequency information
- Try df.loc[df['column'].notnull(), 'column'].unique() for explicit filtering
- Use set(df['column'].dropna()) when working with very large datasets
- Apply df['column'].nunique() when you only need the count
I also covered a real-world example to analyze US sales data and some performance considerations.
Other Python articles you may also like:
- Pandas drop_duplicates() Function in Python
- Remove All Non-Numeric Characters in Pandas
- Drop Non-Numeric Columns From Pandas DataFrame

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time, I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, Scikit-Learn, etc., for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries. Check out my profile.