How To Count Rows With Conditions In Pandas

Working with data often requires filtering and counting specific information. Recently, I was analyzing US sales data and needed to count the number of transactions that exceeded $1,000. This is a common requirement when processing large datasets.

Pandas makes this task simple with several different approaches.

In this article, I'll share nine practical ways to count rows that match specific conditions in Pandas. Whether you're a data analyst, scientist, or Python developer, these techniques will help you extract meaningful insights from your data.

Count Rows With Conditions in Pandas

Now, I will explain how to count rows with conditions in Python Pandas.

Method 1: Use sum() with Boolean Indexing

The simplest way to count rows with a condition is to build a Boolean mask and call the pandas sum() method on it.

import pandas as pd

# Sample US sales data
data = {
    'State': ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    'Amount': [1200, 850, 1500, 950, 1100, 750, 2000],
    'Category': ['Electronics', 'Furniture', 'Electronics', 'Clothing', 'Electronics', 'Furniture', 'Electronics']
}

df = pd.DataFrame(data)

# Count rows where Amount > 1000
count = (df['Amount'] > 1000).sum()
print(f"Transactions over $1,000: {count}")

Output:

Transactions over $1,000: 4

The boolean expression df['Amount'] > 1000 creates a series of True/False values, and sum() counts all the True values.
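
To make that intermediate step visible, here is a minimal sketch that rebuilds only the Amount column from the sample data. The names amounts and mask are chosen just for this illustration.

```python
import pandas as pd

# Same Amount values as the sample data above
amounts = pd.Series([1200, 850, 1500, 950, 1100, 750, 2000], name='Amount')

# The comparison yields one True/False value per row
mask = amounts > 1000
print(mask.tolist())

# True counts as 1 and False as 0, so sum() is the number of matching rows
count = int(mask.sum())
print(f"Transactions over $1,000: {count}")
```

Keeping the mask in its own variable also makes it easy to reuse for filtering later.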

Method 2: Use len() with Boolean Indexing

Another approach is to use the len() function in Python with filtered data:

# Count rows where the state is California
count = len(df[df['State'] == 'California'])
print(f"Sales from California: {count}")

Output:

Sales from California: 2

I prefer this method when I need to inspect the filtered rows before counting them.
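
As a rough sketch of that workflow, the filtered rows can be stored first, inspected, and then counted. The variable name california is only an illustration, and the DataFrame repeats two columns of the sample data so the snippet runs on its own.

```python
import pandas as pd

# Two columns of the sample sales data are enough for this example
df = pd.DataFrame({
    'State': ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    'Amount': [1200, 850, 1500, 950, 1100, 750, 2000],
})

# Keep the filtered rows so they can be inspected before counting
california = df[df['State'] == 'California']
print(california)

# len() of the filtered DataFrame is the number of matching rows
count = len(california)
print(f"Sales from California: {count}")
```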

Method 3: Use value_counts() for Categorical Data

The pandas value_counts() method is perfect for counting occurrences in categorical columns:

# Count sales by state
state_counts = df['State'].value_counts()
print("Sales by state:")
print(state_counts)

Output:

Sales by state:
State
California    2
Texas         2
New York      1
Florida       1
Illinois      1
Name: count, dtype: int64

What I like about this method is that it automatically handles the counting and sorting in descending order.
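
If only one category is of interest, the result of value_counts() can be treated like a lookup table. The sketch below assumes the same State values as the sample data. 'Ohio' is a made-up state that does not appear in it, included to show the default value.

```python
import pandas as pd

states = pd.Series(
    ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    name='State',
)

state_counts = states.value_counts()

# Series.get() returns the default instead of raising KeyError for a missing label
california_sales = int(state_counts.get('California', 0))
ohio_sales = int(state_counts.get('Ohio', 0))

print(f"California: {california_sales}, Ohio: {ohio_sales}")
```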

Method 4: Use query() Method

The pandas query() method provides a more readable syntax for filtering data:

# Count electronics sales over $1000
count = len(df.query('Category == "Electronics" and Amount > 1000'))
print(f"Electronics sales over $1,000: {count}")

This returns 4, the number of electronics sales exceeding $1,000.

I find this approach particularly useful for complex conditions as it keeps the code clean.
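
When the thresholds live in Python variables, query() can reference them with the @ prefix. This is a small sketch. The variable names category and min_amount, and the 1200 threshold, are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    'Amount': [1200, 850, 1500, 950, 1100, 750, 2000],
    'Category': ['Electronics', 'Furniture', 'Electronics', 'Clothing',
                 'Electronics', 'Furniture', 'Electronics'],
})

# Hypothetical filter values; @name refers to a local Python variable
category = 'Electronics'
min_amount = 1200

count = len(df.query('Category == @category and Amount > @min_amount'))
print(f"{category} sales over ${min_amount:,}: {count}")
```

Only the $1,500 and $2,000 sales are strictly above the threshold, so this prints a count of 2.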

Method 5: Use groupby() with size() or count()

When you need a count for every combination of values across several columns, groupby() combined with size() or count() is ideal:

# Count sales by category and state
category_state_counts = df.groupby(['Category', 'State']).size()
print("Sales by category and state:")
print(category_state_counts)

Output:

Sales by category and state:
Category     State      
Clothing     Florida        1
Electronics  California     1
             Illinois       1
             New York       1
             Texas          1
Furniture    California     1
             Texas          1
dtype: int64

This approach is my go-to for multidimensional analysis.
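
The example above uses size(). The difference from count() only shows up when values are missing, so the sketch below uses a small hypothetical DataFrame with one missing Amount (this is not the sample data).

```python
import numpy as np
import pandas as pd

# Hypothetical data: one Amount is missing on purpose
df = pd.DataFrame({
    'Category': ['Electronics', 'Electronics', 'Furniture', 'Furniture'],
    'Amount': [1200, np.nan, 850, 750],
})

# size() counts every row in each group, including rows with NaN
sizes = df.groupby('Category').size()

# count() counts only the non-null values of the selected column
counts = df.groupby('Category')['Amount'].count()

print(sizes)
print(counts)
```

Here size() reports 2 rows for Electronics, while count() reports 1 because one Amount is NaN.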

Method 6: Use loc with shape[0]

For more complex filtering, combining loc with the shape attribute can be powerful:

# Count high-value electronics sales
count = df.loc[(df['Category'] == 'Electronics') & (df['Amount'] > 1000)].shape[0]
print(f"High-value electronics sales: {count}")

This returns 4, the same count that the query() example in Method 4 produces.

The shape[0] attribute gives us the number of rows in the filtered DataFrame.
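
The same combined mask can also be summed directly, which skips building the filtered DataFrame. The following sketch compares both counts on the sample data. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    'Amount': [1200, 850, 1500, 950, 1100, 750, 2000],
    'Category': ['Electronics', 'Furniture', 'Electronics', 'Clothing',
                 'Electronics', 'Furniture', 'Electronics'],
})

# Each condition needs its own parentheses because & binds tighter than > and ==
mask = (df['Category'] == 'Electronics') & (df['Amount'] > 1000)

count_from_shape = df.loc[mask].shape[0]  # filter first, then read the row count
count_from_sum = int(mask.sum())          # count the True values directly

print(count_from_shape, count_from_sum)
```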

Method 7: Use count() with Boolean Indexing

Another variation uses the pandas count() method:

# Count non-furniture items
count = df[df['Category'] != 'Furniture']['Amount'].count()
print(f"Non-furniture sales: {count}")

This returns 5 as five sales aren’t in the furniture category.

This method is particularly useful when null values matter: count() skips NaN entries in the selected column, so it counts only the filtered rows that actually have an Amount.
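
The sample data has no missing values, so the effect is easier to see on a small hypothetical DataFrame in which one Amount is deliberately NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing Amount for the Clothing sale
df = pd.DataFrame({
    'Category': ['Electronics', 'Furniture', 'Clothing', 'Electronics'],
    'Amount': [1200, 850, np.nan, 1500],
})

non_furniture = df[df['Category'] != 'Furniture']

rows = len(non_furniture)                           # all matching rows
with_amount = int(non_furniture['Amount'].count())  # matching rows with a non-null Amount

print(f"Rows: {rows}, rows with an amount: {with_amount}")
```

Three rows are not furniture, but only two of them have an Amount, so len() and count() disagree here.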

Method 8: Use numpy.count_nonzero()

If you're already working with NumPy, you can use its count_nonzero() function:

import numpy as np

# Count sales from Texas or California
count = np.count_nonzero((df['State'] == 'Texas') | (df['State'] == 'California'))
print(f"Sales from Texas or California: {count}")

This returns 4 as we have four sales from either Texas or California.

I use this approach when integrating pandas operations with other numpy-based code.
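
For longer lists of values, chaining | comparisons gets unwieldy. One possible variation, sketched here, builds the mask with isin() and passes it to count_nonzero(). The target_states name is an assumption for this example.

```python
import numpy as np
import pandas as pd

states = pd.Series(
    ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    name='State',
)

# Hypothetical list of states to match
target_states = ['Texas', 'California']

# isin() returns a Boolean Series; count_nonzero() counts its True values
count = int(np.count_nonzero(states.isin(target_states)))
print(f"Sales from Texas or California: {count}")
```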

Method 9: Use apply() with a Custom Condition

For complex custom logic, the pandas apply() method is extremely flexible:

# Count sales that meet a complex condition
def is_priority_sale(row):
    if row['Category'] == 'Electronics' and row['Amount'] > 1000:
        return True
    if row['State'] == 'California' and row['Amount'] > 700:
        return True
    return False

count = df.apply(is_priority_sale, axis=1).sum()
print(f"Priority sales: {count}")

This returns 5 for our sample data: the four electronics sales over $1,000 plus the $750 furniture sale from California.

The apply() function allows you to write custom logic that’s more complex than what can be expressed in a simple query.
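
When the custom rule is still a combination of column comparisons, it can usually be rewritten as a Boolean mask. The sketch below expresses the priority-sale rule both ways on the sample data and checks that the two counts agree. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Texas', 'New York', 'Florida', 'Illinois', 'California', 'Texas'],
    'Amount': [1200, 850, 1500, 950, 1100, 750, 2000],
    'Category': ['Electronics', 'Furniture', 'Electronics', 'Clothing',
                 'Electronics', 'Furniture', 'Electronics'],
})

def is_priority_sale(row):
    # Same rule as above, written as a single Boolean expression
    return bool(
        (row['Category'] == 'Electronics' and row['Amount'] > 1000)
        or (row['State'] == 'California' and row['Amount'] > 700)
    )

# Row-by-row version
applied_count = int(df.apply(is_priority_sale, axis=1).sum())

# Vectorized version of the same rule
mask = (
    ((df['Category'] == 'Electronics') & (df['Amount'] > 1000))
    | ((df['State'] == 'California') & (df['Amount'] > 700))
)
vectorized_count = int(mask.sum())

print(applied_count, vectorized_count)
```

The mask version is less flexible, but it avoids calling a Python function once per row.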

Performance Considerations

When working with large datasets, performance becomes important. Here is how the methods compare:

Boolean indexing with sum() is generally the fastest for simple conditions.
query() can be more readable but is often slightly slower, since the expression string has to be parsed first.
apply() with custom functions is the most flexible, but usually the slowest.

For my everyday analysis tasks, I typically use boolean indexing with sum() for simple conditions and only reach for the more complex methods when necessary.
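
Actual timings depend on the data size, the pandas version, and the machine, so it is worth measuring on your own data. Here is a minimal timing sketch using timeit. The synthetic 100,000-row DataFrame and the function names are assumptions for illustration, and apply() is used on a single column here rather than row by row.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration): 100,000 random amounts
rng = np.random.default_rng(0)
big = pd.DataFrame({'Amount': rng.integers(100, 3000, size=100_000)})

def with_sum():
    return int((big['Amount'] > 1000).sum())

def with_query():
    return len(big.query('Amount > 1000'))

def with_apply():
    return int(big['Amount'].apply(lambda amount: amount > 1000).sum())

# Time three runs of each approach; the numbers will vary between machines
for func in (with_sum, with_query, with_apply):
    seconds = timeit.timeit(func, number=3)
    print(f"{func.__name__}: {seconds:.4f} s for 3 runs")
```

All three functions return the same count, so any difference in the printed times comes from the method itself.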

I hope you found this article helpful in understanding the various ways to count rows with conditions in Pandas. Each method has its strengths, and choosing the right one depends on your specific requirements.
