Python SciPy Stats Mode With Examples

Recently, I was working on a data analysis project where I needed to find the most frequently occurring values in my datasets. The mode is an important statistical measure that represents the most common value in a dataset, and SciPy’s stats module makes calculating it simple in Python.

In this article, I’ll cover multiple ways to calculate the mode using SciPy’s stats module, with practical examples that show you how to apply these techniques to your data analysis tasks. So let’s dive in!

Mode in Statistics

The mode is the value that appears most frequently in a dataset. Like the mean and median, it is a measure of central tendency, but instead of averaging the values or finding the midpoint, it tells us which value occurs most often.

For example, if you’re analyzing customer purchase data, the mode can tell you which product is purchased most frequently. Or if you’re analyzing survey responses, it can identify the most common answer.

A dataset can have one mode (unimodal), two modes (bimodal), or multiple modes (multimodal). It’s also possible for no value to occur more frequently than others, resulting in no mode.
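As a minimal sketch of these cases, the standard library's statistics.multimode (Python 3.8+) returns every value tied for the highest count. The helper name describe_modality and the sample lists are illustrative only. Treating "every value appears once" as "no mode" is a convention assumed here, not something the library enforces.

```python
from statistics import multimode

def describe_modality(data):
    """Label a dataset by how many values tie for the highest count."""
    modes = multimode(data)
    # Assumed convention: if every value appears exactly once, report no mode.
    if len(modes) == len(data):
        return "no mode", []
    labels = {1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "multimodal"), modes

print(describe_modality([1, 2, 2, 3]))           # one value stands out
print(describe_modality([1, 2, 2, 3, 3]))        # two values tie
print(describe_modality([1, 1, 2, 2, 3, 3, 4]))  # three values tie
print(describe_modality([1, 2, 3]))              # all values unique
```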

Use SciPy’s stats.mode Function

SciPy’s stats module provides a function called mode() that finds the most common value in your data in a single call. Let’s see how to use it:

from scipy import stats
import numpy as np

# Sample data
data = [1, 2, 3, 3, 3, 4, 5, 5, 5, 5]

# Calculate the mode (with keepdims=True for consistent shape)
result = stats.mode(data, keepdims=True)

print(f"Mode: {result.mode[0]}")
print(f"Count: {result.count[0]}")

Output:

Mode: 5
Count: 4

In this example, the value 5 appears most frequently (4 times), making it the mode of our dataset.

The mode() function returns two values:

mode: The most frequent value(s)
count: The number of times each mode appears
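Because the result behaves like a named tuple, the two fields can also be unpacked directly. The sketch below assumes SciPy 1.9 or later, where the keepdims parameter is available; the variable names are illustrative.

```python
from scipy import stats

data = [1, 2, 3, 3, 3, 4, 5, 5, 5, 5]

# Unpack the result into its two fields instead of using attribute access.
# keepdims=True keeps both fields as length-1 arrays.
mode_values, mode_counts = stats.mode(data, keepdims=True)

print(f"Mode: {mode_values[0]}, Count: {mode_counts[0]}")
```

Attribute access (result.mode, result.count) is usually easier to read; unpacking is handy when both values are needed right away.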

Handle Multiple Modes

Sometimes your data might have multiple values that occur with the same frequency. Let’s see how SciPy handles this situation:

from scipy import stats

# Data with multiple modes
data = [1, 2, 2, 3, 3, 4, 5]

# Add keepdims=True to get array output
result = stats.mode(data, keepdims=True)

print(f"Mode: {result.mode[0]}")
print(f"Count: {result.count[0]}")

Output:

Mode: 2
Count: 2

In this case, both 2 and 3 appear twice, but SciPy’s mode() function returns only one of them. In practice this is the smallest of the tied values (2), not necessarily the first one encountered in the data. If you need to find all modes, you’ll need a different approach.
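To check how ties are resolved, a quick experiment is to reorder the data so that the 3s come before the 2s. This is a sketch assuming SciPy 1.9+ (for keepdims). The list name is illustrative, and the tie-breaking shown reflects how current SciPy releases behave rather than a documented guarantee.

```python
from scipy import stats

# Same values as before, but with the 3s listed ahead of the 2s.
reordered = [3, 3, 1, 2, 2, 4, 5]

result = stats.mode(reordered, keepdims=True)

# Position in the input does not decide the tie; the smaller value is reported.
print(f"Mode: {result.mode[0]}")
print(f"Count: {result.count[0]}")
```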

Find All Modes in a Dataset

To find all modes in a dataset, we can use Counter from Python’s built-in collections module:

from collections import Counter

def find_all_modes(data):
    # Count occurrences of each value
    count = Counter(data)

    # Find the highest frequency
    max_count = max(count.values())

    # If every value appears only once, there is no mode.
    # Return an empty list so the function always returns the same type.
    if max_count == 1:
        return []

    # Return all values that appear with the highest frequency
    return [k for k, v in count.items() if v == max_count]

# Sample data
data = [1, 2, 2, 3, 3, 4, 5]

# Find all modes
all_modes = find_all_modes(data)
print(f"All modes: {all_modes}")

Output:

All modes: [2, 3]

This approach correctly identifies both 2 and 3 as modes in our dataset.
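For data that already lives in NumPy arrays, a similar result can be sketched with np.unique and its return_counts option. The function name find_all_modes_np is hypothetical. Unlike the Counter version, this sketch does not special-case data where every value is unique (it returns all values in that case), and it assumes non-empty input.

```python
import numpy as np

def find_all_modes_np(data):
    """Return every value tied for the highest frequency, in sorted order."""
    # np.unique returns the sorted distinct values and how often each occurs.
    values, counts = np.unique(np.asarray(data), return_counts=True)
    # Keep the values whose count equals the maximum count.
    return values[counts == counts.max()].tolist()

print(find_all_modes_np([1, 2, 2, 3, 3, 4, 5]))
```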

Use Mode on Multidimensional Arrays

SciPy’s mode() function also works with multidimensional arrays, which is useful for more complex datasets:

from scipy import stats
import numpy as np

# 2D array example - survey responses from different departments
data = np.array([
    [5, 4, 5, 5, 3],  # Department 1 responses
    [2, 2, 3, 3, 2],  # Department 2 responses
    [4, 4, 4, 5, 5]   # Department 3 responses
])

# Calculate mode along axis 1 (one result per row)
# keepdims=False (SciPy 1.9+) returns flat 1-D arrays
result = stats.mode(data, axis=1, keepdims=False)
print("Mode for each department:")
print(f"Mode values: {result.mode}")
print(f"Mode counts: {result.count}")

# Calculate mode along axis 0 (one result per column)
result = stats.mode(data, axis=0, keepdims=False)
print("\nMode for each question across departments:")
print(f"Mode values: {result.mode}")
print(f"Mode counts: {result.count}")

Output:

Mode for each department:
Mode values: [5 2 4]
Mode counts: [3 3 3]

Mode for each question across departments:
Mode values: [2 4 3 5 2]
Mode counts: [1 2 1 2 1]

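When we calculate the mode along axis 1, we get the most common response within each department. When we calculate along axis 0, we get the most common response for each question across all departments. A count of 1 means every department gave a different answer to that question, so the reported value is just one of the tied answers rather than a meaningful mode.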
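To get a single mode for the whole table rather than one per row or column, the axis can be set to None, which flattens the array before counting. A minimal sketch, assuming SciPy 1.9+ for the keepdims parameter:

```python
import numpy as np
from scipy import stats

data = np.array([
    [5, 4, 5, 5, 3],  # Department 1 responses
    [2, 2, 3, 3, 2],  # Department 2 responses
    [4, 4, 4, 5, 5]   # Department 3 responses
])

# axis=None treats all 15 responses as one flat dataset.
overall = stats.mode(data, axis=None, keepdims=False)

print(f"Overall mode: {int(overall.mode)}")
print(f"Overall count: {int(overall.count)}")
```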

Mode vs. Mean and Median

Let’s compare the mode with other measures of central tendency:

import numpy as np
from scipy import stats

# Sample sales data (in thousands of dollars)
sales_data = [10, 12, 15, 15, 15, 18, 20, 25, 30, 150]

# Calculate mode (keepdims=True so the result can be indexed with [0])
mode_result = stats.mode(sales_data, keepdims=True)
mode_value = mode_result.mode[0]

# Calculate mean
mean_value = np.mean(sales_data)

# Calculate median
median_value = np.median(sales_data)

print(f"Mode: ${mode_value}k")
print(f"Mean: ${mean_value:.1f}k")
print(f"Median: ${median_value:.1f}k")

Output:

Mode: $15k
Mean: $31.0k
Median: $16.5k

In this example of sales data with an outlier ($150k), the mode ($15k) and median ($16.5k) provide more representative measures of typical sales than the mean ($31.0k), which is heavily influenced by the outlier.
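One way to see the outlier's influence is to recompute all three measures with the $150k entry removed. This is a sketch under the same assumptions as the example above (SciPy 1.9+); the helper name summarize is illustrative.

```python
import numpy as np
from scipy import stats

sales_data = [10, 12, 15, 15, 15, 18, 20, 25, 30, 150]
without_outlier = sales_data[:-1]  # drop the final 150 entry

def summarize(values):
    """Return (mode, mean, median) for a list of numbers."""
    mode_value = stats.mode(values, keepdims=True).mode[0]
    return int(mode_value), float(np.mean(values)), float(np.median(values))

for label, values in [("With outlier", sales_data),
                      ("Without outlier", without_outlier)]:
    mode_value, mean_value, median_value = summarize(values)
    print(f"{label}: mode={mode_value}, mean={mean_value:.1f}, median={median_value:.1f}")
```

The mode stays at 15 and the median moves from 16.5 to 15.0, while the mean drops from 31.0 to about 17.8.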

Real-World Example: Analyze Survey Data

Let’s apply the mode to a real-world scenario: analyzing responses to a customer satisfaction survey.

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Customer satisfaction ratings (1-5 scale)
# Data for different products
product_A = [5, 4, 5, 3, 5, 4, 5, 5, 2, 5, 4, 5, 3, 5, 5]
product_B = [3, 4, 3, 3, 2, 4, 3, 3, 4, 2, 3, 3, 4, 3, 3]
product_C = [2, 2, 1, 3, 2, 2, 1, 2, 3, 2, 1, 2, 2, 1, 2]

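# Calculate modes (keepdims=True so each result can be indexed with [0])
mode_A = stats.mode(product_A, keepdims=True).mode[0]
mode_B = stats.mode(product_B, keepdims=True).mode[0]
mode_C = stats.mode(product_C, keepdims=True).mode[0]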

# Calculate means for comparison
mean_A = np.mean(product_A)
mean_B = np.mean(product_B)
mean_C = np.mean(product_C)

# Create bar chart
products = ['Product A', 'Product B', 'Product C']
modes = [mode_A, mode_B, mode_C]
means = [mean_A, mean_B, mean_C]

x = np.arange(len(products))
width = 0.35

fig, ax = plt.subplots(figsize=(10, 6))
ax.bar(x - width/2, modes, width, label='Mode')
ax.bar(x + width/2, means, width, label='Mean')

ax.set_ylabel('Rating')
ax.set_title('Customer Satisfaction by Product')
ax.set_xticks(x)
ax.set_xticklabels(products)
ax.legend()

plt.ylim(0, 5.5)
for i, v in enumerate(modes):
    ax.text(i - width/2, v + 0.1, str(v), ha='center')

for i, v in enumerate(means):
    ax.text(i + width/2, v + 0.1, f"{v:.1f}", ha='center')

plt.tight_layout()
plt.show()

This example would create a bar chart comparing the mode and mean satisfaction ratings for three different products. The mode quickly shows us which rating appears most frequently for each product, providing insight into the typical customer experience.
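The mode alone does not say how dominant the most common rating is. As a rough sketch using only the standard library, the share of responses that match the mode can be computed alongside it. The helper name mode_share and the ratings dictionary are illustrative; the data is the same as above.

```python
from collections import Counter

ratings = {
    "Product A": [5, 4, 5, 3, 5, 4, 5, 5, 2, 5, 4, 5, 3, 5, 5],
    "Product B": [3, 4, 3, 3, 2, 4, 3, 3, 4, 2, 3, 3, 4, 3, 3],
    "Product C": [2, 2, 1, 3, 2, 2, 1, 2, 3, 2, 1, 2, 2, 1, 2],
}

def mode_share(values):
    """Return the most common value and the fraction of entries equal to it."""
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)

for name, values in ratings.items():
    value, share = mode_share(values)
    print(f"{name}: mode={value}, share of responses={share:.0%}")
```

In this dataset each product's modal rating accounts for 9 of its 15 responses.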

When to Use Mode in Your Analysis

The mode is particularly useful in these scenarios:

When analyzing categorical data, where the mean and median don’t make sense
When identifying the most common behaviors or preferences
When dealing with skewed distributions where the mean might be misleading
In multimodal distributions, to identify multiple peaks or common values
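For the categorical case, a minimal sketch with the standard library's statistics.mode is shown here. It is used instead of SciPy because recent SciPy releases expect numeric input for stats.mode. The channel data is hypothetical.

```python
from statistics import mode

# Hypothetical categorical data: preferred contact channel per customer
channels = ["email", "phone", "email", "chat", "email", "chat", "phone"]

# A mean or median of these labels is undefined, but the mode is not.
top_channel = mode(channels)
print(f"Most common channel: {top_channel}")
```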

I hope you found this article helpful for understanding and implementing mode calculations in Python using SciPy. The mode might not get as much attention as the mean and median, but it’s a powerful statistical measure that can provide unique insights into your data, especially when dealing with categorical variables or skewed distributions.
