How to Use the Mean() Function in Python?

In this tutorial, I will explain how to calculate the mean in Python. As a data analyst, I came across a scenario where I needed the mean of a dataset for some calculations. I explored the options, and in this article I share what I found: plain Python with sum() and len(), the statistics module, NumPy, and pandas.

Mean() Function in Python

The mean, often referred to as the average, is a measure of central tendency calculated by summing all the numbers in a dataset and dividing by the count of those numbers. It provides a single value that represents the center point of the data.

Calculate the Mean with sum() and len() in Python

Python provides several ways to calculate the mean. There is no built-in mean() function in core Python, so the simplest method is to combine the built-in sum() and len() functions.

def calculate_mean(numbers):
    return sum(numbers) / len(numbers)

# Example
salaries = [55000, 48000, 75000, 62000, 90000]
mean_salary = calculate_mean(salaries)
print(f"The average salary is ${mean_salary:.2f}")

Output:

The average salary is $66000.00

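In the example above, we calculate the mean salary of a list of salaries. This method is simple and needs no imports, but it raises a ZeroDivisionError on an empty list, and for very large numeric datasets an array-based library such as NumPy is usually faster.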
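If the input might be empty, a small guard avoids the division by zero. The following is a minimal sketch; the safe_mean name and its default parameter are hypothetical choices for illustration, not part of Python itself.

```python
def safe_mean(numbers, default=None):
    """Return the arithmetic mean of numbers, or default if the input is empty."""
    values = list(numbers)  # materialize so generators also work with len()
    if not values:
        return default
    return sum(values) / len(values)


salaries = [55000, 48000, 75000, 62000, 90000]
mean_salary = safe_mean(salaries)
empty_result = safe_mean([])

print(mean_salary)   # 66000.0
print(empty_result)  # None
```

Returning a default instead of raising is a design choice; if an empty dataset indicates a bug upstream, raising an exception may be the better behavior.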

Use the Statistics Library

Python’s statistics module, which is part of the standard library, provides a mean() function that simplifies the calculation.

import statistics

# Example
salaries = [55000, 48000, 75000, 62000, 90000]
mean_salary = statistics.mean(salaries)
print(f"The average salary is ${mean_salary:.2f}")

Output:

The average salary is $66000.00

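The statistics.mean() function is more readable than a hand-written calculation and raises a clear StatisticsError on empty input. It favors accuracy over raw speed, which makes it suitable for most basic applications.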
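The statistics module also offers fmean(), available since Python 3.8, which converts the values to float and is generally faster than mean(). The sketch below compares the two and shows the error raised for empty input; the Fraction values are made-up data for illustration.

```python
import statistics
from fractions import Fraction

salaries = [55000, 48000, 75000, 62000, 90000]

# fmean() always returns a float
fast_mean = statistics.fmean(salaries)

# mean() keeps exact types where it can: Fractions in, Fraction out
exact_mean = statistics.mean([Fraction(1, 3), Fraction(2, 3)])

# Both functions raise StatisticsError when the data is empty
try:
    statistics.mean([])
    empty_error = None
except statistics.StatisticsError as exc:
    empty_error = exc

print(fast_mean)    # 66000.0
print(exact_mean)   # 1/2
print(type(empty_error).__name__)  # StatisticsError
```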

Use Python NumPy for Mean Calculation

For more advanced data analysis, especially with large datasets, the NumPy library is highly efficient. NumPy is a powerful library for numerical computations in Python.

import numpy as np

# Example
salaries = np.array([55000, 48000, 75000, 62000, 90000])
mean_salary = np.mean(salaries)
print(f"The average salary is ${mean_salary:.2f}")

Output:

The average salary is $66000.00

NumPy’s mean() function is optimized for performance and can handle large arrays of data efficiently. It’s a go-to solution for data scientists and analysts working with extensive datasets.
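np.mean() also accepts an axis argument, which is useful once the data has more than one dimension. Here is a small sketch using a hypothetical 2D array in which each row is a department and each column is an employee slot.

```python
import numpy as np

# Hypothetical data: rows are departments, columns are employees
salaries = np.array([
    [55000, 48000, 75000],
    [62000, 90000, 58000],
])

overall_mean = np.mean(salaries)          # mean of all six values
mean_per_row = np.mean(salaries, axis=1)  # one mean per department
mean_per_col = np.mean(salaries, axis=0)  # one mean per column

print(f"Overall: {overall_mean:.2f}")
print("Per department:", mean_per_row)
print("Per column:", mean_per_col)
```

With axis=1 the calculation collapses the columns and returns one value per row; axis=0 does the opposite.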

Example: Analyze Housing Prices

Let’s consider a real-world scenario where you need to analyze the average housing prices in different cities across the USA. This example will demonstrate how to use the mean function with a more complex dataset.

import numpy as np

# Example dataset: Average housing prices in thousands of dollars
housing_prices = {
    'New York': [850, 870, 900, 920, 950],
    'San Francisco': [1200, 1250, 1300, 1350, 1400],
    'Chicago': [450, 470, 480, 500, 520],
    'Austin': [300, 320, 340, 360, 380],
    'Seattle': [700, 720, 750, 780, 800]
}

# Calculate mean housing prices
mean_prices = {city: np.mean(prices) for city, prices in housing_prices.items()}

for city, mean_price in mean_prices.items():
    print(f"The average housing price in {city} is ${mean_price:.2f}k")

In this example, we use a dictionary to store housing prices for different cities. We then calculate and print the average housing price for each city using NumPy’s mean() function.
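Once the per-city means are in a dictionary, it is easy to rank the cities. The sketch below reuses the same sample data and sorts the cities from most to least expensive; the ranked variable name is just an illustrative choice.

```python
import numpy as np

# Same sample data as above, in thousands of dollars
housing_prices = {
    'New York': [850, 870, 900, 920, 950],
    'San Francisco': [1200, 1250, 1300, 1350, 1400],
    'Chicago': [450, 470, 480, 500, 520],
    'Austin': [300, 320, 340, 360, 380],
    'Seattle': [700, 720, 750, 780, 800],
}

# Convert NumPy floats to plain Python floats for cleaner downstream use
mean_prices = {city: float(np.mean(prices)) for city, prices in housing_prices.items()}

# Sort (city, mean) pairs by mean price, highest first
ranked = sorted(mean_prices.items(), key=lambda item: item[1], reverse=True)

for city, mean_price in ranked:
    print(f"{city}: ${mean_price:.2f}k")
```

Output:

```
San Francisco: $1300.00k
New York: $898.00k
Seattle: $750.00k
Chicago: $484.00k
Austin: $340.00k
```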

Handle Missing Data

In real-world datasets, you may encounter missing values. It’s important to handle these values appropriately, because np.mean() returns nan if even one element is NaN. NumPy provides np.nanmean() for exactly this case.

import numpy as np

# Example dataset with missing values (represented by np.nan)
salaries = np.array([55000, 48000, np.nan, 62000, 90000])

# Calculate mean while ignoring NaN values
mean_salary = np.nanmean(salaries)
print(f"The average salary, ignoring missing values, is ${mean_salary:.2f}")

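The np.nanmean() function skips NaN values and averages only the remaining entries. Here it returns 63750.00, the mean of the four known salaries. Keep in mind that ignoring missing values can still bias the result if the values are not missing at random.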
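To see why the NaN-aware function matters, the sketch below compares np.mean() and np.nanmean() on the same array and shows an equivalent manual approach with a boolean mask. The variable names are illustrative.

```python
import numpy as np

salaries = np.array([55000, 48000, np.nan, 62000, 90000])

plain_mean = np.mean(salaries)         # a single NaN propagates to the result
nan_aware_mean = np.nanmean(salaries)  # NaN entries are skipped

# Equivalent manual approach: keep only the non-NaN entries, then average
valid = salaries[~np.isnan(salaries)]
masked_mean = valid.mean()

# Counting the missing entries helps judge how much data was dropped
missing_count = int(np.isnan(salaries).sum())

print(plain_mean)      # nan
print(nan_aware_mean)  # 63750.0
print(masked_mean)     # 63750.0
print(missing_count)   # 1
```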

Performance Considerations

When working with large datasets, performance is crucial. NumPy is optimized for fast numerical work on arrays. For labeled, tabular data, consider pandas, which builds on NumPy and adds functionality such as column-wise and grouped calculations.

import numpy as np  # needed for np.mean below
import pandas as pd

# Example dataset
data = {
    'City': ['New York', 'San Francisco', 'Chicago', 'Austin', 'Seattle'],
    'Housing Prices': [[850, 870, 900, 920, 950], [1200, 1250, 1300, 1350, 1400], [450, 470, 480, 500, 520], [300, 320, 340, 360, 380], [700, 720, 750, 780, 800]]
}

# Create DataFrame
df = pd.DataFrame(data)

# Calculate mean housing prices
df['Mean Housing Price'] = df['Housing Prices'].apply(np.mean)
print(df[['City', 'Mean Housing Price']])

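In this example, we use pandas to create a DataFrame and calculate the mean housing price for each city. Because each cell holds a Python list, apply() calls np.mean once per row, which is fine for small tables. For large-scale analysis, pandas is most efficient when each column holds plain numeric values so that its built-in mean() methods can operate on whole columns at once.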
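An alternative layout stores one price per row and lets pandas group by city. This is a sketch under the assumption that the data can be reshaped into that "long" format; the Price column name and the records variable are hypothetical.

```python
import pandas as pd

housing_prices = {
    'New York': [850, 870, 900, 920, 950],
    'San Francisco': [1200, 1250, 1300, 1350, 1400],
    'Chicago': [450, 470, 480, 500, 520],
    'Austin': [300, 320, 340, 360, 380],
    'Seattle': [700, 720, 750, 780, 800],
}

# Build one row per (city, price) observation
records = [
    {'City': city, 'Price': price}
    for city, prices in housing_prices.items()
    for price in prices
]
df = pd.DataFrame(records)

# Group rows by city and average the numeric column;
# sort=False keeps the cities in first-seen order
mean_by_city = df.groupby('City', sort=False)['Price'].mean()
print(mean_by_city)
```

With this layout the averaging happens inside pandas rather than in a Python-level call per row, and the same pattern extends to other aggregations such as median() or count().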

Conclusion

In this tutorial, I have explained how to calculate the mean in Python. I covered the sum() and len() approach, the statistics module, and NumPy, along with a housing-price example. We also saw how to handle missing data and looked at performance considerations.