Python SciPy Stats Poisson

I recently worked on a data science project focused on analyzing rare events that occur within fixed time intervals. Specifically, I aimed to model customer arrivals at a coffee shop during the morning rush hour. The challenge involved predicting how staffing needs would change based on varying customer arrival rates.

This is where the Poisson distribution in SciPy came to my rescue.

In this article, I’ll show you how to use Python’s SciPy Stats Poisson distribution for various statistical calculations and real-world applications. I will cover everything from the basics to practical examples that you can implement right away.

Table of Contents

The Poisson Distribution

The Poisson distribution models the number of events occurring in a fixed time or space interval when these events happen independently at a constant average rate.

For example, it can model:

Number of customers arriving at a store per hour
Number of calls received by a call center per minute
Number of defects in a manufacturing process

The Poisson probability mass function is:

P(X = k) = (e^(-λ) * λ^k) / k!

Where:

λ (lambda) is the average rate of occurrence
k is the number of occurrences
e is Euler’s number (approximately 2.71828)

Read Python SciPy Stats Fit

Import SciPy Stats Poisson

Let’s start by importing the necessary libraries:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

The scipy.stats.poisson module provides various functions to work with the Poisson distribution.

SciPy Stats Poisson PMF (Probability Mass Function)

The PMF gives the probability of observing exactly k events. Here’s how to use it:

# Define lambda (average rate)
lambda_val = 3  # Average of 3 events

# Calculate PMF for k=0 to 10
k_values = np.arange(0, 11)
pmf_values = stats.poisson.pmf(k=k_values, mu=lambda_val)

# Print probabilities
for k, pmf in zip(k_values, pmf_values):
    print(f"P(X = {k}) = {pmf:.4f}")

# Plot the PMF
plt.figure(figsize=(10, 6))
plt.bar(k_values, pmf_values)
plt.xlabel('Number of Events (k)')
plt.ylabel('Probability')
plt.title(f'Poisson PMF with λ = {lambda_val}')
plt.xticks(k_values)
plt.grid(alpha=0.3)
plt.show()

You can refer to the screenshot below to see the output:

This code calculates the probability of observing 0, 1, 2, …, 10 events when the average rate is 3 events per time interval.

SciPy Stats Poisson CDF (Cumulative Distribution Function)

The CDF gives the probability of observing k or fewer events:

# Calculate CDF for k=0 to 10
cdf_values = stats.poisson.cdf(k=k_values, mu=lambda_val)

# Print cumulative probabilities
for k, cdf in zip(k_values, cdf_values):
    print(f"P(X ≤ {k}) = {cdf:.4f}")

# Plot the CDF
plt.figure(figsize=(10, 6))
plt.step(k_values, cdf_values, where='post')
plt.xlabel('Number of Events (k)')
plt.ylabel('Cumulative Probability')
plt.title(f'Poisson CDF with λ = {lambda_val}')
plt.xticks(k_values)
plt.grid(alpha=0.3)
plt.show()

You can refer to the screenshot below to see the output:

Generate Random Samples with SciPy Stats Poisson RVS

The Python SciPy Stats Poisson rvs function generates random samples from the Poisson distribution:

# Generate 1000 random samples with lambda=3
samples = stats.poisson.rvs(mu=lambda_val, size=1000, random_state=42)

# Plot histogram of samples
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=np.arange(0, max(samples)+1)-0.5, density=True, alpha=0.7)

# Overlay the PMF for comparison
x = np.arange(0, max(samples)+1)
pmf = stats.poisson.pmf(x, lambda_val)
plt.step(x, pmf, where='mid', linewidth=2)

plt.xlabel('Number of Events (k)')
plt.ylabel('Frequency / Probability')
plt.title(f'Poisson Random Samples vs PMF (λ = {lambda_val}, n = 1000)')
plt.grid(alpha=0.3)
plt.legend(['Theoretical PMF', 'Random Samples'])
plt.show()

# Calculate sample statistics
print(f"Sample Mean: {np.mean(samples):.4f}")
print(f"Sample Variance: {np.var(samples):.4f}")
print(f"Theoretical Mean and Variance: {lambda_val}")

You can refer to the screenshot below to see the output:

Check out Python SciPy Butterworth Filter

Real-world Application: Customer Service Staffing

Let’s apply the Poisson distribution to a real-world scenario. Imagine we’re managing a customer service center in New York that receives an average of 15 calls per hour.

# Average call rate (lambda)
lambda_val = 15  # calls per hour

# Calculate probability of receiving more than 20 calls in an hour
prob_more_than_20 = 1 - stats.poisson.cdf(20, lambda_val)
print(f"Probability of receiving more than 20 calls in an hour: {prob_more_than_20:.4f} or {prob_more_than_20*100:.2f}%")

# Calculate minimum staff needed to handle call volume with 95% confidence
k = 0
while stats.poisson.cdf(k, lambda_val) < 0.95:
    k += 1

print(f"To handle call volume with 95% confidence, you need {k} staff members")

# Simulate hourly call volumes for a week
hours = 24 * 7  # One week
hourly_calls = stats.poisson.rvs(mu=lambda_val, size=hours, random_state=42)

# Plot the hourly call volumes
plt.figure(figsize=(12, 6))
plt.plot(hourly_calls)
plt.axhline(y=lambda_val, color='r', linestyle='--', label='Average (λ)')
plt.xlabel('Hour')
plt.ylabel('Number of Calls')
plt.title('Simulated Hourly Call Volumes for a Week')
plt.grid(alpha=0.3)
plt.legend()
plt.show()

Advanced Methods with SciPy Stats Poisson

Now, I will explain to you the advanced methods with SciPy stats poisson.

Read Working with Python, Lil_Matrix SciPy

Survival Function (SF)

The survival function gives the probability of observing more than k events:

# Calculate survival function for k=0 to 30
k_values = np.arange(0, 31)
sf_values = stats.poisson.sf(k=k_values, mu=lambda_val)

# Print probabilities
for k, sf in zip(k_values, sf_values):
    if k % 5 == 0:  # Print every 5th value to save space
        print(f"P(X > {k}) = {sf:.4f}")

Inverse Survival Function (ISF)

This function returns the value of k such that the survival function equals a given probability:

# Calculate the minimum number of staff needed to handle call volume with 99% confidence
confidence = 0.99
staff_needed = stats.poisson.isf(1 - confidence, mu=lambda_val)
print(f"To handle call volume with {confidence*100}% confidence, you need {staff_needed} staff members")

Poisson Mean and Variance

The mean and variance of a Poisson distribution are both equal to λ:

# Calculate theoretical mean and variance
mean = stats.poisson.mean(mu=lambda_val)
variance = stats.poisson.var(mu=lambda_val)
print(f"Theoretical Mean: {mean}")
print(f"Theoretical Variance: {variance}")

Compare Different Lambda Values

Let’s compare Poisson distributions with different lambda values:

# Define multiple lambda values
lambda_values = [1, 5, 10, 15]
k_values = np.arange(0, 30)

# Plot PMFs for different lambda values
plt.figure(figsize=(12, 7))

for lambda_val in lambda_values:
    pmf_values = stats.poisson.pmf(k=k_values, mu=lambda_val)
    plt.plot(k_values, pmf_values, marker='o', markersize=4, label=f'λ = {lambda_val}')

plt.xlabel('Number of Events (k)')
plt.ylabel('Probability')
plt.title('Poisson PMF for Different Lambda Values')
plt.grid(alpha=0.3)
plt.legend()
plt.show()

Practical Example: Analyze Website Traffic

Let’s model website traffic for a small business website that gets an average of 8 visitors per hour:

# Average visitors per hour
lambda_val = 8

# Probability of getting more than 15 visitors in an hour
prob_busy_hour = stats.poisson.sf(15, lambda_val)
print(f"Probability of more than 15 visitors in an hour: {prob_busy_hour:.4f} or {prob_busy_hour*100:.2f}%")

# Simulate hourly traffic for a day
hours_in_day = 24
hourly_traffic = stats.poisson.rvs(mu=lambda_val, size=hours_in_day, random_state=42)

# Identify peak hours
peak_threshold = 12
peak_hours = [i for i, traffic in enumerate(hourly_traffic) if traffic >= peak_threshold]

print(f"Hourly traffic: {hourly_traffic}")
print(f"Peak hours (traffic ≥ {peak_threshold}): {peak_hours}")

# Plot the hourly traffic
plt.figure(figsize=(12, 6))
plt.bar(range(hours_in_day), hourly_traffic)
plt.axhline(y=lambda_val, color='r', linestyle='--', label='Average (λ)')
plt.axhline(y=peak_threshold, color='g', linestyle='--', label=f'Peak Threshold ({peak_threshold})')
plt.xlabel('Hour of Day')
plt.ylabel('Number of Visitors')
plt.title('Simulated Hourly Website Traffic')
plt.xticks(range(hours_in_day))
plt.grid(alpha=0.3)
plt.legend()
plt.show()

The Poisson distribution is a powerful tool for modeling and analyzing discrete events that occur over time or space. SciPy’s implementation makes it easy to work with this distribution in Python.

From customer service staffing to website traffic analysis, the applications are vast and practical. The beauty of the Poisson distribution lies in its simplicity; with just one parameter (λ), you can model complex random processes.

I hope you found this guide helpful for understanding and using the Poisson distribution in SciPy. Next time you’re working with count data or events occurring over time, consider whether the Poisson distribution might be the right tool for your analysis.

You may like to read other SciPy-related tutorials:

Bijay Kumar

Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers. More about us.

enjoysharepoint.com/

Python SciPy Stats Poisson

The Poisson Distribution

Import SciPy Stats Poisson

SciPy Stats Poisson PMF (Probability Mass Function)

SciPy Stats Poisson CDF (Cumulative Distribution Function)

Generate Random Samples with SciPy Stats Poisson RVS

Real-world Application: Customer Service Staffing

Advanced Methods with SciPy Stats Poisson

Survival Function (SF)

Inverse Survival Function (ISF)

Poisson Mean and Variance

Compare Different Lambda Values

Practical Example: Analyze Website Traffic

51 PYTHON PROGRAMS PDF FREE

Aspiring to be a Python developer?

Let’s be friends