I recently worked on a data science project focused on analyzing rare events that occur within fixed time intervals. Specifically, I aimed to model customer arrivals at a coffee shop during the morning rush hour. The challenge involved predicting how staffing needs would change based on varying customer arrival rates.
This is where the Poisson distribution in SciPy came to my rescue.
In this article, I’ll show you how to use Python’s SciPy Stats Poisson distribution for various statistical calculations and real-world applications. I will cover everything from the basics to practical examples that you can implement right away.
The Poisson Distribution
The Poisson distribution models the number of events occurring in a fixed time or space interval when these events happen independently at a constant average rate.
For example, it can model:
- Number of customers arriving at a store per hour
- Number of calls received by a call center per minute
- Number of defects in a manufacturing process
The Poisson probability mass function is:
P(X = k) = (e^(-λ) * λ^k) / k!Where:
- λ (lambda) is the average rate of occurrence
- k is the number of occurrences
- e is Euler’s number (approximately 2.71828)
Import SciPy Stats Poisson
Let’s start by importing the necessary libraries:
import numpy as np
from scipy import stats
import matplotlib.pyplot as pltThe scipy.stats.poisson module provides various functions to work with the Poisson distribution.
SciPy Stats Poisson PMF (Probability Mass Function)
The PMF gives the probability of observing exactly k events. Here’s how to use it:
# Define lambda (average rate)
lambda_val = 3 # Average of 3 events
# Calculate PMF for k=0 to 10
k_values = np.arange(0, 11)
pmf_values = stats.poisson.pmf(k=k_values, mu=lambda_val)
# Print probabilities
for k, pmf in zip(k_values, pmf_values):
print(f"P(X = {k}) = {pmf:.4f}")
# Plot the PMF
plt.figure(figsize=(10, 6))
plt.bar(k_values, pmf_values)
plt.xlabel('Number of Events (k)')
plt.ylabel('Probability')
plt.title(f'Poisson PMF with λ = {lambda_val}')
plt.xticks(k_values)
plt.grid(alpha=0.3)
plt.show()You can refer to the screenshot below to see the output:

This code calculates the probability of observing 0, 1, 2, …, 10 events when the average rate is 3 events per time interval.
SciPy Stats Poisson CDF (Cumulative Distribution Function)
The CDF gives the probability of observing k or fewer events:
# Calculate CDF for k=0 to 10
cdf_values = stats.poisson.cdf(k=k_values, mu=lambda_val)
# Print cumulative probabilities
for k, cdf in zip(k_values, cdf_values):
print(f"P(X ≤ {k}) = {cdf:.4f}")
# Plot the CDF
plt.figure(figsize=(10, 6))
plt.step(k_values, cdf_values, where='post')
plt.xlabel('Number of Events (k)')
plt.ylabel('Cumulative Probability')
plt.title(f'Poisson CDF with λ = {lambda_val}')
plt.xticks(k_values)
plt.grid(alpha=0.3)
plt.show()You can refer to the screenshot below to see the output:

Generate Random Samples with SciPy Stats Poisson RVS
The Python SciPy Stats Poisson rvs function generates random samples from the Poisson distribution:
# Generate 1000 random samples with lambda=3
samples = stats.poisson.rvs(mu=lambda_val, size=1000, random_state=42)
# Plot histogram of samples
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=np.arange(0, max(samples)+1)-0.5, density=True, alpha=0.7)
# Overlay the PMF for comparison
x = np.arange(0, max(samples)+1)
pmf = stats.poisson.pmf(x, lambda_val)
plt.step(x, pmf, where='mid', linewidth=2)
plt.xlabel('Number of Events (k)')
plt.ylabel('Frequency / Probability')
plt.title(f'Poisson Random Samples vs PMF (λ = {lambda_val}, n = 1000)')
plt.grid(alpha=0.3)
plt.legend(['Theoretical PMF', 'Random Samples'])
plt.show()
# Calculate sample statistics
print(f"Sample Mean: {np.mean(samples):.4f}")
print(f"Sample Variance: {np.var(samples):.4f}")
print(f"Theoretical Mean and Variance: {lambda_val}")You can refer to the screenshot below to see the output:

Check out Python SciPy Butterworth Filter
Real-world Application: Customer Service Staffing
Let’s apply the Poisson distribution to a real-world scenario. Imagine we’re managing a customer service center in New York that receives an average of 15 calls per hour.
# Average call rate (lambda)
lambda_val = 15 # calls per hour
# Calculate probability of receiving more than 20 calls in an hour
prob_more_than_20 = 1 - stats.poisson.cdf(20, lambda_val)
print(f"Probability of receiving more than 20 calls in an hour: {prob_more_than_20:.4f} or {prob_more_than_20*100:.2f}%")
# Calculate minimum staff needed to handle call volume with 95% confidence
k = 0
while stats.poisson.cdf(k, lambda_val) < 0.95:
k += 1
print(f"To handle call volume with 95% confidence, you need {k} staff members")
# Simulate hourly call volumes for a week
hours = 24 * 7 # One week
hourly_calls = stats.poisson.rvs(mu=lambda_val, size=hours, random_state=42)
# Plot the hourly call volumes
plt.figure(figsize=(12, 6))
plt.plot(hourly_calls)
plt.axhline(y=lambda_val, color='r', linestyle='--', label='Average (λ)')
plt.xlabel('Hour')
plt.ylabel('Number of Calls')
plt.title('Simulated Hourly Call Volumes for a Week')
plt.grid(alpha=0.3)
plt.legend()
plt.show()Advanced Methods with SciPy Stats Poisson
Now, I will explain to you the advanced methods with SciPy stats poisson.
Read Working with Python, Lil_Matrix SciPy
Survival Function (SF)
The survival function gives the probability of observing more than k events:
# Calculate survival function for k=0 to 30
k_values = np.arange(0, 31)
sf_values = stats.poisson.sf(k=k_values, mu=lambda_val)
# Print probabilities
for k, sf in zip(k_values, sf_values):
if k % 5 == 0: # Print every 5th value to save space
print(f"P(X > {k}) = {sf:.4f}")Inverse Survival Function (ISF)
This function returns the value of k such that the survival function equals a given probability:
# Calculate the minimum number of staff needed to handle call volume with 99% confidence
confidence = 0.99
staff_needed = stats.poisson.isf(1 - confidence, mu=lambda_val)
print(f"To handle call volume with {confidence*100}% confidence, you need {staff_needed} staff members")Poisson Mean and Variance
The mean and variance of a Poisson distribution are both equal to λ:
# Calculate theoretical mean and variance
mean = stats.poisson.mean(mu=lambda_val)
variance = stats.poisson.var(mu=lambda_val)
print(f"Theoretical Mean: {mean}")
print(f"Theoretical Variance: {variance}")Compare Different Lambda Values
Let’s compare Poisson distributions with different lambda values:
# Define multiple lambda values
lambda_values = [1, 5, 10, 15]
k_values = np.arange(0, 30)
# Plot PMFs for different lambda values
plt.figure(figsize=(12, 7))
for lambda_val in lambda_values:
pmf_values = stats.poisson.pmf(k=k_values, mu=lambda_val)
plt.plot(k_values, pmf_values, marker='o', markersize=4, label=f'λ = {lambda_val}')
plt.xlabel('Number of Events (k)')
plt.ylabel('Probability')
plt.title('Poisson PMF for Different Lambda Values')
plt.grid(alpha=0.3)
plt.legend()
plt.show()Practical Example: Analyze Website Traffic
Let’s model website traffic for a small business website that gets an average of 8 visitors per hour:
# Average visitors per hour
lambda_val = 8
# Probability of getting more than 15 visitors in an hour
prob_busy_hour = stats.poisson.sf(15, lambda_val)
print(f"Probability of more than 15 visitors in an hour: {prob_busy_hour:.4f} or {prob_busy_hour*100:.2f}%")
# Simulate hourly traffic for a day
hours_in_day = 24
hourly_traffic = stats.poisson.rvs(mu=lambda_val, size=hours_in_day, random_state=42)
# Identify peak hours
peak_threshold = 12
peak_hours = [i for i, traffic in enumerate(hourly_traffic) if traffic >= peak_threshold]
print(f"Hourly traffic: {hourly_traffic}")
print(f"Peak hours (traffic ≥ {peak_threshold}): {peak_hours}")
# Plot the hourly traffic
plt.figure(figsize=(12, 6))
plt.bar(range(hours_in_day), hourly_traffic)
plt.axhline(y=lambda_val, color='r', linestyle='--', label='Average (λ)')
plt.axhline(y=peak_threshold, color='g', linestyle='--', label=f'Peak Threshold ({peak_threshold})')
plt.xlabel('Hour of Day')
plt.ylabel('Number of Visitors')
plt.title('Simulated Hourly Website Traffic')
plt.xticks(range(hours_in_day))
plt.grid(alpha=0.3)
plt.legend()
plt.show()The Poisson distribution is a powerful tool for modeling and analyzing discrete events that occur over time or space. SciPy’s implementation makes it easy to work with this distribution in Python.
From customer service staffing to website traffic analysis, the applications are vast and practical. The beauty of the Poisson distribution lies in its simplicity; with just one parameter (λ), you can model complex random processes.
I hope you found this guide helpful for understanding and using the Poisson distribution in SciPy. Next time you’re working with count data or events occurring over time, consider whether the Poisson distribution might be the right tool for your analysis.
You may like to read other SciPy-related tutorials:

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I started working on Python, Machine learning, and artificial intelligence for the last 5 years. During this time I got expertise in various Python libraries also like Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, Scikit-Learn, etc… for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc. Check out my profile.