Plot a Best Fit Line in Matplotlib

I’ve found that visualizing data effectively is just as important as analyzing it. One of the most common tasks I encounter is plotting a best-fit line to understand trends and relationships within data.

Matplotlib, Python’s go-to plotting library, provides easy ways to add a best-fit line to your scatter plots. In this article, I’ll walk you through different methods to plot a best-fit line using Matplotlib. These techniques will help you present your data clearly and professionally.

Let’s get started!

What is a Best Fit Line?

A best-fit line, also known as a trend line or regression line, is a straight line that best represents the relationship between two variables. With the usual least-squares approach, it is the line that minimizes the sum of squared vertical distances between the data points and the line. Its slope helps you see whether there’s a positive, negative, or no linear correlation between the variables.

For example, if you’re analyzing monthly sales data across different U.S. regions, a best-fit line can show if sales are generally increasing or decreasing over time.
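
To make the idea concrete, here is a minimal sketch of how a least-squares line can be computed by hand with NumPy. It assumes ordinary least squares (the most common meaning of “best fit”), and the x and y values are made up purely for illustration.

```python
import numpy as np

# Illustrative data, invented for this sketch
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form ordinary least squares for y = slope * x + intercept
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Residuals are the vertical distances between each point and the line;
# the fit minimizes the sum of their squares (SSE)
residuals = y - (slope * x + intercept)
sse = np.sum(residuals ** 2)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, SSE={sse:.4f}")
```

The library functions in the methods below should give the same slope and intercept for a straight-line fit, so you rarely need to write this out yourself.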

Method 1: Use NumPy’s polyfit Function with Matplotlib

The quickest way I know to plot a best-fit line is NumPy’s polyfit function, which fits a polynomial of a given degree to your data by least squares. With degree 1, that polynomial is a straight line.

Steps

  1. Import the necessary libraries:
import numpy as np
import matplotlib.pyplot as plt
  2. Create your data (for instance, monthly sales figures):
months = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
sales = np.array([200, 220, 250, 270, 300, 310, 330, 360, 390, 420, 450, 480])
  3. Plot your scatter plot:
plt.scatter(months, sales, color='blue', label='Sales Data')
  4. Calculate the best-fit line coefficients:
coefficients = np.polyfit(months, sales, 1)  # 1 means linear
slope, intercept = coefficients
  5. Generate the y-values for the best-fit line:
best_fit_line = slope * months + intercept
  6. Plot the best-fit line:
plt.plot(months, best_fit_line, color='red', label='Best Fit Line')
  7. Add labels and legend, then show the plot:
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Monthly Sales with Best Fit Line')
plt.legend()
plt.show()

This method is fast and effective for linear trends. I often use it for quick exploratory data analysis during projects.
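
Once you have the coefficients, you can also evaluate the line at x-values that are not in your data. The sketch below uses np.polyval to project the same sales figures three months ahead. The future_months array is a hypothetical addition of mine, and the projection only makes sense if you assume the linear trend continues.

```python
import numpy as np

months = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
sales = np.array([200, 220, 250, 270, 300, 310, 330, 360, 390, 420, 450, 480])

# Fit a degree-1 polynomial; coefficients are ordered [slope, intercept]
coefficients = np.polyfit(months, sales, 1)

# Hypothetical future x-values (an assumption for this sketch)
future_months = np.array([13, 14, 15])

# np.polyval evaluates the fitted polynomial at any x
projected = np.polyval(coefficients, future_months)

for month, value in zip(future_months, projected):
    print(f"Month {month}: projected sales of about {value:.0f}")
```

I treat numbers like these as a rough guide rather than a forecast, since nothing in the fit guarantees the trend holds outside the observed range.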

Method 2: Use SciPy’s linregress for Statistical Details

If you want more statistical insights like the correlation coefficient, p-value, or standard error, SciPy’s linregress function is a great option.

  1. Import SciPy’s stats module along with Matplotlib and NumPy:
from scipy import stats
import matplotlib.pyplot as plt
import numpy as np
  2. Prepare your data (reuse the months and sales arrays from Method 1).
  3. Perform linear regression:
slope, intercept, r_value, p_value, std_err = stats.linregress(months, sales)
  4. Calculate the best-fit line:
best_fit_line = slope * months + intercept
  5. Plot data and line:
plt.scatter(months, sales, color='green', label='Sales Data')
plt.plot(months, best_fit_line, color='orange', label='Best Fit Line')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Monthly Sales with Regression Line')
plt.legend()
plt.show()
  6. Optionally, print the regression statistics:
print(f"R-squared: {r_value**2:.3f}")
print(f"P-value: {p_value:.4f}")

This method is my go-to when I need to validate the strength and significance of the trend before making business decisions.
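
If you want the statistics on the chart itself, one option is to put the fitted equation and R-squared into the legend label. The sketch below is one way to do that, not the only one. It reads the values from the result object that linregress returns (slope, intercept, rvalue) instead of unpacking a tuple, and the label format is my own choice.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

months = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
sales = np.array([200, 220, 250, 270, 300, 310, 330, 360, 390, 420, 450, 480])

# linregress returns a result object with named attributes
result = stats.linregress(months, sales)
r_squared = result.rvalue ** 2

# Legend label showing the fitted equation and R-squared (format is arbitrary)
line_label = (
    f"y = {result.slope:.1f}x + {result.intercept:.1f} "
    f"(R² = {r_squared:.3f})"
)

fig, ax = plt.subplots()
ax.scatter(months, sales, color='green', label='Sales Data')
ax.plot(months, result.slope * months + result.intercept,
        color='orange', label=line_label)
ax.set_xlabel('Month')
ax.set_ylabel('Sales')
ax.set_title('Monthly Sales with Regression Line')
ax.legend()
```

Call plt.show() (or fig.savefig with a filename of your choice) afterwards to render the figure.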

Method 3: Use Seaborn’s regplot for Quick Visualization

If you prefer a higher-level library built on Matplotlib, Seaborn’s regplot can plot scatter points and the regression line in one call.

Steps:

  1. Install Seaborn if you haven’t already:
pip install seaborn
  2. Import the libraries and prepare the data (reusing the months and sales arrays from Method 1):
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

data = pd.DataFrame({
    'Month': months,
    'Sales': sales
})
  3. Plot with regression line:
sns.regplot(x='Month', y='Sales', data=data)
plt.title('Sales Trend with Best Fit Line')
plt.show()

Seaborn handles the fitting internally and, by default, draws a shaded 95% confidence band around the regression line. I find this method handy when creating polished visualizations for reports or presentations.
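
If the defaults don’t suit your report, regplot accepts a few styling options. The sketch below shows the ones I would reach for first: ci=None to hide the confidence band, and scatter_kws and line_kws to style the points and the line. The specific colors and sizes are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.DataFrame({
    'Month': np.arange(1, 13),
    'Sales': [200, 220, 250, 270, 300, 310, 330, 360, 390, 420, 450, 480],
})

# regplot returns the Matplotlib Axes it drew on
ax = sns.regplot(
    x='Month', y='Sales', data=data,
    ci=None,                                    # hide the confidence band
    scatter_kws={'color': 'blue', 's': 40},     # passed to the scatter points
    line_kws={'color': 'red', 'linewidth': 2},  # passed to the regression line
)
ax.set_title('Sales Trend with Best Fit Line')
```

Because regplot returns a regular Axes object, any further Matplotlib customization works as usual. Finish with plt.show() to display the figure.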

Tips for Better Best Fit Line Visualizations

  • Label your axes clearly: It helps stakeholders understand the data context.
  • Choose colors wisely: Make sure your best-fit line stands out but doesn’t overpower the scatter points.
  • Check assumptions: Linear regression assumes a linear relationship; if your data is nonlinear, consider polynomial fits.
  • Add statistical info: Displaying R-squared or p-values can make your analysis more credible.
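
As a quick illustration of the “check assumptions” tip, the sketch below fits both a line and a quadratic to curved data and compares their squared errors. The data is synthetic and noise-free (y = 0.5x² + 3), chosen only to make the difference obvious. The sse helper is a name I made up for this example.

```python
import numpy as np

# Synthetic curved data, invented for this sketch: y = 0.5 * x**2 + 3
x = np.arange(1, 11, dtype=float)
y = 0.5 * x**2 + 3.0

# Same polyfit call as before; only the degree changes
linear = np.polyfit(x, y, 1)
quadratic = np.polyfit(x, y, 2)

def sse(coeffs):
    """Sum of squared residuals for a fitted polynomial (hypothetical helper)."""
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

print(f"Linear SSE:    {sse(linear):.2f}")
print(f"Quadratic SSE: {sse(quadratic):.2f}")
```

A much lower error for the higher-degree fit suggests a straight line is the wrong model. On real, noisy data I would be careful not to keep raising the degree, since that can lead to overfitting.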

Plotting a best-fit line in Matplotlib is an essential skill for any Python developer working with data. Whether you want a quick visual trend line or detailed statistical insights, the methods I shared will cover your needs.

I start with NumPy polyfit for quick checks, then move to SciPy’s linregress when I need to dig deeper. For elegant visualizations, Seaborn regplot is my favorite.
