As a Python developer with over a decade of experience, I’ve seen firsthand how crucial evaluation metrics are when building machine learning models. One of the most fundamental tools to assess classification models is the confusion matrix.
In this article, I’ll walk you through what a confusion matrix is, how to generate one using Scikit-Learn, and different ways to visualize and interpret it. I’ll also share practical tips and examples that I’ve used in real-world projects so that you can apply them directly to your work.
Let’s get started!
What Is a Confusion Matrix?
A confusion matrix is a table that helps you visualize the performance of a classification algorithm. It compares the actual labels with the predicted labels and breaks down the results into four categories:
- True Positives (TP): Correctly predicted positive cases
- True Negatives (TN): Correctly predicted negative cases
- False Positives (FP): Incorrectly predicted positive cases (Type I error)
- False Negatives (FN): Incorrectly predicted negative cases (Type II error)
This breakdown helps you understand not just how many predictions were correct, but also the kinds of errors your model is making.
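To make the four categories concrete, here is a minimal sketch that counts them directly from a pair of label lists (the labels here are illustrative, with 1 as the positive class):

```python
# Illustrative labels: 1 = positive, 0 = negative
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]

# Count each of the four outcomes by comparing actual vs. predicted
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I error
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II error

print(tp, tn, fp, fn)  # 4 4 1 1
```

A confusion matrix is just these four counts arranged in a table, which is exactly what Scikit-Learn computes for you below.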
How to Create a Confusion Matrix in Scikit-Learn
Scikit-Learn makes it easy to create a confusion matrix. Here’s the method I use most often:
Method 1: Use the confusion_matrix Function
Scikit-Learn’s confusion_matrix function evaluates classification performance by counting the true positives, false positives, true negatives, and false negatives.
```python
from sklearn.metrics import confusion_matrix
import numpy as np

# Sample true and predicted labels (for example, fraud detection)
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1, 0, 1])  # Actual labels
y_pred = np.array([0, 1, 1, 1, 0, 0, 0, 1, 0, 1])  # Predicted labels

cm = confusion_matrix(y_true, y_pred)
print(cm)
```

This will output a 2×2 matrix:

```
[[4 1]
 [1 4]]
```
Here, the rows represent actual classes, and the columns represent predicted classes. The first row corresponds to actual class 0 (non-fraud), and the second row corresponds to actual class 1 (fraud).
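For a binary problem, you can unpack the four counts straight from the matrix with ravel(), which flattens it in row order (TN, FP, FN, TP):

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
# Row-major flattening of a 2x2 matrix gives TN, FP, FN, TP in that order
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)  # 4 1 1 4
```

Having the four counts as plain numbers is handy when you want to compute custom metrics or cost-weighted error rates.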
Visualize the Confusion Matrix
Looking at raw numbers is helpful, but visualizing the confusion matrix makes it easier to interpret.
Method 2: Use ConfusionMatrixDisplay in Scikit-Learn
Since Scikit-Learn version 0.22, there’s a handy visualization function:
```python
from sklearn.metrics import ConfusionMatrixDisplay
import matplotlib.pyplot as plt

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=['Non-Fraud', 'Fraud'])
disp.plot(cmap=plt.cm.Blues)
plt.show()
```
This will create a neat heatmap-style confusion matrix, with labels and color intensity representing counts.
Normalize the Confusion Matrix
Sometimes, absolute numbers don’t tell the full story, especially if your classes are imbalanced. Normalizing the confusion matrix by row (actual class) shows the proportion of correct and incorrect predictions per class.
You can do this easily:
Note that `normalize` is a parameter of `confusion_matrix` itself, not of `disp.plot()`:

```python
# Normalization happens in confusion_matrix, not in disp.plot()
cm_normalized = confusion_matrix(y_true, y_pred, normalize='true')

disp = ConfusionMatrixDisplay(confusion_matrix=cm_normalized, display_labels=['Non-Fraud', 'Fraud'])
disp.plot(cmap=plt.cm.Greens, values_format='.2f')
plt.show()
```

This shows proportions instead of raw counts, which can be more insightful for imbalanced datasets.
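Under the hood, row normalization simply divides each row of the count matrix by its row total. A quick NumPy sketch (reusing the fraud labels from earlier) shows that this matches what `normalize='true'` computes:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
# Divide each row (actual class) by its total to get per-class proportions
cm_normalized = cm / cm.sum(axis=1, keepdims=True)
print(cm_normalized)  # [[0.8 0.2]
                      #  [0.2 0.8]]
```

Each row now sums to 1, so the diagonal entries read directly as per-class recall.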
Interpret the Confusion Matrix
Once you have the confusion matrix, you can derive several key metrics:
- Accuracy: (TP + TN) / Total
- Precision: TP / (TP + FP) – How many predicted positives are actually positive?
- Recall (Sensitivity): TP / (TP + FN) – How many actual positives did we catch?
- F1 Score: Harmonic mean of precision and recall
I often calculate these metrics alongside the confusion matrix to get a full picture of model performance.
```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1 Score:", f1_score(y_true, y_pred))
```

Output:

```
Accuracy: 0.8
Precision: 0.8
Recall: 0.8
F1 Score: 0.8
```
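Scikit-Learn can also report all of these per-class metrics in one call with classification_report, which is often quicker than computing each score separately:

```python
from sklearn.metrics import classification_report

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]

# Precision, recall, F1, and support for each class in one table
print(classification_report(y_true, y_pred, target_names=['Non-Fraud', 'Fraud']))
```

The report lists precision, recall, F1 score, and support for each class, plus overall accuracy and macro/weighted averages, which pairs naturally with the confusion matrix when summarizing a model.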

Confusion Matrix for Multiclass Classification
In real-world projects, you often deal with more than two classes. For example, predicting customer churn levels: Low, Medium, High.
Scikit-Learn’s confusion matrix works seamlessly with multiclass problems:
```python
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 2, 2, 0, 1, 0, 2]
y_pred = [0, 2, 2, 2, 0, 0, 0, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)
```

Visualizing it works the same way as for binary classification, just with more rows and columns.
Bonus: Create a Custom Confusion Matrix Heatmap with Seaborn
For more customization, I like using Seaborn’s heatmap:
```python
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Reuse the binary fraud-detection labels from earlier
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]
cm = confusion_matrix(y_true, y_pred)

ax = sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                 xticklabels=['Non-Fraud', 'Fraud'],
                 yticklabels=['Non-Fraud', 'Fraud'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
```

This allows you to control the style and add more context, which is useful when presenting results to stakeholders.
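The same Seaborn approach works for a normalized matrix; one small sketch, assuming you want percentage annotations, passes `normalize='true'` to confusion_matrix and a percent format string to the heatmap:

```python
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]

# Row-normalized matrix: each cell is a fraction of its actual class
cm = confusion_matrix(y_true, y_pred, normalize='true')

# fmt='.0%' renders each cell as a whole-number percentage
ax = sns.heatmap(cm, annot=True, fmt='.0%', cmap='Blues',
                 xticklabels=['Non-Fraud', 'Fraud'],
                 yticklabels=['Non-Fraud', 'Fraud'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
```

Percentages tend to land better with non-technical stakeholders than raw counts, especially when the two classes have very different sizes.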
The confusion matrix is one of the most valuable tools in a data scientist’s toolkit. It goes beyond simple accuracy and helps you understand where your model is making mistakes. Using Scikit-Learn’s built-in functions, you can quickly generate and visualize confusion matrices for both binary and multiclass problems.
In my experience working with US-based clients across finance, healthcare, and marketing, presenting confusion matrices alongside precision and recall metrics has helped build confidence in machine learning models. It also guides you in tuning your models for better performance.
If you’re working on classification problems, I highly recommend incorporating confusion matrix analysis into your workflow. It’s simple, effective, and essential for delivering reliable results.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last five years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn, working with clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and beyond. Check out my profile.