How to Create Plots using Pandas crosstab() in Python

This Python tutorial explains how to create plots using Pandas crosstab() in Python. We will explore the crosstab() function in Pandas and show how to use it to create informative plots that help you better understand your data.

crosstab() is a simple yet effective way to summarize and analyze data by displaying the relationship between categorical variables in tabular form. A picture is worth a thousand words: plotting the result of Pandas crosstab() gives insight into the relationships between two or more categorical variables in a dataset and helps stakeholders see those relationships through graphs and plots.

In this Python tutorial, we will cover the following topics:

  • Purpose of creating Visualization plots using pandas crosstab()
  • Loading dataset to create plots using pandas crosstab()
  • Barplot using pandas crosstab()
  • Stacked barplot using pandas crosstab()
  • Piecharts using pandas crosstab()
  • Boxplots using pandas crosstab()
  • Line chart using pandas crosstab()
  • Area chart using pandas crosstab()

Purpose of creating Visualization plots using crosstab()

  • Visualization plots can help you understand the patterns and relationships in the data better by providing a visual summary of the cross-table we have generated.
  • Creating plots using crosstab() in pandas can help us identify interesting and significant patterns or trends in the data.
  • Readers do not need a strong technical background to understand the relationship between the variables when it is shown as a plot.

Loading dataset to create plots using pandas crosstab()

Let’s consider a real-life example where pandas crosstab() can be used to analyze data. We can download the Titanic dataset from

https://www.kaggle.com/code/fourbic/visualizing-the-titanic-data-with-seaborn

or we can load the same dataset directly from seaborn to understand how the crosstab() function works in pandas.

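#Import the necessary libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

#Option 1: load the dataset after downloading it manually from kaggle
#(a downloaded CSV may use different column names than the seaborn version;
#the rest of this tutorial uses the seaborn columns 'class' and 'alive')
#titanic_data=pd.read_csv("titanic.csv")
#titanic_data.head()

#Option 2: load the dataset using the seaborn library without downloading manually
titanic_data=sns.load_dataset("titanic")
titanic_data.head()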

Below is the Titanic dataset, which has information about 891 passengers who were on board during the incident. It includes each passenger’s gender, the class they traveled in, whether they survived, the fare paid for the ticket, and so on.

[Figure: Titanic dataset describing the passengers who traveled during the incident]

Let us create a crosstable using pandas crosstab() in Python. If you are not familiar with crosstab() in pandas, you can visit https://pythonguides.com/crosstab-in-python-pandas/

#create a crosstable using crosstab() function in pandas
titanic_crosstable=pd.crosstab(index=titanic_data['class'], columns=titanic_data['alive']) 
  • From the resulting crosstable, we can observe that 80 people who traveled in first class did not survive the Titanic accident.
  • Likewise, 136 people who traveled in first class survived, so first-class passengers were more likely to survive than not.
[Figure: sample crosstable in python pandas]
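
The raw counts hide the fact that the classes differ in size, so it can help to look at totals and row proportions as well. The sketch below is a minimal illustration, assuming a small stand-in DataFrame rebuilt from the per-class counts of the seaborn Titanic data so that it runs without downloading anything. The names `counts`, `rows`, `passengers`, `with_totals`, and `survival_share` are made up for this example. It uses the `margins` and `normalize` parameters of pd.crosstab().

```python
import pandas as pd

# Stand-in data (assumption): per-class counts of the seaborn Titanic dataset,
# expanded back into one row per passenger so that crosstab() has rows to count.
counts = {
    ("First", "no"): 80, ("First", "yes"): 136,
    ("Second", "no"): 97, ("Second", "yes"): 87,
    ("Third", "no"): 372, ("Third", "yes"): 119,
}
rows = [(cls, alive) for (cls, alive), n in counts.items() for _ in range(n)]
passengers = pd.DataFrame(rows, columns=["class", "alive"])

# margins=True adds an "All" row and an "All" column holding the totals
with_totals = pd.crosstab(passengers["class"], passengers["alive"], margins=True)

# normalize="index" turns each row into proportions that sum to 1
survival_share = pd.crosstab(passengers["class"], passengers["alive"], normalize="index")

print(with_totals)
print(survival_share.round(3))
```

With the real data, the same two calls can be made on titanic_data['class'] and titanic_data['alive'] directly.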

How to create a Barplot using pandas crosstab()

A bar graph is a visualization plot that shows the relationship between categorical and numeric variables. We will create a barplot by calling the function plot() and passing “bar” as the value to the parameter “kind”.

#Creating a barplot using Crosstab function in pandas python
titanic_crosstable.plot(kind='bar')
plt.xlabel('Class')
plt.ylabel('Number of passengers')
plt.title('Survival rate of passengers by class they traveled')
plt.show()
  • We set the parameter “kind” to “bar” in order to create a bar chart from the crosstable.
  • In this example, the titanic_crosstable created above with crosstab() is plotted as a bar graph.
  • The graph shows passenger counts: in first class the survivors outnumber the non-survivors, while in third class the non-survivors far outnumber the survivors.
  • The orange bars show the passengers who survived, whereas the blue bars show the passengers who did not survive.
[Figure: Creating Barplot using crosstab() in pandas]
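
Since plot() returns a matplotlib Axes object, the chart can also be customized through that object instead of through plt. The following is a minimal sketch, assuming a hand-typed stand-in for titanic_crosstable (the name `crosstable` and the variable `bar_labels` are made up here) and matplotlib 3.4 or newer for bar_label(). It writes the count above each bar.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for titanic_crosstable (assumption): the same counts, typed in by hand
crosstable = pd.DataFrame(
    {"no": [80, 97, 372], "yes": [136, 87, 119]},
    index=pd.Index(["First", "Second", "Third"], name="class"),
)
crosstable.columns.name = "alive"

# plot() returns the matplotlib Axes, which can be customized directly
ax = crosstable.plot(kind="bar", rot=0)
ax.set_xlabel("Class")
ax.set_ylabel("Number of passengers")
ax.set_title("Passengers by class and survival")

# One bar container per column ("no", "yes"); bar_label() annotates each bar
bar_labels = [ax.bar_label(container) for container in ax.containers]

# In an interactive session, drop the matplotlib.use("Agg") line and call plt.show()
```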

This is how to create a Barplot using pandas crosstab() in Python.

How to create a Stacked Barplot using pandas crosstab()

A stacked bar chart is a useful tool for comparing results. It places the segments of each category on top of one another, so a single bar shows both the total and how that total is split. The bars can be drawn either horizontally or vertically.

# Creating stacked barplot plot using the pandas crosstab() in python
titanic_crosstable.plot(kind='bar', stacked=True)
plt.xlabel('Class')
plt.ylabel('Number of passengers')
plt.title('Survival rate of passengers by class they traveled(stacked bar chart)')
plt.show()
  • We set the parameter “kind” to “bar” and the parameter “stacked” to “True” in order to create a stacked bar chart using pandas crosstab() in python.
  • In this example, the titanic_crosstable created above is plotted as a stacked bar graph.
  • The graph shows that the survivors make up a larger share of the first-class bar (136 of 216 passengers) than of the second-class bar (87 of 184) or the third-class bar (119 of 491).
  • A stacked bar chart is a good option when we want to compare both the totals and the split within each category.
[Figure: Creating a stacked bar chart using pandas crosstab]
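
Because the classes have different sizes, survival shares are easier to compare when every bar has the same height. The sketch below is one possible way to draw such a 100% stacked bar chart, assuming a hand-typed stand-in for titanic_crosstable (the names `crosstable` and `shares` are made up here). Each row is divided by its own total before plotting.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for titanic_crosstable (assumption): the same counts, typed in by hand
crosstable = pd.DataFrame(
    {"no": [80, 97, 372], "yes": [136, 87, 119]},
    index=pd.Index(["First", "Second", "Third"], name="class"),
)
crosstable.columns.name = "alive"

# Divide every row by its row total so that each bar sums to 1
shares = crosstable.div(crosstable.sum(axis=1), axis=0)

ax = shares.plot(kind="bar", stacked=True, rot=0)
ax.set_xlabel("Class")
ax.set_ylabel("Share of passengers")
ax.set_title("Survival share by class (100% stacked bar chart)")

# In an interactive session, drop the matplotlib.use("Agg") line and call plt.show()
```

The same shares can be obtained straight from the raw data by passing normalize="index" to pd.crosstab().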

This is how to create a Stacked Barplot using pandas crosstab() in Python.

How to create a piechart using pandas crosstab()

A pie chart divides a circle into several segments, and each segment represents a proportion of the whole. It can display the shares of multiple classes of data in one frame.


# Creating one pie chart per survival outcome, split by class, using the pandas crosstab() in python
titanic_crosstable.plot(kind='pie', subplots=True, figsize=(8, 4))
# Axis labels do not apply to pie charts, so only a title for the whole figure is set
plt.suptitle('Passengers by class for each survival outcome (pie charts)')
plt.show()
  • We set the parameter “kind” to “pie” in order to create a pie chart from the crosstable. With subplots=True, one pie is drawn for each column.
  • In the resulting graphs, the left pie chart shows how the passengers who did not survive the Titanic incident are split across the classes they traveled in, and the right pie chart shows the same split for the passengers who survived.
[Figure: pie chart using crosstab() function in pandas]
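
With subplots=True, plot() returns one Axes per column, so each pie can get its own title. Here is a small sketch of that idea, assuming a hand-typed stand-in for titanic_crosstable (the names `crosstable` and `titles` are made up here). The `autopct` argument is passed through to matplotlib's pie function to print percentages on the segments.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for titanic_crosstable (assumption): the same counts, typed in by hand
crosstable = pd.DataFrame(
    {"no": [80, 97, 372], "yes": [136, 87, 119]},
    index=pd.Index(["First", "Second", "Third"], name="class"),
)
crosstable.columns.name = "alive"

# One pie per column; the result is an array with one Axes for each pie
axes = crosstable.plot(
    kind="pie", subplots=True, figsize=(8, 4), autopct="%1.1f%%", legend=False
)

# Give every pie a readable title and remove the default y-label
titles = {"no": "Did not survive", "yes": "Survived"}
for ax, column in zip(axes.ravel(), crosstable.columns):
    ax.set_title(titles[column])
    ax.set_ylabel("")

# In an interactive session, drop the matplotlib.use("Agg") line and call plt.show()
```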

This is how to create a piechart using pandas crosstab() in Python.

How to create boxplots using pandas crosstab()

A boxplot describes the five-number summary of a set of values: the minimum, the first quartile, the second quartile (which is the median), the third quartile, and the maximum. Here each box summarizes the three per-class counts in one column of the crosstable.

# Creating a boxplot using the crosstab() in pandas python
titanic_crosstable.plot(kind='box')
plt.xlabel('Survival case')
plt.ylabel('Number of passengers')
plt.title('Displaying boxplots by survival case')
plt.show()
  • We set the parameter “kind” to “box” in order to create a boxplot from the crosstable.
  • From the plot, we can observe that the non-survivor counts are spread over a much wider range than the survivor counts, because far more passengers in third class did not survive.
[Figure: Boxplot using pandas crosstab() in python]
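
The numbers behind each box can be checked by computing the five-number summary directly. The following is a minimal sketch, assuming a hand-typed stand-in for titanic_crosstable (the names `crosstable` and `five_number` are made up here). It uses DataFrame.quantile(), which interpolates linearly between values by default.

```python
import pandas as pd

# Stand-in for titanic_crosstable (assumption): the same counts, typed in by hand
crosstable = pd.DataFrame(
    {"no": [80, 97, 372], "yes": [136, 87, 119]},
    index=pd.Index(["First", "Second", "Third"], name="class"),
)
crosstable.columns.name = "alive"

# Minimum, quartiles, and maximum of the three per-class counts in each column
five_number = crosstable.quantile([0, 0.25, 0.5, 0.75, 1])
five_number.index = ["min", "Q1", "median", "Q3", "max"]

print(five_number)
```

Each column here holds only three values, one per class, so the boxes are a very coarse summary.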

This is how to create boxplots using pandas crosstab() in Python.

How to create a Line chart using pandas crosstab()

A line graph is a type of chart that is most often used to show information that changes over time. A line graph from a pandas crosstab is plotted as several points connected by straight lines; here the x-axis is the passenger class rather than time. The line graph usually gives a clear picture of an increasing or a decreasing trend.

#Creating a Line chart using crosstab() function in pandas python
titanic_crosstable.plot(kind='line')
plt.xlabel('class')
plt.ylabel('Number of passengers')
plt.title('Displaying line chart')
plt.show()
  • We set the parameter “kind” to “line” in order to create a line chart from the crosstable.
  • From the plot, we can observe a sharp increase in the number of passengers who did not survive when moving from second class to third class.
[Figure: Create a line chart using crosstab() in pandas]
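
With only three points per line, markers make the individual values easier to see. The sketch below is one way to do this, assuming a hand-typed stand-in for titanic_crosstable (the names `crosstable` and `change_no` are made up here). It also uses diff() to read off the change between neighbouring classes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for titanic_crosstable (assumption): the same counts, typed in by hand
crosstable = pd.DataFrame(
    {"no": [80, 97, 372], "yes": [136, 87, 119]},
    index=pd.Index(["First", "Second", "Third"], name="class"),
)
crosstable.columns.name = "alive"

# marker="o" draws a dot at every data point on both lines
ax = crosstable.plot(kind="line", marker="o")

# Put exactly one tick on each class
ax.set_xticks(range(len(crosstable.index)))
ax.set_xticklabels(crosstable.index)
ax.set_xlabel("Class")
ax.set_ylabel("Number of passengers")

# Change in the non-survivor count from one class to the next
change_no = crosstable["no"].diff()
print(change_no)

# In an interactive session, drop the matplotlib.use("Agg") line and call plt.show()
```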

This is how to create a Line chart using pandas crosstab() in Python.

How to create an Area chart using pandas crosstab()

An area chart is a line chart with the region below each line filled in. It shows how the numeric values of one or more groups change over the progression of another variable, which is often time.

#Creating an Area chart using crosstab() function in pandas python
pd.crosstab(titanic_data['class'], titanic_data['alive']).plot(kind="area")
plt.title('Representing survival rate of all passengers through area plot')
plt.show()
  • We set the parameter “kind” to “area” in order to create an area chart using pandas crosstab in python. By default, pandas stacks the areas on top of one another.
  • From the plot, we can observe that the number of survivors falls from first class (136) to second class (87), whereas the number of passengers who did not survive rises steeply when moving to third class.
[Figure: Create an area chart using pandas crosstab() in python]
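
Area plots in pandas are stacked by default, so the top edge of the chart shows the total per class and not the count of a single group. The following sketch compares the stacked and the overlaid variant, assuming a hand-typed stand-in for titanic_crosstable (the names `crosstable`, `ax_stacked`, `ax_overlaid`, and `totals` are made up here).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for titanic_crosstable (assumption): the same counts, typed in by hand
crosstable = pd.DataFrame(
    {"no": [80, 97, 372], "yes": [136, 87, 119]},
    index=pd.Index(["First", "Second", "Third"], name="class"),
)
crosstable.columns.name = "alive"

# Default behaviour: the "yes" area is stacked on top of the "no" area
ax_stacked = crosstable.plot(kind="area")
ax_stacked.set_title("Stacked area chart (default)")

# stacked=False overlays the areas so that each one starts at zero
ax_overlaid = crosstable.plot(kind="area", stacked=False)
ax_overlaid.set_title("Overlaid area chart (stacked=False)")

# The top edge of the stacked chart follows these per-class totals
totals = crosstable.sum(axis=1)
print(totals)

# In an interactive session, drop the matplotlib.use("Agg") line and call plt.show()
```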

This is how to create an Area chart using pandas crosstab() in Python.

Conclusion

In this pandas crosstab plots tutorial, we have covered the importance of visualization and how to create a barplot, stacked barplot, pie chart, boxplot, line chart, and area chart by calling plot() on the crosstable data frame in Python pandas.

  • How to create a Barplot using pandas crosstab()
  • How to create a Stacked Barplot using pandas crosstab()
  • How to create a piechart using pandas crosstab()
  • How to create boxplots using pandas crosstab()
  • How to create a Line chart using pandas crosstab()
  • How to create an Area chart using pandas crosstab()
