Pandas Sort by Multiple Columns

In my years of working with data in Python, I have found that organizing information is often the most critical step in any analysis.

Whether I am looking at US real estate trends or analyzing stock market fluctuations, I frequently need to sort my data by more than one criteria.

In this tutorial, I will show you exactly how to use the Pandas library to sort DataFrames by multiple columns efficiently.

I have used these techniques in countless professional projects to turn messy datasets into clear, actionable insights.

This Tutorial Covers:

Why You Need to Sort by Multiple Columns

When I first started using Pandas, I often found myself overwhelmed by large datasets containing thousands of rows.

Sorting by a single column, like “State,” is helpful, but it usually isn’t enough to get the full picture.

For instance, if I am looking at US Census data, I might want to see the states in alphabetical order and then see the cities within each state ranked by population.

This is where sorting by multiple columns becomes a lifesaver for any developer.

The Basic Syntax of sort_values()

To sort data in Pandas, I always rely on the sort_values() method. It is incredibly flexible and powerful.

When I want to sort by multiple columns, I simply pass a list of column names to the method instead of a single string.

The basic structure looks like this: df.sort_values(by=[‘Column1’, ‘Column2’]). By default, Pandas will sort both columns in ascending order.

Sort by Multiple Columns (Ascending Order)

Let’s look at a practical example using a dataset of US tech companies and their office locations.

In this scenario, I want to organize the data first by the state and then by the city name.

import pandas as pd

# Creating a dataset of US Tech Hubs
data = {
    'Company': ['Google', 'Microsoft', 'Apple', 'Meta', 'Amazon', 'Netflix', 'Tesla'],
    'State': ['California', 'Washington', 'California', 'California', 'Washington', 'California', 'Texas'],
    'City': ['Mountain View', 'Redmond', 'Cupertino', 'Menlo Park', 'Seattle', 'Los Gatos', 'Austin'],
    'Employees': [150000, 180000, 160000, 70000, 1100000, 12000, 120000]
}

df = pd.DataFrame(data)
sorted_df = df.sort_values(by=['State', 'City'])

print("Data sorted by State and City (Ascending):")
print(sorted_df)

I executed the above example code and added the screenshot below.

In the code above, I passed a list [‘State’, ‘City’] to the by parameter. Pandas first grouped all the “California” entries together and then arranged the cities like Cupertino and Los Gatos alphabetically.

Mix Sort Orders (Ascending and Descending)

There are many times in my career where I needed to sort one column upwards and another downwards.

For example, I might want to see US states alphabetically (A-Z), but within each state, I want to see the companies with the highest number of employees first.

To do this, I use the ascending parameter and pass a list of booleans that match the column list.

import pandas as pd

# Using the same US Tech Hubs dataset
data = {
    'Company': ['Google', 'Microsoft', 'Apple', 'Meta', 'Amazon', 'Netflix', 'Tesla'],
    'State': ['California', 'Washington', 'California', 'California', 'Washington', 'California', 'Texas'],
    'City': ['Mountain View', 'Redmond', 'Cupertino', 'Menlo Park', 'Seattle', 'Los Gatos', 'Austin'],
    'Employees': [150000, 180000, 160000, 70000, 1100000, 12000, 120000]
}

df = pd.DataFrame(data)
sorted_df = df.sort_values(by=['State', 'Employees'], ascending=[True, False])

print("Data sorted by State (A-Z) and Employees (Highest to Lowest):")
print(sorted_df)

I executed the above example code and added the screenshot below.

In my experience, this is one of the most useful features of Pandas. By setting ascending=[True, False], I told Pandas exactly how to handle each column in the list.

Handle Missing Values During Multi-Column Sort

Real-world data is rarely perfect, especially when I’m working with large US financial reports. Sometimes, a column might have NaN (Not a Number) values.

I have found that the na_position parameter is essential here to decide if these empty cells should appear at the start or the end.

import pandas as pd
import numpy as np

# US Real Estate Data with missing Price information
data = {
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'State': ['NY', 'CA', 'IL', 'TX', 'AZ'],
    'Avg_Home_Price': [850000, np.nan, 300000, 350000, np.nan]
}

df = pd.DataFrame(data)

# Sorting by State and Price, putting missing prices at the top
sorted_df = df.sort_values(by=['State', 'Avg_Home_Price'], na_position='first')

print("Sorted data with NAs at the beginning:")
print(sorted_df)

I executed the above example code and added the screenshot below.

I usually prefer putting NaN values at the end, but knowing how to move them to the top has saved me during data cleaning phases.

Sort and Resetting the Index

One thing I noticed early on is that sorting a DataFrame leaves the index (the row numbers) in a jumbled order.

If I want my new sorted DataFrame to have clean, sequential row numbers starting from zero, I use .reset_index().

import pandas as pd

# US Salary Data
data = {
    'Developer': ['Alice', 'Bob', 'Charlie', 'David'],
    'Role': ['Backend', 'Frontend', 'Backend', 'Frontend'],
    'Salary_USD': [120000, 95000, 130000, 110000]
}

df = pd.DataFrame(data)

# Sort by Role and Salary, then reset the index
sorted_df = df.sort_values(by=['Role', 'Salary_USD']).reset_index(drop=True)

print("Sorted and with a fresh index:")
print(sorted_df)

I always use drop=True inside reset_index(). If I don’t, Pandas will keep the old, messy index as a new column, which I rarely need.

Use a Custom Key for Sorting

Sometimes, alphabetical or numerical sorting isn’t enough. In a recent project involving US logistics, I had to sort data by “Priority” levels like “High”, “Medium”, and “Low”.

Standard sorting would put “High” before “Low” just because of the letter ‘H’, which is not what I wanted.

I often use the key parameter to apply a custom function to the column while sorting.

import pandas as pd

# US Shipping Data
data = {
    'Package_ID': ['A101', 'B202', 'C303', 'D404'],
    'Destination': ['Miami', 'Denver', 'Seattle', 'Boston'],
    'Priority': ['Low', 'High', 'Medium', 'High']
}

df = pd.DataFrame(data)

# Defining a custom sort order for Priority
priority_map = {'High': 0, 'Medium': 1, 'Low': 2}

# Sorting by Priority using the map as a key
sorted_df = df.sort_values(
    by=['Priority', 'Destination'], 
    key=lambda col: col.map(priority_map) if col.name == 'Priority' else col
)

print("Custom sorting by Priority levels:")
print(sorted_df)

This approach allows me to enforce a logical order that doesn’t follow standard rules.

Case Sensitivity in Multi-Column Sorting

I have encountered many situations where US city names were entered with inconsistent capitalization.

By default, Pandas sort_values() is case-sensitive, meaning “austin” would appear after “Washington”.

To fix this, I use a lambda function within the key parameter to convert strings to lowercase during the sort.

import pandas as pd

# US Cities with inconsistent casing
data = {
    'State': ['TX', 'TX', 'NY', 'NY'],
    'City': ['austin', 'Dallas', 'new york', 'Buffalo']
}

df = pd.DataFrame(data)

# Sorting by State and City (ignoring case)
sorted_df = df.sort_values(
    by=['State', 'City'], 
    key=lambda col: col.str.lower() if col.dtype == "object" else col
)

print("Case-insensitive multi-column sort:")
print(sorted_df)

I executed the above example code and added the screenshot below.

This ensures that my reports look professional and that no data point is misplaced due to a typing error.

Sort DataFrames In-Place

If I am working with a very large dataset and I want to save memory, I sometimes use the inplace=True parameter.

When I set inplace=True, Pandas modifies the original DataFrame instead of creating a new one.

import pandas as pd

# US Tech Stock Data
data = {
    'Ticker': ['AAPL', 'MSFT', 'TSLA', 'AMZN'],
    'Price': [150, 280, 700, 3300],
    'Volume': [100000, 85000, 120000, 90000]
}

df = pd.DataFrame(data)

# Modifying the original dataframe directly
df.sort_values(by=['Price', 'Volume'], ascending=False, inplace=True)

print("The original DataFrame has been updated:")
print(df)

I caution you to be careful with inplace=True because you lose the original order of your data forever once you run the command.

Sort by Index and Column Together

There are advanced scenarios where I have a DataFrame with a Multi-Index. In these cases, I might want to sort by an index level and a standard column at the same time.

I have found that the best way to handle this is to use sort_values just as before, as it supports index names.

import pandas as pd

# Creating a Multi-index US Sales Data
data = {
    'Sales': [5000, 7000, 2000, 9000]
}
index = pd.MultiIndex.from_tuples(
    [('West', 'LA'), ('West', 'SF'), ('East', 'NY'), ('East', 'Boston')],
    names=['Region', 'City']
)

df = pd.DataFrame(data, index=index)

# Sorting by Region (Index) and Sales (Column)
sorted_df = df.sort_values(by=['Region', 'Sales'], ascending=[True, False])

print("Sorting by both Index Level and Column Value:")
print(sorted_df)

This flexibility is why I consider Pandas the gold standard for data manipulation in Python. In this guide, I have covered the various ways I use Pandas to sort by multiple columns.

From basic ascending sorts to complex custom keys and handling missing data, these tools are essential for any data professional.

I recommend practicing these methods with your own datasets to see how they can streamline your workflow.

Sorting might seem like a simple task, but mastering it will make your data analysis much more powerful and accurate.

Pandas Sort by Multiple Columns

Why You Need to Sort by Multiple Columns

The Basic Syntax of sort_values()

Sort by Multiple Columns (Ascending Order)

Mix Sort Orders (Ascending and Descending)

Handle Missing Values During Multi-Column Sort

Sort and Resetting the Index

Use a Custom Key for Sorting

Case Sensitivity in Multi-Column Sorting

Sort DataFrames In-Place

Sort by Index and Column Together

51 PYTHON PROGRAMS PDF FREE

Aspiring to be a Python developer?

Let’s be friends