How to Rename Columns in Pandas

Last week, I was working on a dataset from the US Census Bureau, and the column names were a complete mess of cryptic codes and underscores.

Working with column names like “B01001_001E” is a nightmare when you are trying to build a clean analysis or a dashboard.

In my experience, the first step to any successful data project is making your DataFrame readable and intuitive.

In this tutorial, I will show you exactly how to rename columns in Pandas using the same methods I use in my professional projects every day.

The Most Common Way: Use the rename() Method

The rename() function is my go-to tool because it is incredibly flexible and safe.

It allows you to change specific column names without having to worry about the order of the other columns in your DataFrame.

I find this particularly useful when I only need to fix one or two typos in a large dataset of US housing prices.

Here is the code to rename columns using a dictionary:

import pandas as pd

# Creating a sample dataset of US City Data
data = {
    'city_nm': ['New York City', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'st_code': ['NY', 'CA', 'IL', 'TX', 'AZ'],
    'pop_2020': [8804190, 3898747, 2746388, 2304580, 1608139],
    'med_inc': [67046, 65290, 58247, 53600, 60914]
}

df = pd.DataFrame(data)

# I use a dictionary to map old names to new names
# This is the safest method as it won't affect other columns
df_renamed = df.rename(columns={
    'city_nm': 'City',
    'st_code': 'State',
    'pop_2020': 'Population',
    'med_inc': 'Median_Income_USD'
})

print("Original DataFrame Columns:")
print(df.columns)

print("\nRenamed DataFrame Columns:")
print(df_renamed.columns)

In the example above, I passed a dictionary to the columns parameter. The keys represent the existing names, and the values represent the new names you want to assign.
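
One thing worth knowing about the dictionary approach: by default, rename() silently skips any key that does not match an existing column, so a typo in an old name goes unnoticed. The sketch below uses a small made-up frame (df_demo and the misspelled key 'st_cd' are illustrative, not part of the dataset above) to show how the errors='raise' argument turns that typo into a KeyError.

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration
df_demo = pd.DataFrame({'city_nm': ['Chicago'], 'st_code': ['IL']})

# Default behaviour: the misspelled key 'st_cd' is silently ignored
silent = df_demo.rename(columns={'city_nm': 'City', 'st_cd': 'State'})
print(list(silent.columns))  # ['City', 'st_code']

# errors='raise' makes the same typo fail loudly
try:
    df_demo.rename(columns={'city_nm': 'City', 'st_cd': 'State'}, errors='raise')
    typo_caught = False
except KeyError as exc:
    typo_caught = True
    print(f"Caught: {exc}")
```

If your renames feed a report or a dashboard, raising on unknown keys is usually the safer default.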

Rename Columns In-Place

Sometimes, I don’t want to create a new variable for my DataFrame.

If I am working with a massive dataset, like the daily transactions of a US retail chain, I prefer to modify the DataFrame directly to save memory.

To do this, you can use the inplace=True parameter.

# Modifying the DataFrame directly
df.rename(columns={'city_nm': 'City Name'}, inplace=True)

print(df.columns)

However, a quick tip from my years of coding: many developers are moving away from inplace=True because it can make debugging harder.

I usually prefer reassigning the DataFrame back to itself, like df = df.rename(…).

Change All Column Names Using a List

There are times when the entire header row of a dataset is wrong.

I often see this when importing CSV files from older US government databases where the headers are just “Column 1”, “Column 2”, and so on.

If you know the exact order of the columns, you can replace them all at once by assigning a list to the .columns attribute.

# Let's say we have a fresh dataset of US Stock Market symbols
stocks_data = {
    'col1': ['AAPL', 'MSFT', 'AMZN', 'GOOGL'],
    'col2': [175.20, 420.50, 180.10, 150.30],
    'col3': ['Tech', 'Tech', 'Consumer', 'Tech']
}

df_stocks = pd.DataFrame(stocks_data)

# Directly assigning a new list of names
# Warning: The list must match the length of the columns exactly
df_stocks.columns = ['Ticker', 'Price_USD', 'Sector']

print(df_stocks.head())

Be careful with this method. If the source data changes its structure and adds a new column, your code will break or, worse, assign the wrong names to the wrong data.
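
One way to reduce that risk is to check the incoming headers before overwriting them. This is a minimal sketch, assuming you know which names the source is supposed to deliver; df_check, expected_old, and new_names are hypothetical names chosen for this example.

```python
import pandas as pd

# Stand-in for freshly loaded data with placeholder headers
df_check = pd.DataFrame({'col1': ['AAPL'], 'col2': [175.20], 'col3': ['Tech']})

expected_old = ['col1', 'col2', 'col3']          # layout we believe the source has
new_names = ['Ticker', 'Price_USD', 'Sector']

# Fail loudly if the source layout has drifted since the script was written
if list(df_check.columns) != expected_old:
    raise ValueError(f"Unexpected columns: {list(df_check.columns)}")

df_check.columns = new_names
names_after = list(df_check.columns)

# A list of the wrong length is rejected by Pandas with a ValueError
try:
    df_check.columns = ['Ticker', 'Price_USD']
    length_error = False
except ValueError:
    length_error = True
```

The length check is built into Pandas, but the order check is not, which is why the explicit comparison against expected_old is there.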

Use String Methods for Mass Renaming

When I deal with datasets that have inconsistent formatting—like a mix of uppercase, lowercase, and spaces—I don’t rename them one by one.

Instead, I use Pandas' vectorized string methods through the .str accessor on df.columns. This is a lifesaver when you have 50+ columns and just want them all to look professional.

# Sample data with messy US Real Estate headers
messy_data = {
    'PROPERTY ADDRESS ': ['123 Maple St', '456 Oak Ave'],
    'zip code': [90210, 10001],
    'SALE PRICE ($)': [1200000, 850000]
}

df_messy = pd.DataFrame(messy_data)

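# I prefer lowercase and underscores for easy typing in code
# regex=False treats '(', ')' and '$' as literal characters on every Pandas version
df_messy.columns = (
    df_messy.columns
    .str.strip()
    .str.lower()
    .str.replace(' ', '_', regex=False)
    .str.replace('(', '', regex=False)
    .str.replace(')', '', regex=False)
    .str.replace('$', '', regex=False)
)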

print(df_messy.columns)
# Output: Index(['property_address', 'zip_code', 'sale_price_'], dtype='object')

I find this approach ensures consistency across my entire data pipeline.
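
When the list of characters to strip keeps growing, a single regular expression can replace the chain of individual replacements. The following is a sketch of that variation, not the only way to do it; df_regex is a hypothetical empty frame that reuses the messy headers from above.

```python
import pandas as pd

# Empty frame carrying only the messy headers, for illustration
df_regex = pd.DataFrame(columns=['PROPERTY ADDRESS ', 'zip code', 'SALE PRICE ($)'])

df_regex.columns = (
    df_regex.columns
    .str.strip()
    .str.lower()
    # Any run of characters other than a-z or 0-9 becomes one underscore
    .str.replace(r'[^a-z0-9]+', '_', regex=True)
    # Drop underscores left at the start or end of a name
    .str.strip('_')
)

print(list(df_regex.columns))
# ['property_address', 'zip_code', 'sale_price']
```

Compared with the chained version, this also removes the trailing underscore from the sale price column.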

Rename with a Function (The Lambda Method)

If you need to apply the same logic to every column name, you can pass a function to the rename() method.

I recently used this when I had to add a prefix to all columns in a dataset before merging it with another one.

# Adding a prefix to US Employment data
emp_data = {
    'Rate': [4.1, 3.9, 4.0],
    'Total': [150000, 152000, 151000]
}

df_emp = pd.DataFrame(emp_data)

# Using a lambda function to add 'US_' to every column
df_emp = df_emp.rename(columns=lambda x: 'US_' + x)

print(df_emp.columns)

This is much cleaner than writing a loop, and it keeps your code “Pandas-idiomatic.”
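
The function does not have to be a lambda. As a small sketch, any callable that takes a name and returns a name works, and for plain prefixes or suffixes Pandas also provides add_prefix() and add_suffix(). The frame df_fn and the '_2024' suffix are made up for this example.

```python
import pandas as pd

df_fn = pd.DataFrame({'Rate': [4.1], 'Total': [150000]})

# Pass a built-in string method directly instead of wrapping it in a lambda
upper = df_fn.rename(columns=str.upper)
print(list(upper.columns))     # ['RATE', 'TOTAL']

# Dedicated helpers for the prefix and suffix cases
prefixed = df_fn.add_prefix('US_')
suffixed = df_fn.add_suffix('_2024')
print(list(prefixed.columns))  # ['US_Rate', 'US_Total']
print(list(suffixed.columns))  # ['Rate_2024', 'Total_2024']
```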

Use the set_axis() Method

Another method I occasionally use is set_axis().

While it feels similar to assigning a list to .columns, it returns a new DataFrame instead of modifying the existing one, so it can be used in a “method chain” (several calls strung together in one expression).

# Sample US Sports Data
sports_data = {
    'A': ['Lakers', 'Celtics'],
    'B': ['Los Angeles', 'Boston']
}

df_sports = pd.DataFrame(sports_data)

# Using set_axis to rename columns
df_sports = df_sports.set_axis(['Team', 'Location'], axis=1)

print(df_sports.columns)

In my workflow, I use set_axis when building complex pipelines that require renaming columns right after a filter or a sort.
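
To make the chaining idea concrete, here is a minimal sketch of a filter, a sort, and a rename in one expression. The frame df_chain and its scores are invented purely for illustration.

```python
import pandas as pd

# Made-up teams and scores, only to have something to filter and sort
df_chain = pd.DataFrame({
    'A': ['Lakers', 'Celtics', 'Bulls'],
    'B': [3, 1, 2],
})

result = (
    df_chain
    .query('B > 1')                        # filter rows
    .sort_values('B', ascending=False)     # sort what is left
    .set_axis(['Team', 'Score'], axis=1)   # rename without breaking the chain
    .reset_index(drop=True)
)

print(result)
```

Assigning to .columns would force you to stop the chain, store an intermediate variable, and continue on the next line.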

Rename Columns while Loading the Data

Prevention is often better than a cure. If I know the column names in my CSV file are terrible, I rename them as soon as I read the file.

When using pd.read_csv(), you can use the names parameter to provide your own headers and header=0 to tell Pandas that the first row of the file is the old header and should be replaced rather than read as data.

# This is how I would load a US Weather station CSV with custom names
# df = pd.read_csv('us_weather.csv', header=0, names=['Date', 'Station_ID', 'Temp_F', 'Rainfall_Inches'])

This keeps your script clean and prevents the need for an extra renaming step later in your code.
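
Here is a runnable sketch of the same idea that reads from an in-memory string instead of a file, so you can try it without a CSV on disk. The CSV text and its original headers (dt, stn, t, rain) are invented for this example.

```python
import io
import pandas as pd

# Invented CSV content with unhelpful original headers
csv_text = (
    "dt,stn,t,rain\n"
    "2024-01-01,ST001,41.0,0.2\n"
    "2024-01-02,ST001,39.5,0.0\n"
)
custom_names = ['Date', 'Station_ID', 'Temp_F', 'Rainfall_Inches']

# header=0 marks the first row as the old header, names replaces it
df_weather = pd.read_csv(io.StringIO(csv_text), header=0, names=custom_names)
print(df_weather.shape)  # (2, 4)

# Without header=0, the old header row is read as a data row
df_wrong = pd.read_csv(io.StringIO(csv_text), names=custom_names)
print(df_wrong.shape)    # (3, 4)
```

The second call shows why header=0 matters: leaving it out keeps the old header as the first record.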

Deal with Multi-Index Columns

In more advanced scenarios, such as when you use groupby and agg on US Sales data, you might end up with Multi-Index columns.

Renaming these can be tricky for beginners, but I usually flatten them or rename the levels directly.

# Creating a simple Multi-Index example
df_multi = pd.DataFrame({
    ('Sales', '2023'): [100, 200],
    ('Sales', '2024'): [150, 250]
})

# Flattening the two levels into single names such as 'Sales_2023'
df_multi.columns = [f"{col[0]}_{col[1]}" for col in df_multi.columns]

print(df_multi.columns)

By flattening the headers like this, I make the DataFrame much easier to work with for visualization tools like Matplotlib or Seaborn.
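
For the groupby and agg case mentioned earlier, the sketch below shows both options: renaming a single level with the level argument of rename(), and flattening the result. The frame df_orders and its region and amount values are hypothetical.

```python
import pandas as pd

# Hypothetical sales records
df_orders = pd.DataFrame({
    'region': ['East', 'East', 'West'],
    'amount': [100, 150, 200],
})

# Aggregating with a list of functions produces two-level column labels
summary = df_orders.groupby('region').agg({'amount': ['sum', 'mean']})

# Option 1: rename labels on one level only
relabeled = summary.rename(columns={'amount': 'sales'}, level=0)
print(relabeled.columns.tolist())  # [('sales', 'sum'), ('sales', 'mean')]

# Option 2: flatten the levels into plain strings
flat = summary.copy()
flat.columns = ['_'.join(col) for col in flat.columns]
print(list(flat.columns))          # ['amount_sum', 'amount_mean']
```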

Renaming columns is a small task, but doing it correctly will save you hours of frustration down the road.

I always recommend using the rename() method with a dictionary for most cases, as it is the most explicit and readable.

If you are dealing with hundreds of columns, then string methods or lambda functions are your best friends.

The key is to keep your column names consistent, descriptive, and easy to type.
