How to Get Column Names in Pandas

I have spent a significant amount of time cleaning and analyzing data using Pandas.

One of the most frequent tasks I perform, and one that beginners often ask about, is simply identifying the names of the columns in a DataFrame.

Whether you are working with a small CSV or a massive dataset from a US government portal, knowing your headers is the first step in any data project.

In this tutorial, I’ll walk you through the various ways I retrieve column names in Pandas, using examples that reflect real-world scenarios.

Retrieve Column Names Using the .columns Attribute

The most direct way I access column names is by using the .columns attribute. It is fast, efficient, and built directly into the DataFrame object.

When I pull data from the US Bureau of Labor Statistics, I often use this to quickly verify that the data loaded correctly.

import pandas as pd

# Creating a dataset based on US Tech Hub employment data
data = {
    'City': ['San Francisco', 'Austin', 'Seattle', 'New York'],
    'Tech_Jobs_2023': [150000, 85000, 120000, 200000],
    'Average_Salary': [165000, 135000, 155000, 170000],
    'State': ['CA', 'TX', 'WA', 'NY']
}

df = pd.DataFrame(data)

# Accessing the columns attribute
column_headers = df.columns

print("The columns in our dataset are:")
print(column_headers)

I executed the above example code and added the screenshot below.

[Screenshot: Get Column Names in Pandas]

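In this case, Pandas returns an Index object rather than a plain Python list. An Index is an immutable, array-like container optimized for label lookups, which is why it supports positional access and fast membership checks.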
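
To make the Index behavior concrete, here is a minimal sketch. The two-row DataFrame is a hypothetical stand-in that reuses the column names from the example above, and the replacement label 'Metro' is only there to trigger the error.

```python
import pandas as pd

# Hypothetical two-row frame reusing the column names from the example above
df = pd.DataFrame({
    'City': ['Austin', 'Seattle'],
    'Tech_Jobs_2023': [85000, 120000],
    'Average_Salary': [135000, 155000],
    'State': ['TX', 'WA'],
})

cols = df.columns

first_col = cols[0]                 # positional access, like a list
has_state = 'State' in cols         # membership test on the labels
state_pos = cols.get_loc('State')   # integer position of a label
n_cols = len(cols)                  # number of columns

print(first_col, has_state, state_pos, n_cols)

# An Index is immutable: assigning to a single position raises TypeError
try:
    cols[0] = 'Metro'
    mutated = True
except TypeError:
    mutated = False

print("Index was mutated:", mutated)
```

Because single labels cannot be changed in place, renaming is usually done by assigning a whole new set of labels to df.columns or by calling df.rename().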

Convert Column Names to a List

While the Index object is great, I often find that I need a standard Python list to iterate through or to use in a custom function.

I prefer using the .tolist() method for this because it is explicit and very readable for anyone else reviewing my code.

import pandas as pd

# Data representing US National Park annual visitors
park_data = {
    'Park_Name': ['Great Smoky Mountains', 'Grand Canyon', 'Zion', 'Yellowstone'],
    'Location_State': ['TN', 'AZ', 'UT', 'WY'],
    'Annual_Visitors': [12900000, 4700000, 4600000, 3200000],
    'Established_Year': [1934, 1919, 1919, 1872]
}

df = pd.DataFrame(park_data)

# Converting the Index object to a standard Python list
cols_list = df.columns.tolist()

print("Columns as a Python list:")
print(cols_list)

# Now I can easily loop through them
for col in cols_list:
    print(f"Processing column: {col}")

I executed the above example code and added the screenshot below.

[Screenshot: How to Get Column Names in Pandas]

Using a list is my “go-to” when I am building dynamic reports where the number of columns might change month-to-month.
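
As a rough illustration of that kind of dynamic check, the sketch below compares the columns that actually arrived against the ones a report expects. The DataFrame, the expected_cols list, and the 'Entrance_Fee' column are all hypothetical names I made up for this example.

```python
import pandas as pd

# Hypothetical monthly report; column names are illustrative only
report = pd.DataFrame({
    'Park_Name': ['Zion', 'Yellowstone'],
    'Annual_Visitors': [4600000, 3200000],
    'Entrance_Fee': [35, 35],
})

# Columns the (hypothetical) downstream report expects to find
expected_cols = ['Park_Name', 'Location_State', 'Annual_Visitors']

actual_cols = report.columns.tolist()

# List comprehensions keep the original order, unlike set differences
missing = [c for c in expected_cols if c not in actual_cols]
unexpected = [c for c in actual_cols if c not in expected_cols]

print("Missing columns:", missing)
print("Unexpected columns:", unexpected)
```

Working with plain lists here keeps the comparison readable and makes the result easy to pass to logging or an error message.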

Use the list() Constructor

If you are looking for a shorter way to get a list, you can simply wrap the DataFrame in the standard Python list() function. This works because iterating over a DataFrame yields its column labels, not its rows.

In my experience, this is the quickest “shorthand” method, though it lacks the descriptive nature of .tolist().

import pandas as pd

# US Real Estate Market Data
housing_data = {
    'Zip_Code': ['90210', '10001', '33101', '60601'],
    'Median_Home_Price': [3500000, 1200000, 550000, 480000],
    'Property_Tax_Rate': [0.007, 0.019, 0.011, 0.021]
}

df = pd.DataFrame(housing_data)

# Simple list conversion
column_names = list(df)

print(column_names)

I executed the above example code and added the screenshot below.

[Screenshot: Get Column Names in Python Pandas]

I use this when I’m doing quick exploratory data analysis (EDA) in a Jupyter Notebook and don’t want to type extra characters.
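
If you want to convince yourself that the shorthand gives the same result as the explicit versions, a quick comparison like the one below should do it. The small DataFrame is a hypothetical, trimmed-down version of the housing data.

```python
import pandas as pd

# Trimmed-down, hypothetical version of the housing data
df = pd.DataFrame({
    'Zip_Code': ['90210', '10001'],
    'Median_Home_Price': [3500000, 1200000],
})

via_list = list(df)                 # iterating a DataFrame yields column labels
via_tolist = df.columns.tolist()    # explicit conversion of the Index
via_list_index = list(df.columns)   # list() applied to the Index itself

all_equal = via_list == via_tolist == via_list_index
print(all_equal)
```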

Get Column Names via Iteration

Sometimes I only need the column names to perform a specific action on each header, like stripping whitespace or changing the case.

You can iterate directly over the DataFrame, which by default iterates over the column labels.

import pandas as pd

# US Retail Sales Data
sales_data = {
    'Store_ID': ['S101', 'S102', 'S103'],
    'Monthly_Revenue': [45000, 62000, 31000],
    'Holiday_Season_Sale': [True, False, True]
}

df = pd.DataFrame(sales_data)

# Iterating to print names
print("Headers found in sales report:")
for col in df:
    print(col)

This is a clean way to handle headers without creating a separate variable to store the list.
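
The example above only prints the names, so here is a minimal sketch of the cleanup use case mentioned earlier: stripping whitespace and lowercasing each header. The messy headers are hypothetical and exist only to show the effect.

```python
import pandas as pd

# Hypothetical frame with messy headers (stray spaces, mixed case)
df = pd.DataFrame({
    ' Store_ID ': ['S101', 'S102'],
    'Monthly_Revenue  ': [45000, 62000],
    'HOLIDAY_Season_Sale': [True, False],
})

# Iterate over the labels and build a cleaned name for each one
cleaned = [col.strip().lower() for col in df]

# Assigning a list of the same length replaces the headers in order
df.columns = cleaned

print(df.columns.tolist())
```

For string headers, the vectorized form df.columns.str.strip().str.lower() should give the same result without an explicit loop.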

Retrieve Columns Sorted Alphabetically

When dealing with large datasets, such as US Census Bureau CSVs with hundreds of columns, I find it helpful to see them in alphabetical order.

I achieve this by wrapping the .columns attribute in Python's built-in sorted() function. It returns a new list and leaves the DataFrame's own column order unchanged.

import pandas as pd

# Multi-column US Automotive Data
car_data = {
    'Model': ['Ford F-150', 'Tesla Model 3', 'Chevrolet Silverado'],
    'Manufacturer': ['Ford', 'Tesla', 'GM'],
    'Type': ['Truck', 'Sedan', 'Truck'],
    'Fuel_Source': ['Gasoline', 'Electric', 'Gasoline'],
    'MSRP': [35000, 40000, 38000]
}

df = pd.DataFrame(car_data)

# Getting sorted column names
sorted_cols = sorted(df.columns)

print("Alphabetical list of columns:")
print(sorted_cols)

Sorting helps me quickly identify if a specific variable is present when the original file order is chaotic.
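
One caveat worth sketching: the default sort compares strings by code point, so uppercase names sort before lowercase ones. The example below, which uses a hypothetical frame with mixed-case headers, shows a case-insensitive alternative and how the sorted list can reorder the DataFrame itself.

```python
import pandas as pd

# Hypothetical frame with mixed-case headers
df = pd.DataFrame({
    'msrp': [35000, 40000],
    'Model': ['Ford F-150', 'Tesla Model 3'],
    'fuel_source': ['Gasoline', 'Electric'],
    'Type': ['Truck', 'Sedan'],
})

# Default sort compares code points, so uppercase names come first
default_sorted = sorted(df.columns)

# key=str.lower gives a case-insensitive ordering
ci_sorted = sorted(df.columns, key=str.lower)

# Selecting with the sorted list returns a new frame with reordered columns
df_sorted = df[ci_sorted]

print(default_sorted)
print(ci_sorted)
```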

Access Column Names with .keys()

If you come from a background of working with Python dictionaries, you might find the .keys() method more intuitive.

Pandas DataFrames share many behaviors with dictionaries, and calling .keys() returns the same Index object as .columns.

import pandas as pd

# US Fortune 500 Snippet
company_data = {
    'Rank': [1, 2, 3],
    'Company': ['Walmart', 'Amazon', 'Apple'],
    'Revenue_Billions': [611.3, 514.0, 394.3]
}

df = pd.DataFrame(company_data)

# Using the keys method
print(df.keys())

I rarely use this in production code, but it is a great “sanity check” if you are thinking of your DataFrame as a dictionary-like collection of Series.
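
To illustrate the dictionary-like behavior, here is a small sketch with a hypothetical two-column frame. It checks that .keys() and .columns carry the same labels and shows two other dictionary-style operations on column labels.

```python
import pandas as pd

# Hypothetical two-column frame
df = pd.DataFrame({
    'Rank': [1, 2],
    'Company': ['Walmart', 'Amazon'],
})

same_labels = df.keys().equals(df.columns)   # same labels, same order
has_company = 'Company' in df                # membership checks column labels

# .items() yields (column label, Series) pairs, much like dict.items()
pairs = [(label, type(series).__name__) for label, series in df.items()]

print(same_labels, has_company)
print(pairs)
```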

Use df.info() for a Structural Overview

When I first load a dataset, like a list of US hospital locations, I don’t just want the names; I want the data types and the count of non-null values.

The .info() method is my favorite way to get a bird’s-eye view of all columns simultaneously.

import pandas as pd

# US Healthcare Data
health_data = {
    'Hospital_Name': ['Mayo Clinic', 'Cleveland Clinic', 'Johns Hopkins'],
    'City': ['Rochester', 'Cleveland', 'Baltimore'],
    'Beds_Available': [1200, 1300, 1100],
    'Rating': [4.9, 4.8, 4.9]
}

df = pd.DataFrame(health_data)

# Displaying full structural info
df.info()

While this prints to the console rather than returning a list, it is indispensable for understanding the context of your column names.
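
When I need the same information as data, not printed text, something like the sketch below works. The DataFrame is hypothetical (with one missing value added on purpose), and the summary column names 'dtype' and 'non_null' are my own choices.

```python
import io
import pandas as pd

# Hypothetical frame; one value is missing on purpose
df = pd.DataFrame({
    'Hospital_Name': ['Mayo Clinic', 'Cleveland Clinic', 'Johns Hopkins'],
    'Beds_Available': [1200, None, 1100],
})

# Build a small summary frame instead of printing to the console
summary = pd.DataFrame({
    'dtype': df.dtypes.astype(str),   # data type of each column, as text
    'non_null': df.count(),           # number of non-null values per column
})
print(summary)

# If the exact .info() text is needed, it can be written to a buffer
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()
```

The summary frame is indexed by column name, so it can be filtered or sorted like any other DataFrame.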

Filter Column Names by Data Type

In many of my machine learning projects, I need to isolate only the numerical columns to perform calculations.

I use the .select_dtypes() method combined with .columns to get only the headers I need.

import pandas as pd

# US Stock Market Portfolio
stock_data = {
    'Ticker': ['AAPL', 'MSFT', 'GOOGL'],
    'Price': [180.50, 405.20, 145.10],
    'Volume': [50000000, 25000000, 18000000],
    'Sector': ['Tech', 'Tech', 'Tech']
}

df = pd.DataFrame(stock_data)

# Get only names of numerical columns
numeric_cols = df.select_dtypes(include=['number']).columns.tolist()

print("Numerical Columns for analysis:")
print(numeric_cols)

This prevents errors when trying to run mathematical operations on string-based columns like “Ticker” or “Sector.”
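
The same method can also go the other way. The sketch below uses the exclude parameter to get the non-numeric headers and shows that boolean columns are not matched by 'number'. The 'Is_Dividend_Payer' column is a hypothetical addition for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Ticker': ['AAPL', 'MSFT'],
    'Price': [180.50, 405.20],
    'Volume': [50000000, 25000000],
    'Is_Dividend_Payer': [True, True],   # hypothetical boolean column
})

# Numeric headers only (integers and floats)
numeric_cols = df.select_dtypes(include=['number']).columns.tolist()

# Everything that is not numeric, including the boolean column
non_numeric_cols = df.select_dtypes(exclude=['number']).columns.tolist()

# Boolean columns have to be requested explicitly
bool_cols = df.select_dtypes(include=['bool']).columns.tolist()

print(numeric_cols)
print(non_numeric_cols)
print(bool_cols)
```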

I hope you found this guide helpful. Being able to quickly grab and manipulate column names is a fundamental skill that will save you a lot of time in your data journey.

Whether you are just starting with Python or you are a seasoned pro, these methods are the building blocks for more complex data engineering.
