Recently, I was working on a data analytics project where I needed to convert a Pandas DataFrame into a nested dictionary format for a JSON API endpoint. The challenge was that I needed a specific structure with multiple levels of nesting.
Fortunately, Pandas offers several flexible methods to transform DataFrames into nested dictionaries that can accommodate various use cases.
In this article, I’ll share four practical methods to convert a DataFrame to a nested dictionary in Python, with real examples you can implement right away.
Nested Dictionary in Python
A nested dictionary is simply a dictionary in Python that contains other dictionaries as values. It allows you to create hierarchical data structures with multiple levels.
For example, a nested dictionary might look like this:
{
    'California': {
        'population': 39.5,
        'capital': 'Sacramento',
        'largest_city': 'Los Angeles'
    },
    'Texas': {
        'population': 29.1,
        'capital': 'Austin',
        'largest_city': 'Houston'
    }
}
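To make the structure concrete, here is a minimal sketch of reading values out of a nested dictionary like the one above. The variable name states is chosen only for this illustration.

```python
# Hypothetical variable name; the data mirrors the example above.
states = {
    'California': {'population': 39.5, 'capital': 'Sacramento', 'largest_city': 'Los Angeles'},
    'Texas': {'population': 29.1, 'capital': 'Austin', 'largest_city': 'Houston'},
}

# Chain keys to walk down the levels.
texas_capital = states['Texas']['capital']
print(texas_capital)

# Use .get() with a default so a missing outer key does not raise a KeyError.
florida_capital = states.get('Florida', {}).get('capital')
print(florida_capital)
```

Chained .get() calls are handy when the outer key may be absent, which is common with data coming from a DataFrame.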
Convert A DataFrame To A Nested Dictionary In Python
The following sections explain each method in turn.
Method 1: Use to_dict() with orient='index'
The simplest way to convert a DataFrame to a nested dictionary is to call the Pandas DataFrame method to_dict() with the orient='index' parameter.
Let’s say we have a DataFrame containing US sales data:
import pandas as pd

# Sample US sales data
data = {
    'Region': ['Northeast', 'Southeast', 'Midwest', 'West'],
    'Q1_Sales': [250000, 175000, 300000, 225000],
    'Q2_Sales': [310000, 190000, 340000, 280000],
    'Growth': [0.15, 0.05, 0.12, 0.18]
}
df = pd.DataFrame(data)
print(df)

To convert this DataFrame to a nested dictionary with regions as the primary keys:
# Convert DataFrame to nested dictionary with region as the key
nested_dict = df.set_index('Region').to_dict(orient='index')
print(nested_dict)

The output will be:
{
    'Northeast': {'Q1_Sales': 250000, 'Q2_Sales': 310000, 'Growth': 0.15},
    'Southeast': {'Q1_Sales': 175000, 'Q2_Sales': 190000, 'Growth': 0.05},
    'Midwest': {'Q1_Sales': 300000, 'Q2_Sales': 340000, 'Growth': 0.12},
    'West': {'Q1_Sales': 225000, 'Q2_Sales': 280000, 'Growth': 0.18}
}

This method is perfect when you want to use one of your DataFrame columns as the primary key for your nested dictionary.
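One caveat: to_dict(orient='index') needs a unique index, so the key column should not contain duplicates. The sketch below wraps the conversion in a small helper that checks this first. The function name to_nested_by_key is my own illustrative choice, not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['Northeast', 'Southeast', 'Midwest', 'West'],
    'Q1_Sales': [250000, 175000, 300000, 225000],
    'Q2_Sales': [310000, 190000, 340000, 280000],
    'Growth': [0.15, 0.05, 0.12, 0.18],
})

def to_nested_by_key(frame, key):
    """Hypothetical helper: nest the rows of `frame` under the values of column `key`."""
    # Fail early with a clear message if the key column has repeated values.
    if frame[key].duplicated().any():
        raise ValueError(f"Column {key!r} contains duplicate values")
    return frame.set_index(key).to_dict(orient='index')

nested = to_nested_by_key(df, 'Region')
print(nested['West']['Q2_Sales'])
```

If your key column does repeat, as with the State column in the next section, one of the multi-level methods is a better fit.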
Method 2: Create Multi-Level Nested Dictionaries
Sometimes you need a more complex nested structure with multiple levels. Let’s say we have a DataFrame with detailed US sales data by state and product:
# More detailed US sales data
data = {
    'State': ['California', 'California', 'Texas', 'Texas', 'New York', 'New York'],
    'Product': ['Laptops', 'Phones', 'Laptops', 'Phones', 'Laptops', 'Phones'],
    'Sales': [500000, 750000, 300000, 450000, 425000, 575000],
    'Units': [2000, 3000, 1200, 1800, 1700, 2300]
}
df = pd.DataFrame(data)
print(df)

To create a nested dictionary with states as the primary keys and products as secondary keys:
# Create a multi-level nested dictionary
nested_dict = {}
for _, row in df.iterrows():
    state = row['State']
    product = row['Product']
    # Create the inner dictionary the first time a state appears
    if state not in nested_dict:
        nested_dict[state] = {}
    nested_dict[state][product] = {
        'Sales': row['Sales'],
        'Units': row['Units']
    }
print(nested_dict)

This will produce:
{
    'California': {
        'Laptops': {'Sales': 500000, 'Units': 2000},
        'Phones': {'Sales': 750000, 'Units': 3000}
    },
    'Texas': {
        'Laptops': {'Sales': 300000, 'Units': 1200},
        'Phones': {'Sales': 450000, 'Units': 1800}
    },
    'New York': {
        'Laptops': {'Sales': 425000, 'Units': 1700},
        'Phones': {'Sales': 575000, 'Units': 2300}
    }
}

This method gives you full control over the structure of your nested dictionary.
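The same loop can be written a little more compactly. Here is a sketch that uses itertuples() together with dict.setdefault(), which removes the explicit membership check. It assumes the column names are valid Python identifiers, because itertuples() exposes them as attributes.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'California', 'Texas', 'Texas', 'New York', 'New York'],
    'Product': ['Laptops', 'Phones', 'Laptops', 'Phones', 'Laptops', 'Phones'],
    'Sales': [500000, 750000, 300000, 450000, 425000, 575000],
    'Units': [2000, 3000, 1200, 1800, 1700, 2300],
})

nested_dict = {}
for row in df.itertuples(index=False):
    # setdefault returns the existing inner dict or creates an empty one.
    nested_dict.setdefault(row.State, {})[row.Product] = {
        'Sales': row.Sales,
        'Units': row.Units,
    }

print(nested_dict['Texas']['Phones'])
```

On large DataFrames, itertuples() is generally faster than iterrows() because it does not build a Series for every row.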
Method 3: Use groupby() for Nested Dictionaries
Another approach is to use Pandas’ groupby() method to structure your nested dictionary:
# Using groupby to create a nested dictionary
grouped = df.groupby(['State', 'Product']).apply(lambda x: x[['Sales', 'Units']].to_dict('records')[0])
nested_dict = grouped.to_dict()
print(nested_dict)

This will produce a dictionary with tuple keys:
{
    ('California', 'Laptops'): {'Sales': 500000, 'Units': 2000},
    ('California', 'Phones'): {'Sales': 750000, 'Units': 3000},
    ('New York', 'Laptops'): {'Sales': 425000, 'Units': 1700},
    ('New York', 'Phones'): {'Sales': 575000, 'Units': 2300},
    ('Texas', 'Laptops'): {'Sales': 300000, 'Units': 1200},
    ('Texas', 'Phones'): {'Sales': 450000, 'Units': 1800}
}

Note that groupby() sorts the group keys by default, so the states appear in alphabetical order.

To restructure this into our desired nested format:
# Convert from tuple keys to nested dictionary
restructured_dict = {}
for (state, product), values in nested_dict.items():
    if state not in restructured_dict:
        restructured_dict[state] = {}
    restructured_dict[state][product] = values
print(restructured_dict)

This method leverages Pandas’ powerful groupby capabilities for more complex aggregations.
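As a more compact variant of the groupby approach, you can group by the outer key only and convert each group with to_dict(orient='index'). This is a sketch that assumes each product appears at most once per state, since orient='index' needs unique keys within each group.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'California', 'Texas', 'Texas', 'New York', 'New York'],
    'Product': ['Laptops', 'Phones', 'Laptops', 'Phones', 'Laptops', 'Phones'],
    'Sales': [500000, 750000, 300000, 450000, 425000, 575000],
    'Units': [2000, 3000, 1200, 1800, 1700, 2300],
})

# One outer key per state; each group becomes a {product: {column: value}} dict.
nested_dict = {
    state: group.set_index('Product')[['Sales', 'Units']].to_dict(orient='index')
    for state, group in df.groupby('State')
}

print(nested_dict['New York'])
```

This skips the intermediate tuple-keyed dictionary, so no separate restructuring loop is needed.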
Method 4: Use to_dict() with orient='records' and Further Processing
For more flexibility, we can use to_dict() with orient='records' and then further process the data:
# Using to_dict with orient='records'
records = df.to_dict(orient='records')

# Processing the records into a nested dictionary
nested_dict = {}
for record in records:
    # pop() removes the key columns so only the value fields remain
    state = record.pop('State')
    product = record.pop('Product')
    if state not in nested_dict:
        nested_dict[state] = {}
    nested_dict[state][product] = record
print(nested_dict)

This produces the same result as our earlier methods but offers more flexibility in how you structure your final dictionary.
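Since my original goal was a JSON API response, here is a short sketch of the final step: serializing the nested dictionary with the standard json module. The variable name payload is only illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'California', 'Texas', 'Texas', 'New York', 'New York'],
    'Product': ['Laptops', 'Phones', 'Laptops', 'Phones', 'Laptops', 'Phones'],
    'Sales': [500000, 750000, 300000, 450000, 425000, 575000],
    'Units': [2000, 3000, 1200, 1800, 1700, 2300],
})

# Build the nested dictionary from row records, as in Method 4.
nested_dict = {}
for record in df.to_dict(orient='records'):
    state = record.pop('State')
    product = record.pop('Product')
    nested_dict.setdefault(state, {})[product] = record

# Serialize to a JSON string (hypothetical variable name).
payload = json.dumps(nested_dict, indent=2)
print(payload)
```

Note that record.pop() only changes the record dictionaries returned by to_dict(); the DataFrame itself is left untouched.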
Custom Nesting for Specific Business Requirements
In real-world applications, you might need very specific nesting structures. Let’s say we want to organize our US retail data by region, then state, then city:
# More complex US retail data
data = {
    'Region': ['West', 'West', 'South', 'South', 'Northeast', 'Northeast'],
    'State': ['California', 'California', 'Texas', 'Texas', 'New York', 'New York'],
    'City': ['Los Angeles', 'San Francisco', 'Houston', 'Dallas', 'NYC', 'Buffalo'],
    'Revenue': [1200000, 950000, 850000, 780000, 1500000, 420000],
    'Customers': [12000, 9500, 8500, 7800, 15000, 4200]
}
df = pd.DataFrame(data)

# Creating a three-level nested dictionary
nested_retail = {}
for _, row in df.iterrows():
    region = row['Region']
    state = row['State']
    city = row['City']
    if region not in nested_retail:
        nested_retail[region] = {}
    if state not in nested_retail[region]:
        nested_retail[region][state] = {}
    nested_retail[region][state][city] = {
        'Revenue': row['Revenue'],
        'Customers': row['Customers']
    }
print(nested_retail)

This creates a three-level nested dictionary that might be perfect for a regional sales dashboard or API.
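If you find yourself writing this loop for different depths, it can be generalized. Below is a sketch of a helper that nests by any ordered list of key columns. The name nest_by and its signature are my own assumptions for illustration, not a Pandas function.

```python
import pandas as pd

def nest_by(frame, key_columns):
    """Hypothetical helper: nest the rows of `frame` under each column in `key_columns`, in order."""
    nested = {}
    for record in frame.to_dict(orient='records'):
        level = nested
        # Walk (or create) one dictionary level per key column except the last.
        for column in key_columns[:-1]:
            level = level.setdefault(record.pop(column), {})
        # The last key column maps to the remaining fields of the row.
        # Rows that share the full key path overwrite earlier ones.
        level[record.pop(key_columns[-1])] = record
    return nested

df = pd.DataFrame({
    'Region': ['West', 'West', 'South', 'South', 'Northeast', 'Northeast'],
    'State': ['California', 'California', 'Texas', 'Texas', 'New York', 'New York'],
    'City': ['Los Angeles', 'San Francisco', 'Houston', 'Dallas', 'NYC', 'Buffalo'],
    'Revenue': [1200000, 950000, 850000, 780000, 1500000, 420000],
    'Customers': [12000, 9500, 8500, 7800, 15000, 4200],
})

nested_retail = nest_by(df, ['Region', 'State', 'City'])
print(nested_retail['West']['California']['Los Angeles'])
```

Passing ['Region', 'State'] instead would stop one level earlier, but rows sharing a region and state would then overwrite each other, so pick key columns that identify each row uniquely.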
Handle Complex DataFrames with MultiIndex
When working with complex data structures, you might have a DataFrame with a MultiIndex. Here’s how to handle that:
# Creating a DataFrame with MultiIndex
index = pd.MultiIndex.from_tuples([
    ('2022', 'Q1', 'East'),
    ('2022', 'Q1', 'West'),
    ('2022', 'Q2', 'East'),
    ('2022', 'Q2', 'West'),
    ('2023', 'Q1', 'East'),
    ('2023', 'Q1', 'West')
], names=['Year', 'Quarter', 'Region'])
data = {
    'Sales': [250000, 320000, 280000, 350000, 310000, 380000],
    'Units': [1000, 1280, 1120, 1400, 1240, 1520]
}
multi_df = pd.DataFrame(data, index=index)
print(multi_df)

# Converting to nested dictionary (keys are (year, quarter, region) tuples)
nested_from_multi = multi_df.to_dict(orient='index')

# Restructuring for better nesting
restructured = {}
for (year, quarter, region), values in nested_from_multi.items():
    if year not in restructured:
        restructured[year] = {}
    if quarter not in restructured[year]:
        restructured[year][quarter] = {}
    restructured[year][quarter][region] = values
print(restructured)

This approach is particularly useful for time series data with multiple dimensions.
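One practical reason to restructure is JSON: the json module does not accept tuples as dictionary keys, so the tuple-keyed result of to_dict(orient='index') cannot be serialized directly. The sketch below demonstrates this on a smaller MultiIndex DataFrame. The flag name tuple_keys_failed is only illustrative.

```python
import json

import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('2022', 'Q1', 'East'), ('2022', 'Q1', 'West'), ('2023', 'Q1', 'East')],
    names=['Year', 'Quarter', 'Region'],
)
multi_df = pd.DataFrame({'Sales': [250000, 320000, 310000]}, index=index)

# Keys are (year, quarter, region) tuples at this point.
tuple_keyed = multi_df.to_dict(orient='index')

# json.dumps rejects tuple keys with a TypeError.
try:
    json.dumps(tuple_keyed)
    tuple_keys_failed = False
except TypeError:
    tuple_keys_failed = True
print(tuple_keys_failed)

# Restructure into string-keyed levels, which json.dumps accepts.
restructured = {}
for (year, quarter, region), values in tuple_keyed.items():
    restructured.setdefault(year, {}).setdefault(quarter, {})[region] = values

payload = json.dumps(restructured)
print(payload)
```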
I hope you found this article helpful for converting Python Pandas DataFrames to nested dictionaries. Each method has its strengths, and the best choice depends on your specific requirements and the structure of your data.
Remember that nested dictionaries are incredibly versatile for representing hierarchical data and are perfect for tasks like creating JSON responses for APIs, organizing complex datasets, or preparing data for visualization tools.

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries, such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn, for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and elsewhere.