Convert a DataFrame to JSON in Python (6 Methods)

Recently, I was working on a data processing project where I needed to send pandas DataFrame data to a web API, which meant converting the DataFrame to JSON first. This is a common requirement when working with APIs, building web applications, or simply storing data in a universally readable format.

In this article, I will cover several easy methods to convert a DataFrame to JSON in Python. I’ll also explain the different parameters you can use to customize your JSON output.

So let’s dive in.

Method 1 – Use the to_json() Method

The simplest way to convert a DataFrame to JSON is by using the DataFrame’s own to_json() method. This method is part of pandas, so it doesn’t require any additional libraries.

Here’s a basic example:

import pandas as pd

# Create a sample DataFrame with US sales data
data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Headphones'],
    'Price': [1299.99, 899.99, 599.99, 249.99],
    'State': ['California', 'New York', 'Texas', 'Florida']
}

df = pd.DataFrame(data)

# Convert DataFrame to JSON
json_result = df.to_json()
print(json_result)

This will output a JSON string with the default ‘columns’ orientation:

{"Product":{"0":"Laptop","1":"Smartphone","2":"Tablet","3":"Headphones"},"Price":{"0":1299.99,"1":899.99,"2":599.99,"3":249.99},"State":{"0":"California","1":"New York","2":"Texas","3":"Florida"}}
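
To see how this default output is structured, it can help to parse the string back into Python objects. The snippet below is a minimal sketch that uses a shortened two-row version of the sample data (an assumption made to keep it brief) and the standard library's json module. It shows that the row labels end up as string keys in the JSON:

```python
import json
import pandas as pd

# Shortened version of the sample data, for illustration only
df = pd.DataFrame({
    'Product': ['Laptop', 'Smartphone'],
    'Price': [1299.99, 899.99],
})

json_result = df.to_json()          # default orient='columns'
parsed = json.loads(json_result)    # back to plain Python dictionaries

# Outer keys are column names; inner keys are the row labels as strings
print(list(parsed.keys()))
print(list(parsed['Price'].keys()))
print(parsed['Product']['0'])
```

Because JSON object keys are always strings, the integer index 0 comes back as "0" after parsing.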

Method 2 – Customize JSON Output with Orientation

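The pandas to_json() method lets you choose the layout of the JSON output through the orient parameter. For a DataFrame, the accepted values are 'split', 'records', 'index', 'columns', 'values', and 'table'. Which one you pick depends on how the consumer of the data expects it to be structured.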

# Convert DataFrame to JSON with records orientation
json_records = df.to_json(orient='records')
print(json_records)

# Convert DataFrame to JSON with split orientation
json_split = df.to_json(orient='split')
print(json_split)

# Convert DataFrame to JSON with index orientation
json_index = df.to_json(orient='index')
print(json_index)

The ‘records’ orientation is particularly useful as it creates an array of objects, which is a common format for APIs:

[{"Product":"Laptop","Price":1299.99,"State":"California"},{"Product":"Smartphone","Price":899.99,"State":"New York"},{"Product":"Tablet","Price":599.99,"State":"Texas"},{"Product":"Headphones","Price":249.99,"State":"Florida"}]
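
To compare the shapes of these orientations side by side, the following sketch parses each result with json.loads and inspects it. It uses a shortened two-row DataFrame (an assumption to keep the example compact) instead of the full sample data:

```python
import json
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Product': ['Laptop', 'Tablet'],
    'Price': [1299.99, 599.99],
})

records = json.loads(df.to_json(orient='records'))
split = json.loads(df.to_json(orient='split'))
index = json.loads(df.to_json(orient='index'))

# 'records': a list with one object per row
print(records[0])

# 'split': separate "columns", "index", and "data" entries
print(sorted(split.keys()))
print(split['data'][0])

# 'index': one object per row, keyed by the row label
print(index['0'])
```

The 'records' layout drops the index entirely, while 'split' and 'index' keep it, which matters if the row labels carry meaning.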

Method 3 – Pretty Printing JSON Output

When working with JSON, it’s often helpful to format it in a more readable way. You can do this by setting the indent parameter:

# Convert DataFrame to JSON with pretty printing
json_pretty = df.to_json(orient='records', indent=4)
print(json_pretty)

This will output nicely formatted JSON:

[
    {
        "Product": "Laptop",
        "Price": 1299.99,
        "State": "California"
    },
    {
        "Product": "Smartphone",
        "Price": 899.99,
        "State": "New York"
    },
    {
        "Product": "Tablet",
        "Price": 599.99,
        "State": "Texas"
    },
    {
        "Product": "Headphones",
        "Price": 249.99,
        "State": "Florida"
    }
]

Method 4 – Save JSON to a File

Often, you’ll want to save your JSON output to a file instead of just printing it. Here’s how to do that:

# Save JSON directly to a file
df.to_json('us_sales_data.json', orient='records', indent=4)

# Alternative method using file handle
with open('us_sales_data.json', 'w') as f:
    f.write(df.to_json(orient='records', indent=4))
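
A quick way to check that the file was written as expected is to load it back with pd.read_json(). The sketch below writes to a temporary directory rather than the working directory (my choice for the example, not a requirement) and uses a shortened DataFrame:

```python
import os
import tempfile
import pandas as pd

# Shortened sample data, for illustration only
df = pd.DataFrame({
    'Product': ['Laptop', 'Tablet'],
    'Price': [1299.99, 599.99],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'us_sales_data.json')

    # Write the DataFrame to disk as a JSON array of row objects
    df.to_json(path, orient='records', indent=4)

    # Read it back using the same orientation
    df_back = pd.read_json(path, orient='records')

print(df_back)
```

Passing the same orient value on both sides keeps the round trip predictable.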

Method 5 – Convert Specific DataFrame Columns to JSON

Sometimes you only need to convert certain columns to JSON:

# Convert only specific columns to JSON
selected_columns = df[['Product', 'Price']]
json_selected = selected_columns.to_json(orient='records')
print(json_selected)

This outputs:

[{"Product":"Laptop","Price":1299.99},{"Product":"Smartphone","Price":899.99},{"Product":"Tablet","Price":599.99},{"Product":"Headphones","Price":249.99}]

Method 6 – Use json_normalize for Nested JSON

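When dealing with more complex data structures, the pandas json_normalize() function can be extremely helpful. It flattens nested dictionaries into columns with dotted names (for example, location.state), so the data fits into a DataFrame that you can then convert back to JSON: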

from pandas import json_normalize

# Create nested data
nested_data = [
    {
        "product": "Laptop",
        "price": 1299.99,
        "location": {
            "state": "California",
            "city": "San Francisco"
        },
        "specs": ["16GB RAM", "1TB SSD", "i7 Processor"]
    },
    {
        "product": "Smartphone",
        "price": 899.99,
        "location": {
            "state": "New York",
            "city": "New York City"
        },
        "specs": ["8GB RAM", "256GB Storage", "5G"]
    }
]

# First convert to DataFrame
df_nested = json_normalize(nested_data)
print(df_nested)

# Then convert back to JSON
json_from_nested = df_nested.to_json(orient='records')
print(json_from_nested)

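This is particularly useful when working with complex API responses that contain nested objects. Keep in mind that the JSON produced from the flattened DataFrame is flat as well: the nested location object becomes separate location.state and location.city keys, while list values such as specs stay as arrays.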
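
The sketch below shows what the flattening looks like in practice. It uses a single shortened record (an assumption for brevity) and calls the function as pd.json_normalize, which is the same function as the json_normalize import used above:

```python
import json
import pandas as pd

# One shortened nested record, for illustration only
nested = [
    {
        "product": "Laptop",
        "location": {"state": "California", "city": "San Francisco"},
        "specs": ["16GB RAM", "1TB SSD"],
    }
]

df_nested = pd.json_normalize(nested)

# Nested dictionary keys become dotted column names
print(sorted(df_nested.columns))

# Converting back to JSON keeps the flattened keys
records = json.loads(df_nested.to_json(orient='records'))
print(records[0]['location.state'])
print(records[0]['specs'])
```

If the receiving system expects the original nested shape, the flat keys would need to be regrouped before sending.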

Date Handling in JSON

One important consideration when working with DataFrames and JSON is date handling:

import pandas as pd
from datetime import datetime

# Create a DataFrame with dates
date_data = {
    'Product': ['Laptop', 'Smartphone'],
    'PurchaseDate': [datetime(2023, 1, 15), datetime(2023, 2, 20)]
}

df_dates = pd.DataFrame(date_data)

# Default date handling (epoch milliseconds)
json_dates_default = df_dates.to_json(orient='records')
print(json_dates_default)

# ISO date format
json_dates_iso = df_dates.to_json(orient='records', date_format='iso')
print(json_dates_iso)

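The date_format parameter allows you to control how dates are represented in your JSON output. With orient='records', the default is 'epoch', which writes each date as the number of milliseconds since January 1, 1970 (UTC). Setting date_format='iso' writes ISO 8601 strings instead, which are usually easier for people and other systems to read.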
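
To make the difference concrete, here is a small sketch that extracts the serialized date from both outputs. It uses a single-row DataFrame (an assumption for brevity). The exact ISO string can vary slightly between pandas versions, for example in whether a trailing Z is appended, so the snippet only prints it:

```python
import json
from datetime import datetime
import pandas as pd

# Single-row DataFrame, for illustration only
df_dates = pd.DataFrame({
    'Product': ['Laptop'],
    'PurchaseDate': [datetime(2023, 1, 15)],
})

# Default output: an integer count of milliseconds since the Unix epoch
default_json = df_dates.to_json(orient='records')
epoch_ms = json.loads(default_json)[0]['PurchaseDate']

# ISO output: a date-time string
iso_json = df_dates.to_json(orient='records', date_format='iso')
iso_text = json.loads(iso_json)[0]['PurchaseDate']

print(epoch_ms)
print(pd.to_datetime(epoch_ms, unit='ms'))  # convert the epoch value back
print(iso_text)
```

Whichever format you choose, the consumer has to know which one to expect, since an epoch value is indistinguishable from any other integer.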

Work with NaN Values

Another common issue is handling NaN (Not a Number) values when converting to JSON:

import numpy as np

# Create a DataFrame with NaN values
nan_data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet'],
    'Price': [1299.99, np.nan, 599.99]
}

df_nan = pd.DataFrame(nan_data)

# Default NaN handling (converts to null)
json_nan = df_nan.to_json(orient='records')
print(json_nan)

By default, NaN values are converted to null in the JSON output.
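
If null is not acceptable to the receiving side, one option is to replace missing values before converting. The sketch below shows both behaviors. The fill value of 0.0 is just a placeholder chosen for the example and may not suit your data:

```python
import json
import numpy as np
import pandas as pd

# Shortened sample data with one missing price, for illustration only
df_nan = pd.DataFrame({
    'Product': ['Laptop', 'Smartphone'],
    'Price': [1299.99, np.nan],
})

# Default behavior: NaN is written as null, which json.loads reads as None
parsed = json.loads(df_nan.to_json(orient='records'))
print(parsed[1]['Price'])

# Alternative: fill missing prices first (0.0 is a placeholder value)
filled = json.loads(df_nan.fillna({'Price': 0.0}).to_json(orient='records'))
print(filled[1]['Price'])
```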

All of these methods work well. The standard to_json() method is quick and versatile for most use cases, while more complex scenarios might require additional parameters or the json_normalize() function for nested structures.

I hope you found this article helpful. Converting DataFrames to JSON is a fundamental skill for anyone working with data in Python, especially when building web applications or working with APIs. These methods should cover most of your JSON conversion needs!
