How to Use @dataclass in Python

I have been writing Python for over a decade, and I remember the days when creating a simple data container felt like a chore.

You had to write the constructor, the string representation, and the equality logic every single time. It was tedious and prone to errors.

When Python 3.7 introduced the @dataclass decorator, it changed how I approach object-oriented programming. It handles the “boilerplate” code for you so you can focus on the logic.

In this guide, I will show you how to use dataclasses based on my years of experience building production-level systems.

What is a Python Dataclass?

A dataclass is a regular Python class that is decorated with @dataclass. This decorator tells Python to automatically generate special methods.

Specifically, it writes the __init__, __repr__, and __eq__ methods for you. This makes your code significantly shorter and easier to maintain.
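
To make the savings concrete, here is a rough sketch comparing a hand-written class with its dataclass counterpart. The names ManualPoint and Point are hypothetical and used only for illustration, and the hand-written methods approximate what the decorator generates rather than reproducing it exactly.

```python
from dataclasses import dataclass


class ManualPoint:
    """Hand-written (approximate) equivalent of the dataclass below."""

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        # Only compare against instances of the exact same class
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


@dataclass
class Point:
    x: int
    y: int


print(Point(1, 2))                             # Point(x=1, y=2)
print(Point(1, 2) == Point(1, 2))              # True
print(ManualPoint(1, 2) == ManualPoint(1, 2))  # True
```

Both classes behave the same for construction, printing, and equality, but the dataclass version needs only the field declarations.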

Create Your First Dataclass

In my early projects, I used to write long classes for simple things like employee records. Now, I use a dataclass to keep things clean.

Suppose you are building a system for a US-based logistics company. You need to track shipments across different states.

from dataclasses import dataclass

@dataclass
class Shipment:
    tracking_id: str
    origin_city: str
    destination_state: str
    weight_lb: float
    is_express: bool = False

# Creating an instance
package = Shipment("TX-9981", "Austin", "New York", 12.5)

print(package)
# Output: Shipment(tracking_id='TX-9981', origin_city='Austin', destination_state='New York', weight_lb=12.5, is_express=False)

I find this incredibly helpful because the print() call gives me readable output right away. With a normal class that has no custom __repr__, you would just see the class name and a memory address.

Add Default Values and Type Hints

One thing I love about dataclasses is that they are built on type hints: every field is declared with an annotation. Python does not check those types at runtime, but the hints make your IDE and static type checkers much smarter.

You can also provide default values, just like I did with is_express in the example above.

When I design systems for US retail apps, I often set default values for things like “currency” or “tax rate” to save time during instantiation.
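
Here is a small sketch of both points, assuming a hypothetical Order class with made-up field names and an illustrative tax rate. It also uses field(default_factory=...), which is the usual way to give a field a mutable default such as a list.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    currency: str = "USD"                      # simple immutable default
    tax_rate: float = 0.0725                   # illustrative value only
    items: list = field(default_factory=list)  # fresh list per instance


first = Order("A-100")
second = Order("A-101")
first.items.append("keyboard")

print(first.currency)  # USD
print(second.items)    # [] (the list is not shared with `first`)

# Type hints are not checked at runtime: this runs without an error,
# although a static type checker would flag it.
odd = Order(12345)
print(type(odd.order_id).__name__)  # int
```

Writing items: list = [] directly would be rejected by the decorator with a ValueError, which is why default_factory is used here.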

Use Post-Initialization for Validation

Sometimes, simply assigning values isn’t enough. You might need to calculate a field or validate data right after the object is created.

I use the __post_init__ method for this. It runs automatically after the generated __init__ finishes.

Let’s say you are tracking inventory for a US retailer and want each product to carry its total value, calculated from its price and quantity.

from dataclasses import dataclass, field

@dataclass
class USProduct:
    name: str
    price: float
    quantity: int
    # Computed in __post_init__, so it is left out of the generated __init__
    total_value: float = field(init=False)

    def __post_init__(self):
        # Validation first: ensure price is never negative
        if self.price < 0:
            raise ValueError(f"Price for {self.name} cannot be negative.")

        # Calculate total value once the inputs are known to be valid
        self.total_value = self.price * self.quantity

laptop = USProduct("MacBook Pro", 1299.99, 2)
print(f"Total Inventory Value: ${laptop.total_value:,.2f}")
# Output: Total Inventory Value: $2,599.98

In my experience, using __post_init__ is much cleaner than overriding the constructor manually.

Make Dataclasses Immutable with Frozen

There are times when you want to ensure that once a piece of data is set, it cannot be changed. This is common when handling financial transactions or fixed configuration settings.

You can do this by setting frozen=True in the decorator.

@dataclass(frozen=True)
class TaxRate:
    state: str
    rate: float

# This works fine
california_tax = TaxRate("CA", 0.0725)

# This will raise a FrozenInstanceError
# california_tax.rate = 0.08 

I personally use frozen dataclasses whenever I’m passing data between different parts of a large application to prevent accidental bugs.
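
The sketch below shows what the blocked assignment looks like in practice, and one way to get an "updated" copy without mutating the original: dataclasses.replace(). The rate values are illustrative only.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class TaxRate:
    state: str
    rate: float


california_tax = TaxRate("CA", 0.0725)

try:
    california_tax.rate = 0.08
except FrozenInstanceError as exc:
    print(f"Blocked: {exc}")

# replace() builds a new instance; the original is untouched.
updated_tax = replace(california_tax, rate=0.08)
print(california_tax.rate)  # 0.0725
print(updated_tax.rate)     # 0.08

# Frozen dataclasses (with the default eq=True) are hashable,
# so they can be used as dictionary keys.
labels = {california_tax: "current", updated_tax: "proposed"}
print(labels[TaxRate("CA", 0.0725)])  # current
```

Because equal frozen instances hash the same, a freshly built TaxRate("CA", 0.0725) finds the entry stored under california_tax.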

Convert Dataclasses to Dictionaries or Tuples

When I work with APIs or databases, I often need to convert my objects into dictionaries. The dataclasses module provides built-in functions for this.

Instead of writing a custom to_dict() method, you can use asdict or astuple.

from dataclasses import asdict, astuple, dataclass

@dataclass
class UserProfile:
    username: str
    email: str
    zip_code: int

user = UserProfile("jdoe_nyc", "jdoe@example.com", 10001)

# Convert to Dictionary
user_dict = asdict(user)
print(user_dict) # {'username': 'jdoe_nyc', 'email': 'jdoe@example.com', 'zip_code': 10001}

# Convert to Tuple
user_tuple = astuple(user)
print(user_tuple) # ('jdoe_nyc', 'jdoe@example.com', 10001)

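This is a lifesaver when you need to serialize data to JSON for a web service: the dictionary returned by asdict() can be passed straight to json.dumps(), as long as its values are JSON-compatible types.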
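
As a minimal sketch of that JSON step, the example below combines asdict() with json.dumps() from the standard library. The Customer and Address classes are hypothetical, chosen to show that asdict() also converts nested dataclasses.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Address:
    city: str
    state: str


@dataclass
class Customer:
    name: str
    address: Address


customer = Customer("Jane Doe", Address("Austin", "TX"))

# asdict() recurses into nested dataclasses, so the result is plain dicts.
payload = json.dumps(asdict(customer))
print(payload)
# {"name": "Jane Doe", "address": {"city": "Austin", "state": "TX"}}
```

Fields holding types that json cannot encode on its own, such as datetime or Decimal, would still need a custom default function passed to json.dumps().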

Compare Objects and Sorting

By default, dataclasses compare instances by looking at their fields. If all fields are the same, the objects are considered equal.

If you want to sort them (for example, sorting a list of employees by their salary), you can set order=True.

@dataclass(order=True)
class RealEstateListing:
    # We want to sort by price primarily, so it goes first
    price: int
    address: str
    sq_ft: int

house1 = RealEstateListing(450000, "123 Maple St, Seattle", 2100)
house2 = RealEstateListing(520000, "456 Oak Ave, Portland", 2500)

print(house1 < house2) # Output: True

When order=True is used, Python generates the comparison methods __lt__, __le__, __gt__, and __ge__. Instances are compared as if they were tuples of their fields, in the order the fields are defined.
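
To see the field-order comparison at work on a whole list, here is a sketch using sorted(). The Listing class is a hypothetical variant of the one above. It also uses field(compare=False) to leave the address out of comparisons, which is one option when a field should not affect ordering or equality.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Listing:
    price: int
    sq_ft: int
    # Excluded from ==, <, > and friends; still shown in the repr.
    address: str = field(compare=False)


listings = [
    Listing(520000, 2500, "456 Oak Ave, Portland"),
    Listing(450000, 2100, "123 Maple St, Seattle"),
    Listing(450000, 1800, "789 Pine Rd, Boise"),
]

# Sorted by price, then by sq_ft to break the tie.
for listing in sorted(listings):
    print(listing.price, listing.sq_ft, listing.address)
# 450000 1800 789 Pine Rd, Boise
# 450000 2100 123 Maple St, Seattle
# 520000 2500 456 Oak Ave, Portland
```

Note that with compare=False, two listings with the same price and square footage count as equal even if their addresses differ, so only exclude a field when that is what you want.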

Inherit from Dataclasses

You can inherit from a dataclass just like a normal class. This is useful for creating specialized versions of a data model.

In a US payroll system, I might have a base Employee class and specialized subclasses for different roles.

@dataclass
class Employee:
    name: str
    employee_id: int

@dataclass
class Manager(Employee):
    department: str
    bonus_eligible: bool = True

mgr = Manager("Sarah Jenkins", 5005, "Engineering")
print(mgr)

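One thing to watch out for: if the parent class has fields with default values, every field you add in the child class must also have a default value. Otherwise, Python raises a TypeError when the class is defined, because a field without a default would follow one with a default in the generated __init__.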

Using dataclasses has saved me countless hours of writing repetitive code. They make Python scripts more readable and help prevent the “forgot to update __repr__” bug.

Whether you are building a small script or a large-scale application, I highly recommend making @dataclass a standard part of your toolkit.
