dataclasses
Python 3.7 introduced a new feature: dataclasses.
For reference, a class is a blueprint for creating objects. For example, we could define a Country class and use it to create various instances, such as Monaco and Gambia.
When initializing values, the properties supplied to the constructor (like population, languages, and so on) are copied into each object instance:
```python
class Country:
    def __init__(self, name: str, population: int, continent: str, official_lang: str):
        self.name = name
        self.population = population
        self.continent = continent
        self.official_lang = official_lang


# The constructor takes four values, so each call supplies all of them.
smallestEurope = Country("Monaco", 37623, "Europe", "French")
smallestAsia = Country("Maldives", 552595, "Asia", "Dhivehi")
smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
```
If you have ever worked with object-oriented programming (OOP) in languages like Java and Python, you should already be familiar with classes.
A dataclass, however, comes with the basic class functionality already implemented, decreasing the time spent writing code.
In this article, we’ll delve further into what dataclasses in Python are, how to manipulate object fields, how to sort and compare dataclasses, and more.
Note that because dataclasses were introduced in Python 3.7, you must have Python 3.7 or later installed on your local machine to use them.
What is a Python dataclass?
As mentioned previously, Python dataclasses are very similar to normal classes, but they come with common class functionality already implemented, which significantly decreases the amount of boilerplate code you have to write.
An example of such boilerplate is the __init__ method.
In the Country class example, we had to manually define the __init__ method, which gets called when you create an instance of the class. For every normal class that stores attributes this way, you have to write this method yourself, which means a lot of repetitive code.
A Python dataclass comes with this method already defined. So, you can write the same Country class without manually defining a constructor.
Under the hood, the @dataclass decorator generates an __init__ method from the declared fields, and Python calls it when you create an object with new property values.
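To see what the decorator generates, you can inspect the constructor it wrote. The following is a minimal sketch that uses the standard library's inspect module; the trimmed-down, two-field Country class is an assumption made for brevity, not the class used elsewhere in this article.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Country:
    name: str
    population: int


# The decorator wrote __init__ for us; list the parameters it accepts.
init_params = list(inspect.signature(Country.__init__).parameters)
print(init_params)  # ['self', 'name', 'population']

# The generated constructor copies each argument onto the instance.
monaco = Country("Monaco", 37623)
print(monaco.name, monaco.population)  # Monaco 37623
```

The parameter names follow the field declarations in order, which is why the order of the fields in the class body matters.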
Note that __init__ is not the only method provided. Utility methods like __repr__ (representation) and __eq__ (equal to) are also generated by default, and ordering methods such as __lt__ (less than) and __gt__ (greater than) are generated when you pass order=True to the decorator.
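As a rough check of which methods a given decorator call provides, the sketch below compares a plain @dataclass with one that passes order=True. The class names Plain and Ordered are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Plain:
    population: int


@dataclass(order=True)
class Ordered:
    population: int


# __eq__ and __repr__ are generated without any extra arguments.
plain_equal = Plain(1) == Plain(1)
plain_repr = repr(Plain(1))
print(plain_equal)  # True

# Ordering methods are not generated unless order=True is passed,
# so "<" on two Plain instances raises TypeError.
try:
    Plain(1) < Plain(2)
    plain_is_orderable = True
except TypeError:
    plain_is_orderable = False
print(plain_is_orderable)  # False

ordered_less = Ordered(1) < Ordered(2)
print(ordered_less)  # True
```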
When working with a normal class in Python, we have longer code to implement the base methods.
Consider the Country class again. In the code block below, you can see a couple of methods, starting with the __init__ method. This method initializes attributes like the country name, population count, continent, and official language on a Country instance.
__repr__ returns the string representation of a class instance. This prints the attributes of each class instance in string form.
__lt__ compares the population of two Country instances and returns True if the present instance has a smaller population, while __eq__ returns True if they both have the same population count:
```python
class Country:
    def __init__(self, name: str, population: int, continent: str, official_lang: str = "English"):
        self.name = name
        self.population = population
        self.continent = continent
        self.official_lang = official_lang

    def __repr__(self):
        # !r adds the quotes around string values in the output
        return (
            f"Country(name={self.name!r}, population={self.population!r}, "
            f"continent={self.continent!r}, official_lang={self.official_lang!r})"
        )

    def __lt__(self, other):
        return self.population < other.population

    def __eq__(self, other):
        return self.population == other.population


smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
smallestEurope = Country("Monaco", 37623, "Europe", "French")
smallestAsia1 = Country("Maldives", 552595, "Asia", "Dhivehi")
smallestAsia2 = Country("Maldives", 552595, "Asia", "Dhivehi")

print(smallestAfrica)
# Country(name='Gambia', population=2521126, continent='Africa', official_lang='English')
print(smallestAsia1 < smallestAfrica)  # True
print(smallestAsia1 > smallestAfrica)  # False
```
Using a Python dataclass
To use a dataclass in your code, import the dataclass decorator from the dataclasses module and apply @dataclass on top of the class. This injects the base class functionality into our class automatically.
In the following example, we’ll create the same Country class, but with far less code:
```python
from dataclasses import dataclass


@dataclass(order=True)
class Country:
    name: str
    population: int
    continent: str
    official_lang: str


smallestAfrica = Country("Gambia", 2521126, "Africa", "English")
smallestEurope = Country("Monaco", 37623, "Europe", "French")
smallestAsia1 = Country("Maldives", 552595, "Asia", "Dhivehi")
smallestAsia2 = Country("Maldives", 552595, "Asia", "Dhivehi")

print(smallestAfrica)
# Country(name='Gambia', population=2521126, continent='Africa', official_lang='English')
print(smallestAsia1 == smallestAsia2)  # True
print(smallestAsia1 < smallestAfrica)  # False
```
Observe that we didn’t define a constructor method on the dataclass; we just defined the fields.
We also omitted helpers like __repr__ and __eq__. Despite the omission of these methods, the class still runs normally.
Note that for less than (<), a dataclass created with order=True uses its default comparison, which compares the fields as a tuple in the order they are defined. Here, that means the names are compared first, which is why the last comparison prints False. Later on in this article, we will learn how to customize object comparison for better results.
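The default ordering can be illustrated by comparing the result of < with a comparison of the same objects converted to tuples. This is a minimal sketch using astuple() from the dataclasses module; the two-field Country class is a simplification assumed for the example.

```python
from dataclasses import astuple, dataclass


@dataclass(order=True)
class Country:
    name: str
    population: int


gambia = Country("Gambia", 2521126)
maldives = Country("Maldives", 552595)

# The generated __lt__ compares fields in definition order, like a tuple.
by_dataclass = maldives < gambia
by_tuple = astuple(maldives) < astuple(gambia)

# "Maldives" sorts after "Gambia", so the populations are never consulted.
print(by_dataclass, by_tuple)  # False False
```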
The field() function
The dataclasses module also provides a function called field(). This function gives you fine-grained control over the class fields, allowing you to customize them as you wish.
For example, we can exclude the continent field from the string representation by passing field() a repr parameter set to False:
```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str = field(repr=False)  # omits the field
    official_lang: str


smallestEurope = Country("Monaco", 37623, "Europe", "French")
print(smallestEurope)
# Country(name='Monaco', population=37623, official_lang='French')
```
This code then outputs in the CLI:
Country(name='Monaco', population=37623, official_lang='French')
By default, repr is set to True. Here are some other parameters that field() accepts.
The init parameter
The init parameter specifies whether an attribute should be included as an argument to the constructor during initialization. If you set a field to init=False, then you must omit the attribute during initialization. Otherwise, a TypeError will be raised:
```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(init=False)  # Do not pass this attribute to the constructor


smallestEurope = Country("Monaco", 37623, "Europe", "English")  # But you did, so error!
print(smallestEurope)
```
Running this code raises a TypeError, because the constructor receives one more positional argument than it accepts.
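A field excluded from the constructor still needs a value from somewhere. One option, sketched below, is to combine init=False with a default; the try/except block is only there to capture the error message without stopping the script.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    # Not accepted by the constructor; the default fills it in instead.
    official_lang: str = field(init=False, default="English")


monaco = Country("Monaco", 37623, "Europe")
print(monaco.official_lang)  # English

# Passing the excluded attribute anyway is rejected with a TypeError.
try:
    Country("Monaco", 37623, "Europe", "French")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The exact wording of the error message depends on the Python version, so it is printed here rather than assumed.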
The default parameter
The default parameter specifies a default value for a field in case a value is not provided during initialization:
```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(default="English")  # If you omit the value, English is used


smallestEurope = Country("Monaco", 37623, "Europe")  # Omitted, so English is used
print(smallestEurope)
```
This code then outputs in the CLI:
Country(name='Monaco', population=37623, continent='Europe', official_lang='English')
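One caveat with default is that mutable values such as lists are rejected by the dataclasses module, which raises a ValueError and points to the related default_factory parameter instead. The sketch below assumes a hypothetical languages field, which is not part of the Country class used in this article, to show that each instance gets its own list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Country:
    name: str
    # default_factory is called once per instance, so lists are not shared.
    languages: List[str] = field(default_factory=list)


monaco = Country("Monaco")
gambia = Country("Gambia")
monaco.languages.append("French")

print(monaco.languages)  # ['French']
print(gambia.languages)  # []
```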
The repr parameter
The repr parameter specifies whether the field should be included (repr=True) or excluded (repr=False) from the string representation generated by the __repr__ method:
```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str
    official_lang: str = field(repr=False)  # This field is excluded from the string representation


smallestEurope = Country("Monaco", 37623, "Europe", "French")
print(smallestEurope)
```
This code then outputs in the CLI:
Country(name='Monaco', population=37623, continent='Europe')
__post_init__
The __post_init__ method is called just after initialization. In other words, it is called after the object receives values for its fields, such as name, continent, population, and official_lang.
For example, we will use the method to determine if we are going to migrate to a country or not, based on the country’s official language:
```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int
    continent: str = field(repr=False)  # Excludes the continent field from the string representation
    will_migrate: bool = field(init=False)  # Initialize without the will_migrate attribute
    official_lang: str = field(default="English")  # Sets the default language. Attributes with default values must appear last

    def __post_init__(self):
        # Use assignment (=), not comparison (==), to set the attribute
        if self.official_lang == "English":
            self.will_migrate = True
        else:
            self.will_migrate = False
```
After the object initializes with values, we check from inside __post_init__ whether the official_lang field is set to English. If so, we set the will_migrate property to True. Otherwise, we set it to False.
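The effect of __post_init__ is easiest to see by creating two instances and reading the derived attribute. This is a reduced sketch of the class above, keeping only the fields needed for the check; the shortened field list is an assumption made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    will_migrate: bool = field(init=False)  # Set in __post_init__, not by the caller
    official_lang: str = "English"

    def __post_init__(self):
        # Runs right after the generated __init__ has assigned the fields.
        self.will_migrate = self.official_lang == "English"


gambia = Country("Gambia")
monaco = Country("Monaco", "French")

print(gambia.will_migrate)  # True
print(monaco.will_migrate)  # False
```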
Sorting and comparing dataclasses with sort_index
Another functionality of dataclasses is the ability to create a custom order for comparing objects and sorting lists of objects.
For example, we can compare two countries by their population numbers. In other words, we want to say that one country is greater than another country if, and only if, its population count is greater than the other’s:
```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Country:
    sort_index: int = field(init=False)
    name: str
    population: int = field(repr=True)
    continent: str
    official_lang: str = field(default="English")  # Sets default value for official language

    def __post_init__(self):
        self.sort_index = self.population


smallestEurope = Country("Monaco", 37623, "Europe")
smallestAsia = Country("Maldives", 552595, "Asia")
smallestAfrica = Country("Gambia", 2521126, "Africa")

print(smallestAsia < smallestAfrica)  # True
print(smallestAsia > smallestAfrica)  # False
```
To enable comparison and sorting in a Python dataclass, you must pass order=True to the @dataclass decorator. This enables the default comparison functionality.
Since we want to compare by population count, we assign the value of the population field to sort_index after initialization, from inside the __post_init__ method. Because fields are compared in the order they are defined, sort_index must be the first field in the class for it to decide the result.
You can also sort a list of objects using a particular field as the sort_index. For example, we can sort a list of countries by their population count:
```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Country:
    sort_index: int = field(init=False)
    name: str
    population: int = field(repr=True)
    continent: str
    official_lang: str = field(default="English")

    def __post_init__(self):
        self.sort_index = self.population


europe = Country("Monaco", 37623, "Europe", "French")
asia = Country("Maldives", 552595, "Asia", "Dhivehi")
africa = Country("Gambia", 2521126, "Africa", "English")
sAmerica = Country("Suriname", 539000, "South America", "Dutch")
nAmerica = Country("St Kitts and Nevis", 55345, "North America", "English")
oceania = Country("Nauru", 11000, "Oceania", "Nauruan")

mylist = [europe, asia, africa, sAmerica, nAmerica, oceania]
mylist.sort()
print(mylist)  # Prints the list of countries sorted by population count
```
This code prints the countries in ascending order of population: Nauru, Monaco, St Kitts and Nevis, Suriname, Maldives, and Gambia.
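Because the ordering lives on the class itself, the built-in sorted() function works as well and leaves the original list untouched. The sketch below, on a reduced version of the class, sorts from largest to smallest with reverse=True and also hides sort_index from the string representation with repr=False; both choices are illustrative assumptions rather than part of the example above.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Country:
    # First field, so it decides comparisons; hidden from the repr.
    sort_index: int = field(init=False, repr=False)
    name: str
    population: int

    def __post_init__(self):
        self.sort_index = self.population


countries = [
    Country("Monaco", 37623),
    Country("Gambia", 2521126),
    Country("Nauru", 11000),
]

# sorted() returns a new list; reverse=True puts the largest first.
largest_first = sorted(countries, reverse=True)
names = [country.name for country in largest_first]
print(names)  # ['Gambia', 'Monaco', 'Nauru']
```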
Don’t want the dataclass to be tampered with? You can freeze the class by simply passing frozen=True to the decorator:
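```python
from dataclasses import dataclass, field


@dataclass(order=True, frozen=True)
class Country:
    sort_index: int = field(init=False)
    name: str
    population: int = field(repr=True)
    continent: str
    official_lang: str = field(default="English")

    def __post_init__(self):
        # A frozen dataclass blocks normal assignment, even in __post_init__,
        # so set the value through object.__setattr__ instead.
        object.__setattr__(self, "sort_index", self.population)
```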
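To illustrate what freezing does in practice, the following sketch tries to change a field on a frozen instance and catches the resulting FrozenInstanceError, which the dataclasses module raises on assignment. The two-field Country class is a reduced version assumed for the example.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(order=True, frozen=True)
class Country:
    sort_index: int = field(init=False, repr=False)
    name: str
    population: int

    def __post_init__(self):
        # Normal assignment is blocked on frozen instances, so bypass it here.
        object.__setattr__(self, "sort_index", self.population)


monaco = Country("Monaco", 37623)

# Any attempt to reassign a field after creation is rejected.
try:
    monaco.population = 40000
    was_blocked = False
except FrozenInstanceError:
    was_blocked = True

print(was_blocked)  # True
print(monaco.population)  # 37623
```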
A Python dataclass is a powerful feature that drastically reduces the amount of code in class definitions. The dataclasses module generates most of the basic class methods for you. You can also customize the fields in a dataclass and restrict certain actions, such as modifying an instance after it is created.