cattrs
cattrs is an open source Python library for structuring and unstructuring data. cattrs works best with attrs classes, dataclasses and the usual Python collections, but other kinds of classes are supported by manually registering converters.
Python has a rich set of powerful, easy to use, built-in data types like dictionaries, lists and tuples. These data types are also the lingua franca of most data serialization libraries, for formats like json, msgpack, cbor, yaml or toml.
Data types like this, and mappings like dicts in particular, represent unstructured data. Your data is, in all likelihood, structured: not all combinations of field names or values are valid inputs to your programs. In Python, structured data is better represented with classes and enumerations. attrs is an excellent library for declaratively describing the structure of your data, and validating it.
When you're handed unstructured data (by your network, file system, database...), cattrs helps to convert this data into structured data. When you have to convert your structured data into data types other libraries can handle, cattrs turns your classes and enumerations into dictionaries, integers and strings.
Here's a simple taste. The list containing a float, an int and a string gets converted into a tuple of three ints.
>>> import cattrs
>>> cattrs.structure([1.0, 2, "3"], tuple[int, int, int])
(1, 2, 3)
cattrs works well with attrs classes out of the box.
>>> from attrs import frozen
>>> import cattrs
>>> @frozen  # It works with non-frozen classes too.
... class C:
...     a: int
...     b: str
>>> instance = C(1, 'a')
>>> cattrs.unstructure(instance)
{'a': 1, 'b': 'a'}
>>> cattrs.structure({'a': 1, 'b': 'a'}, C)
C(a=1, b='a')
Here's a much more complex example, involving attrs classes with type metadata.
>>> from enum import unique, Enum
>>> from typing import Optional, Sequence, Union
>>> from cattrs import structure, unstructure
>>> from attrs import define, field
>>> @unique
... class CatBreed(Enum):
...     SIAMESE = "siamese"
...     MAINE_COON = "maine_coon"
...     SACRED_BIRMAN = "birman"
>>> @define
... class Cat:
...     breed: CatBreed
...     names: Sequence[str]
>>> @define
... class DogMicrochip:
...     chip_id = field()  # Type annotations are optional, but recommended
...     time_chipped: float = field()
>>> @define
... class Dog:
...     cuteness: int
...     chip: Optional[DogMicrochip] = None
>>> p = unstructure([Dog(cuteness=1, chip=DogMicrochip(chip_id=1, time_chipped=10.0)),
...                  Cat(breed=CatBreed.MAINE_COON, names=('Fluffly', 'Fluffer'))])
>>> print(p)
[{'cuteness': 1, 'chip': {'chip_id': 1, 'time_chipped': 10.0}}, {'breed': 'maine_coon', 'names': ('Fluffly', 'Fluffer')}]
>>> print(structure(p, list[Union[Dog, Cat]]))
[Dog(cuteness=1, chip=DogMicrochip(chip_id=1, time_chipped=10.0)), Cat(breed=<CatBreed.MAINE_COON: 'maine_coon'>, names=['Fluffly', 'Fluffer'])]
Consider unstructured data a low-level representation that needs to be converted to structured data to be handled, and use structure. When you're done, unstructure the data to its unstructured form and pass it along to another library or module. Use attrs type metadata to annotate attributes, so cattrs will know how to structure and unstructure them.
- Free software: MIT license
- Documentation: https://catt.rs
- Python versions supported: 3.8 and up. (Older Python versions are supported by older versions; see the changelog.)
Features
- Converts structured data into unstructured data, recursively:
  - attrs classes and dataclasses are converted into dictionaries in a way similar to attrs.asdict, or into tuples in a way similar to attrs.astuple.
  - Enumeration instances are converted to their values.
  - Other types are let through without conversion. This includes types such as integers, dictionaries, lists and instances of non-attrs classes.
  - Custom converters for any type can be registered using register_unstructure_hook.
- Converts unstructured data into structured data, recursively, according to your specification given as a type. The following types are supported:
  - typing.Optional[T].
  - typing.List[T], typing.MutableSequence[T], typing.Sequence[T] (converts to a list).
  - typing.Tuple (both variants, Tuple[T, ...] and Tuple[X, Y, Z]).
  - typing.MutableSet[T], typing.Set[T] (converts to a set).
  - typing.FrozenSet[T] (converts to a frozenset).
  - typing.Dict[K, V], typing.MutableMapping[K, V], typing.Mapping[K, V] (converts to a dict).
  - typing.TypedDict.
  - attrs classes with simple attributes and the usual __init__.
    - Simple attributes are attributes that can be assigned unstructured data, like numbers, strings, and collections of unstructured data.
  - All attrs classes and dataclasses with the usual __init__, if their complex attributes have type metadata.
  - typing.Unions of supported attrs classes, given that all of the classes have a unique field.
  - typing.Unions of anything, given that you provide a disambiguation function for it.
  - Custom converters for any type can be registered using register_structure_hook.
- cattrs comes with preconfigured converters for a number of serialization libraries, including json, msgpack, cbor2, bson, yaml and toml. For details, see the cattrs.preconf package.
Design Decisions
cattrs is based on a few fundamental design decisions.
- Un/structuring rules are separate from the models. This allows models to have a one-to-many relationship with un/structuring rules, and lets you create un/structuring rules for models which you do not own and cannot change. (cattrs can be configured to use un/structuring rules from models using the use_class_methods strategy.)
- Invent as little as possible; reuse existing ordinary Python instead. For example, cattrs did not have a custom exception type to group exceptions until the sanctioned Python exceptiongroups became available. A side-effect of this design decision is that, in a lot of cases, when you're solving cattrs problems you're actually learning Python instead of learning cattrs.
- Refuse the temptation to guess. If there are two ways of solving a problem, cattrs should refuse to guess and let the user configure it themselves.
A foolish consistency is the hobgoblin of little minds, so these decisions can be, and sometimes are, broken, but they have proven to be a good foundation.
Additional documentation and talks
- On structured and unstructured data, or the case for cattrs
- Why I use attrs instead of pydantic
- cattrs I: un/structuring speed
- Python has a macro language - it's Python (PyCon IT 2022)
- Intro to cattrs 23.1
Credits
Major credits to Hynek Schlawack for creating attrs and its predecessor, characteristic.
cattrs is tested with Hypothesis, by David R. MacIver.
cattrs is benchmarked using perf and pytest-benchmark.
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.