Clumper
A small Python library that can clump lists of nested data together.
Part of a video series on calmcode.io.
Base Example
Clumper allows you to quickly parse through a list of json-like data.
Here's an example of such a dataset.
pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    ...
]
Given this list of dictionaries, we can write the following query:
from clumper import Clumper

clump = Clumper.read_json('https://calmcode.io/datasets/pokemon.json')

(clump
  .keep(lambda d: len(d['type']) == 1)
  .mutate(type=lambda d: d['type'][0],
          ratio=lambda d: d['attack']/d['hp'])
  .select('name', 'type', 'ratio')
  .sort(lambda d: d['ratio'], reverse=True)
  .head(5)
  .collect())
What this code does line-by-line.
This code will perform the following steps.
- It imports Clumper.
- It fetches a list of json-blobs about pokemon from the internet.
- It removes all the pokemon that have more than 1 type.
- The dictionaries that are left will have their type now as a string instead of a list of strings.
- The dictionaries that are left will also have a property called ratio, which is attack divided by hp.
- All the keys besides name, type and ratio are removed.
- The collection is sorted by ratio, from high to low.
- We grab the top 5 after sorting.
- The results are returned as a list of dictionaries.
This is what we get back:
[{'name': 'Diglett', 'type': 'Ground', 'ratio': 5.5},
{'name': 'DeoxysAttack Forme', 'type': 'Psychic', 'ratio': 3.6},
{'name': 'Krabby', 'type': 'Water', 'ratio': 3.5},
{'name': 'DeoxysNormal Forme', 'type': 'Psychic', 'ratio': 3.0},
{'name': 'BanetteMega Banette', 'type': 'Ghost', 'ratio': 2.578125}]
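To make the steps listed above concrete, here is a rough plain-Python equivalent that does not use Clumper at all. It is only a sketch of what each verb does conceptually, not how the library implements it. The first two rows come from the example dataset above; the Diglett and Krabby rows use assumed hp and attack values chosen to match the ratios in the output, and the sketch keeps the top 2 instead of the top 5 because the list is so short.

```python
# A tiny in-memory stand-in for the pokemon dataset.
# The Diglett and Krabby stats are illustrative assumptions.
pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    {'name': 'Diglett', 'type': ['Ground'], 'hp': 10, 'attack': 55},
    {'name': 'Krabby', 'type': ['Water'], 'hp': 30, 'attack': 105},
]

# keep: only the pokemon with exactly one type
kept = [d for d in pokemon if len(d['type']) == 1]

# mutate: turn `type` into a string and add `ratio` (attack divided by hp)
mutated = [
    {**d, 'type': d['type'][0], 'ratio': d['attack'] / d['hp']}
    for d in kept
]

# select: drop every key except name, type and ratio
selected = [{k: d[k] for k in ('name', 'type', 'ratio')} for d in mutated]

# sort: order by ratio, from high to low
ordered = sorted(selected, key=lambda d: d['ratio'], reverse=True)

# head: take the first 2 items (the Clumper query above takes 5)
result = ordered[:2]

print(result)
```

Compared with this version, the Clumper query reads top to bottom as a single chain and needs no intermediate variables.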
Documentation
We've got a lovely documentation page that explains how the library works.
Features
- This library has no dependencies besides a modern version of Python.
- The library offers a pattern of verbs that are very expressive.
- You can write code from top to bottom, left to right.
- You can read in many json/yaml/csv files by using a wildcard *.
- MIT License
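To illustrate what reading many files with a wildcard means, here is a minimal standard-library sketch. The helper `read_json_wildcard` is hypothetical and is not part of Clumper; it only assumes that each matching file holds either a list of dictionaries or a single dictionary, and it merges everything into one list.

```python
import glob
import json
import os
import tempfile


def read_json_wildcard(pattern):
    """Hypothetical helper: load every JSON file matching `pattern` into one list."""
    records = []
    # Sort the matches so the result order is deterministic.
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            data = json.load(f)
        # Assumption: a file holds a list of dicts or a single dict.
        records.extend(data if isinstance(data, list) else [data])
    return records


# Demo: write two small files to a temporary folder, then read them back.
tmpdir = tempfile.mkdtemp()
parts = [[{'name': 'Bulbasaur'}], [{'name': 'Charmander'}]]
for i, rows in enumerate(parts):
    with open(os.path.join(tmpdir, f'part-{i}.json'), 'w') as f:
        json.dump(rows, f)

combined = read_json_wildcard(os.path.join(tmpdir, 'part-*.json'))
print(combined)
```

A combined list like this is the kind of json-like collection that the verbs shown earlier operate on.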
Installation
You can install this package via pip.
pip install clumper
It may be safer, however, to install via:
python -m pip install clumper
For details on why, check out this resource.
There are some optional dependencies that you might want to install as well.
python -m pip install clumper[yaml]
Contributing
Make sure you check out the issue list before you make a pull request, to avoid duplicate work. To get started locally, you can clone the repo and get set up quickly using the Makefile.
git clone git@github.com:koaning/clumper.git
cd clumper
make install-dev