Language: Python
License: Apache License 2.0

S'More speed for Marshmallow

🔥toastedmarshmallow🔥: Makes Marshmallow Toasty Fast

Toasted Marshmallow implements a JIT for marshmallow that speeds up dumping objects 10-25X (depending on your schema). Toasted Marshmallow allows you to have the great API that Marshmallow provides without having to sacrifice performance!

Benchmark Result:
  Original Time: 2682.61 usec/dump
  Optimized Time: 176.38 usec/dump
  Speed up: 15.21x

Even PyPy benefits from toastedmarshmallow!

Benchmark Result:
  Original Time: 189.78 usec/dump
  Optimized Time: 20.03 usec/dump
  Speed up: 9.48x

Installing toastedmarshmallow

pip install toastedmarshmallow

This will also install a slightly forked marshmallow that includes some hooks Toastedmarshmallow needs to enable the JIT to run before falling back to the original marshmallow code. These changes are minimal, which makes it easier to track upstream. You can find the changes here.

This means you should remove marshmallow from your requirements and replace it with toastedmarshmallow. By default, nothing changes until you explicitly enable Toasted Marshmallow.

Enabling Toasted Marshmallow

Enabling Toasted Marshmallow on an existing Schema takes just one line of code: set the jit property on any Schema instance to toastedmarshmallow.Jit. For example:

from datetime import date
import toastedmarshmallow
from marshmallow import Schema, fields, pprint

class ArtistSchema(Schema):
    name = fields.Str()

class AlbumSchema(Schema):
    title = fields.Str()
    release_date = fields.Date()
    artist = fields.Nested(ArtistSchema())

schema = AlbumSchema()
# Specify the jit method as toastedmarshmallow's jit
schema.jit = toastedmarshmallow.Jit
# And that's it!  Your dump methods are 15x faster!

It's also possible to use the Meta class on the Marshmallow schema to specify all instances of a given Schema should be optimized:

import toastedmarshmallow
from marshmallow import Schema, fields, pprint

class ArtistSchema(Schema):
    class Meta:
        jit = toastedmarshmallow.Jit
    name = fields.Str()

You can also enable Toasted Marshmallow globally by setting the environment variable MARSHMALLOW_SCHEMA_DEFAULT_JIT to toastedmarshmallow.Jit. Future versions of Toasted Marshmallow may make this the default.
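If the environment cannot be configured outside the process, the variable can also be set from Python. This is a minimal sketch. It assumes, without confirmation from this README, that the variable has to be in place before marshmallow is imported and any schemas are created:

```python
import os

# Assumption: the forked marshmallow reads this variable when schemas are
# set up, so set it as early as possible, before importing marshmallow.
os.environ["MARSHMALLOW_SCHEMA_DEFAULT_JIT"] = "toastedmarshmallow.Jit"

# Import marshmallow and your schema modules only after this point.
```

Setting the variable in the shell or in the deployment configuration achieves the same thing without depending on import order.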

How it works

Toasted Marshmallow works by generating code at runtime to optimize dumping objects without going through layers and layers of reflection. The generated code optimistically assumes the objects being passed in match the schema, falling back to the original marshmallow code on failure.
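The general idea of "generate a specialized function, fall back on failure" can be sketched with the standard library alone. This is an illustration of the technique, not Toasted Marshmallow's actual implementation; the names generic_dump, build_fast_dump and make_dump are hypothetical, and field names are assumed to be valid Python identifiers:

```python
def generic_dump(obj, field_names):
    """Slow path: look up every field reflectively, tolerating missing ones."""
    result = {}
    for name in field_names:
        if isinstance(obj, dict):
            if name in obj:
                result[name] = obj[name]
        elif hasattr(obj, name):
            result[name] = getattr(obj, name)
    return result


def build_fast_dump(field_names):
    """Generate and compile a function specialized for these field names."""
    lines = ["def fast_dump(obj):", "    res = {}"]
    for name in field_names:
        # Direct attribute access, with no per-field reflection at call time.
        lines.append(f"    res[{name!r}] = obj.{name}")
    lines.append("    return res")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["fast_dump"]


def make_dump(field_names):
    fast_dump = build_fast_dump(field_names)

    def dump(obj):
        try:
            # Optimistic path: assume obj has every field as an attribute.
            return fast_dump(obj)
        except AttributeError:
            # The assumption failed, so use the generic code instead.
            return generic_dump(obj, field_names)

    return dump


class Album:
    def __init__(self, title, release_date):
        self.title = title
        self.release_date = release_date


dump = make_dump(["title", "release_date"])
print(dump(Album("Hunky Dory", "1971-12-17")))  # served by the generated code
print(dump({"title": "Low"}))                   # served by the fallback
```

The generated function is built once in make_dump and reused for every call, which is where the savings come from.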

For example, taking AlbumSchema from above, Toastedmarshmallow will generate the following 3 methods:

def InstanceSerializer(obj):
    res = {}
    value = obj.release_date; value = value() if callable(value) else value; res["release_date"] = _field_release_date__serialize(value, "release_date", obj)
    value = obj.artist; value = value() if callable(value) else value; res["artist"] = _field_artist__serialize(value, "artist", obj)
    value = obj.title; value = value() if callable(value) else value; value = str(value) if value is not None else None; res["title"] = value
    return res

def DictSerializer(obj):
    res = {}
    if "release_date" in obj:
        value = obj["release_date"]; value = value() if callable(value) else value; res["release_date"] = _field_release_date__serialize(value, "release_date", obj)
    if "artist" in obj:
        value = obj["artist"]; value = value() if callable(value) else value; res["artist"] = _field_artist__serialize(value, "artist", obj)
    if "title" in obj:
        value = obj["title"]; value = value() if callable(value) else value; value = str(value) if value is not None else None; res["title"] = value
    return res

def HybridSerializer(obj):
    res = {}
    try:
        value = obj["release_date"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.release_date
    value = value; value = value() if callable(value) else value; res["release_date"] = _field_release_date__serialize(value, "release_date", obj)
    try:
        value = obj["artist"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.artist
    value = value; value = value() if callable(value) else value; res["artist"] = _field_artist__serialize(value, "artist", obj)
    try:
        value = obj["title"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.title
    value = value; value = value() if callable(value) else value; value = str(value) if value is not None else None; res["title"] = value
    return res

Toastedmarshmallow will invoke the proper serializer based upon the input.
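How the choice between the three serializers might look is sketched below. The selection rule is an assumption made for illustration (the README does not describe it), and the single-field serializers and the serialize function are hypothetical simplifications of the generated code above:

```python
def instance_serializer(obj):
    # Assumes attribute access works for every field.
    return {"title": obj.title}


def dict_serializer(obj):
    # Assumes a mapping; missing keys are simply skipped.
    res = {}
    if "title" in obj:
        res["title"] = obj["title"]
    return res


def hybrid_serializer(obj):
    # Tries item access first, then attribute access.
    try:
        value = obj["title"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.title
    return {"title": value}


def serialize(obj):
    """Hypothetical dispatch: pick a serializer based on the input."""
    if isinstance(obj, dict):
        return dict_serializer(obj)
    try:
        return instance_serializer(obj)
    except AttributeError:
        return hybrid_serializer(obj)


class Album:
    def __init__(self, title):
        self.title = title


class Row:
    """Supports item access only, like a database row wrapper."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        return self._data[key]


print(serialize(Album("Low")))           # instance serializer
print(serialize({"title": "Low"}))       # dict serializer
print(serialize(Row({"title": "Low"})))  # hybrid serializer
```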

Since Toastedmarshmallow is generating code at runtime, it's critical you re-use Schema objects. If you're creating a new Schema object every time you serialize/deserialize an object you'll likely have much worse performance.
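The cost of recreating schemas can be illustrated with a stand-in class that counts how often it has to build its serializer. JitSchema and get_schema are hypothetical names used only for this sketch; a real Schema would do the code generation internally:

```python
from functools import lru_cache


class JitSchema:
    """Hypothetical stand-in for a schema that generates code on first dump."""

    compilations = 0  # how many times a serializer had to be built

    def __init__(self):
        self._serializer = None

    def dump(self, obj):
        if self._serializer is None:
            # Stands in for the expensive runtime code generation.
            JitSchema.compilations += 1
            self._serializer = lambda o: {"title": o["title"]}
        return self._serializer(obj)


@lru_cache(maxsize=None)
def get_schema():
    """Create the schema once and hand back the same instance afterwards."""
    return JitSchema()


albums = [{"title": "Low"}, {"title": "Heroes"}, {"title": "Lodger"}]

# Anti-pattern: a new schema per object pays the generation cost every time.
for album in albums:
    JitSchema().dump(album)
per_call = JitSchema.compilations

# Preferred: one shared schema pays the generation cost once.
JitSchema.compilations = 0
for album in albums:
    get_schema().dump(album)
shared = JitSchema.compilations

print(per_call, shared)
```

A module-level schema instance works just as well as the cached factory; the point is that the same object serves every call.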
