• Stars: 250
• Rank: 162,397 (Top 4 %)
• Language: Python
• License: Apache License 2.0
• Created about 14 years ago
• Updated over 6 years ago


Repository Details

Fast Protocol Buffers in Python (by using the C++ API)

fast-python-pb: Fast Python Protocol Buffers

A thin wrapper on top of the C++ protocol buffer implementation, resulting in significantly faster protocol buffers in Python.

Why:

We wanted a fast implementation of protocol buffers that still felt like Python, hence this implementation.

For our use case, this module is up to 15 times faster than the standard Python protocol buffers implementation and 10 times as fast as Python's json serializer.

Status:

This is a very early stage project. It works for our needs. We haven't verified it works beyond that. Issue reports and patches are very much appreciated!

For example, it only supports string, int32, int64, double, and sub-message members at this time.

Pre-requisites:

Install protocol buffers

Installation:

git clone https://github.com/Cue/fast-python-pb.git

cd fast-python-pb

python setup.py install

Usage:

protoc --fastpython_out /output/path --cpp_out /output/path --proto_path your/path your/path/file.proto
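If you run the compiler from a build script, the command above can be assembled programmatically. The following is a minimal sketch, not part of this project: the helper name build_protoc_command is hypothetical, and it assumes the .proto file's own directory should be used as --proto_path, as in the command above.

```python
import os


def build_protoc_command(proto_file, output_dir):
    """Build the protoc argument list shown in the Usage section.

    Hypothetical helper: assumes the directory containing the .proto
    file is the right value for --proto_path.
    """
    proto_dir = os.path.dirname(proto_file) or "."
    return [
        "protoc",
        "--fastpython_out", output_dir,
        "--cpp_out", output_dir,
        "--proto_path", proto_dir,
        proto_file,
    ]


command = build_protoc_command("your/path/file.proto", "/output/path")
print(" ".join(command))
```

Passing the resulting list to subprocess.check_call(command) would run the compiler, provided protoc and this package are installed.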

Example:

You can see the example in action in the benchmark directory.

// person.proto
package person_proto;

message Fact {
  required string name = 1;

  required string content = 2;
}

message Person {
  required string name = 1;

  required int32 birth_year = 2;

  repeated string nicknames = 3;

  repeated Fact facts = 4;
}

# example.py
import person_proto

# Placeholder text for the example; any string works here.
GETTYSBURG = 'Four score and seven years ago...'

# Scalar fields can be set through keyword arguments.
lincoln = person_proto.Person(name='Abraham Lincoln', birth_year=1809)
# Repeated fields are assigned as lists.
lincoln.nicknames = ['Honest Abe', 'Abe']
lincoln.facts = [
    person_proto.Fact(name='Born In', content='Kentucky'),
    person_proto.Fact(name='Died In', content='Washington D.C.'),
    person_proto.Fact(name='Greatest Speech', content=GETTYSBURG)
]

# Round trip: serialize, then parse into a fresh object.
serializedLincoln = lincoln.SerializeToString()

newLincoln = person_proto.Person()
newLincoln.ParseFromString(serializedLincoln)

The package definition is mandatory; it determines the name of the generated Python module. If the package name uses dots for namespacing, like com.cueup.foo, only the last part of the name (foo) is used as the Python module name.
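The naming rule can be written down in a few lines. This is only an illustration of the rule as described here, not the generator's actual code; the function name module_name_for_package is hypothetical.

```python
def module_name_for_package(package):
    """Return the last dot-separated component of a .proto package name.

    Hypothetical helper that mirrors the rule described in the README.
    """
    if not package:
        raise ValueError("the .proto file must declare a package")
    return package.rsplit(".", 1)[-1]


print(module_name_for_package("person_proto"))   # person_proto
print(module_name_for_package("com.cueup.foo"))  # foo
```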

One more thing

It's simple, but not that simple. The biggest caveat is that protobuf objects embedded in other protobuf objects are mutable, but changes made to them in place are discarded. If you want to build or modify a protobuf that contains other protobufs, build the embedded objects separately and then assign them to the parent. To illustrate:

import addressbook_proto

entry = addressbook_proto.Entry(name='Gillian Baskin')
entry.birthplace = addressbook_proto.Location(state='Minnesota', town='Duluth')

# Now, to modify it. Don't do this:
entry.birthplace.town = 'New Town'
# Instead, do this:
birthplace = entry.birthplace
birthplace.town = 'New Town'
entry.birthplace = birthplace
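To see why the second pattern works, here is a toy model in plain Python that reproduces the same symptom. It is not this library's code: the Entry and Location classes below are stand-ins, and the assumption that reading an embedded field hands back a copy is ours, chosen only because it produces the behavior described above.

```python
import copy


class Location(object):
    def __init__(self, state="", town=""):
        self.state = state
        self.town = town


class Entry(object):
    """Toy stand-in: reading `birthplace` returns a copy of the stored value."""

    def __init__(self, name=""):
        self.name = name
        self._birthplace = Location()

    @property
    def birthplace(self):
        # Assumption for this model: every read yields a fresh copy.
        return copy.copy(self._birthplace)

    @birthplace.setter
    def birthplace(self, value):
        self._birthplace = copy.copy(value)


entry = Entry(name="Gillian Baskin")
entry.birthplace = Location(state="Minnesota", town="Duluth")

# In-place change: only a temporary copy is modified, so it is lost.
entry.birthplace.town = "New Town"
lost = entry.birthplace.town

# Read, modify, assign back: the change is stored on the parent.
birthplace = entry.birthplace
birthplace.town = "New Town"
entry.birthplace = birthplace
kept = entry.birthplace.town

print(lost)  # Duluth
print(kept)  # New Town
```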

There are also several methods for serializing and deserializing. Here's a list:

ParseFromString(str) parses from a serialized protobuf stream.

ParseFromLongString(str) has the same effect as ParseFromString(str), but is faster for long strings and slower for short ones. This isn't a huge difference, but could be important if you're dealing with very large protobufs.

SerializeToString() returns the serialized form of the protobuf, as a string.

SerializeMany(protobufs) takes a sequence of protobuf objects and serializes them to a single string. The length of each protobuf is recorded in the output, so the string can be deserialized back into a list of protobufs.

ParseMany(str, callback) takes a string in the format produced by SerializeMany, and calls callback with each protobuf object, in order. You can use this to build a list of protobufs like this:

people = []
addressbook_proto.Person.ParseMany(serializedPeople, people.append)
print(people)  # Will be a list of Person protobuf objects
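The idea behind SerializeMany and ParseMany, marking the length of each record so a stream can be split again, can be sketched in plain Python. The exact byte layout this library uses is not documented here, so the 4-byte big-endian length prefix below is an assumption made for illustration, and serialize_many and parse_many are hypothetical names that operate on already-serialized byte strings.

```python
import struct


def serialize_many(payloads):
    """Join byte strings, prefixing each with its length.

    Assumed framing for illustration: 4-byte big-endian unsigned length.
    """
    chunks = []
    for payload in payloads:
        chunks.append(struct.pack(">I", len(payload)))
        chunks.append(payload)
    return b"".join(chunks)


def parse_many(data, callback):
    """Split a framed byte string and call `callback` on each record, in order."""
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
        end = offset + length
        if end > len(data):
            raise ValueError("truncated record")
        callback(data[offset:end])
        offset = end


records = []
framed = serialize_many([b"abe", b"", b"lincoln"])
parse_many(framed, records.append)
print(records)
```

Because every record carries its own length, empty records and records containing arbitrary bytes survive the round trip without needing a separator character.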

Authors:

Greplin, Inc.

Alan Grow

Oliver Tonnhofer

Joe Shaw
