Provides utilities to facilitate the usage of Zipkin in Python

py_zipkin

py_zipkin provides a context manager/decorator along with some utilities to facilitate the usage of Zipkin in Python applications.

Install

pip install py_zipkin

Usage

py_zipkin requires a transport_handler object that handles logging Zipkin messages to a central logging service such as Kafka or Scribe.

py_zipkin.zipkin.zipkin_span is the main tool for starting zipkin traces or logging spans inside an ongoing trace. zipkin_span can be used as a context manager or a decorator.

Usage #1: Start a trace with a given sampling rate

from py_zipkin.zipkin import zipkin_span

def some_function(a, b):
    with zipkin_span(
        service_name='my_service',
        span_name='my_span_name',
        transport_handler=some_handler,
        port=42,
        sample_rate=0.05,  # Percentage between 0.0 and 100.0 (0.05 means 0.05%)
    ):
        do_stuff(a, b)
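
Because sample_rate is expressed as a percentage, 0.05 samples roughly one request in 2,000, not one in 20. The sketch below illustrates that interpretation using only the standard library. The names should_sample and rng are hypothetical and are not part of py_zipkin's API; the library's own sampling code may differ in detail.

```python
import random


def should_sample(sample_rate, rng=random.random):
    """Decide whether to trace, treating sample_rate as a percentage.

    Hypothetical helper for illustration; rng must return a float in [0.0, 1.0).
    """
    if not 0.0 <= sample_rate <= 100.0:
        raise ValueError("sample_rate must be between 0.0 and 100.0")
    # Scale the random draw to [0, 100) so it is comparable to a percentage.
    return rng() * 100.0 < sample_rate


# 0.0 never samples and 100.0 always samples.
print(should_sample(0.0), should_sample(100.0))
```

Passing a fixed rng makes the decision deterministic, which is convenient in unit tests.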

Usage #2: Trace a service call

The difference from Usage #1 is that the zipkin_attrs are computed separately and passed in, so the sample_rate parameter is not needed.

# Define a pyramid tween
def tween(request):
    zipkin_attrs = some_zipkin_attr_creator(request)
    with zipkin_span(
        service_name='my_service',
        span_name='my_span_name',
        zipkin_attrs=zipkin_attrs,
        transport_handler=some_handler,
        port=22,
    ) as zipkin_context:
        response = handler(request)
        zipkin_context.update_binary_annotations(
            some_binary_annotations)
        return response
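
some_zipkin_attr_creator above is a placeholder. As a rough illustration of what such a helper might do, the sketch below reads the standard B3 propagation headers from an incoming request. It is stdlib-only: Attrs is a stand-in namedtuple assumed to mirror the fields of py_zipkin's ZipkinAttrs, and attrs_from_headers is a hypothetical name. It also assumes headers is a plain dict with canonically cased keys.

```python
from collections import namedtuple

# Stand-in for py_zipkin's ZipkinAttrs; the field names are an assumption.
Attrs = namedtuple(
    "Attrs",
    ["trace_id", "span_id", "parent_span_id", "flags", "is_sampled"],
)


def attrs_from_headers(headers):
    """Build trace attributes from B3 headers, or return None if absent."""
    trace_id = headers.get("X-B3-TraceId")
    span_id = headers.get("X-B3-SpanId")
    if not trace_id or not span_id:
        # No upstream trace context: the caller decides whether to start one.
        return None
    return Attrs(
        trace_id=trace_id,
        span_id=span_id,
        parent_span_id=headers.get("X-B3-ParentSpanId"),
        flags=headers.get("X-B3-Flags", "0"),
        is_sampled=headers.get("X-B3-Sampled") == "1",
    )


incoming = {
    "X-B3-TraceId": "463ac35c9f6413ad",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
    "X-B3-Sampled": "1",
}
print(attrs_from_headers(incoming))
```

When the helper returns None, a real tween would typically fall back to starting a new trace with a sample_rate, as in Usage #1.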

Usage #3: Log a span inside an ongoing trace

This can also be used inside itself to produce continuously nested spans.

@zipkin_span(service_name='my_service', span_name='some_function')
def some_function(a, b):
    return do_stuff(a, b)

Other utilities

zipkin_span.update_binary_annotations() can be used inside a zipkin trace to add to the existing set of binary annotations.

def some_function(a, b):
    with zipkin_span(
        service_name='my_service',
        span_name='some_function',
        transport_handler=some_handler,
        port=42,
        sample_rate=0.05,
    ) as zipkin_context:
        result = do_stuff(a, b)
        zipkin_context.update_binary_annotations({'result': result})

zipkin_span.add_sa_binary_annotation() can be used to add a binary annotation to the current span with the key 'sa'. This function allows the user to specify the destination address of the service being called (useful if the destination doesn't support zipkin). See http://zipkin.io/pages/data_model.html for more information on the 'sa' binary annotation.

NOTE: the V2 span format only supports one "sa" endpoint (represented by remoteEndpoint), so add_sa_binary_annotation now raises ValueError if you try to set multiple "sa" annotations for the same span.

def some_function():
    with zipkin_span(
        service_name='my_service',
        span_name='some_function',
        transport_handler=some_handler,
        port=42,
        sample_rate=0.05,
    ) as zipkin_context:
        make_call_to_non_instrumented_service()
        zipkin_context.add_sa_binary_annotation(
            port=123,
            service_name='non_instrumented_service',
            host='12.34.56.78',
        )

create_http_headers_for_new_span() creates a set of HTTP headers that can be forwarded in a request to another service.

from py_zipkin.zipkin import create_http_headers_for_new_span

headers = {}
headers.update(create_http_headers_for_new_span())
http_client.get(
    path='some_url',
    headers=headers,
)
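
To make the shape of those headers concrete, here is a stdlib-only sketch of B3 propagation headers for a child span. The header names come from Zipkin's B3 propagation convention. The helpers new_span_id and headers_for_child_span are hypothetical and only approximate what create_http_headers_for_new_span does; in particular, the real function reads the current trace context itself instead of taking it as arguments.

```python
import secrets


def new_span_id():
    """Return a random 64-bit identifier as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def headers_for_child_span(trace_id, current_span_id, is_sampled):
    """Build B3 headers for an outgoing call made from the current span."""
    return {
        "X-B3-TraceId": trace_id,              # unchanged across the whole trace
        "X-B3-SpanId": new_span_id(),          # fresh ID for the downstream span
        "X-B3-ParentSpanId": current_span_id,  # the caller becomes the parent
        "X-B3-Flags": "0",
        "X-B3-Sampled": "1" if is_sampled else "0",
    }


headers = headers_for_child_span("463ac35c9f6413ad", "a2fb4a1d1a96d312", True)
print(sorted(headers))
```

The trace ID is carried through unchanged, while each outgoing call gets a new span ID whose parent is the calling span.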

Transport

py_zipkin currently thrift-encodes spans, but the transport layer itself is pluggable.

The recommended way to implement a new transport handler is to subclass py_zipkin.transport.BaseTransportHandler and implement the send and get_max_payload_bytes methods.

send receives an already encoded thrift list as argument. get_max_payload_bytes should return the maximum payload size supported by your transport, or None if you can send arbitrarily big messages.
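
To illustrate why a transport would advertise a size limit, the sketch below groups already encoded spans into batches that stay within a byte budget. This is a simplified, stdlib-only illustration: split_into_batches is a hypothetical helper, it ignores any framing overhead added by list encoding, and py_zipkin's own batching logic may work differently.

```python
def split_into_batches(encoded_spans, max_payload_bytes):
    """Greedily group byte strings so each batch fits within the limit.

    max_payload_bytes=None means "no limit", so everything goes in one batch.
    A single span larger than the limit still gets a batch of its own.
    """
    spans = list(encoded_spans)
    if max_payload_bytes is None:
        return [spans] if spans else []

    batches, current, size = [], [], 0
    for span in spans:
        # Close the current batch if adding this span would exceed the limit.
        if current and size + len(span) > max_payload_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(span)
        size += len(span)
    if current:
        batches.append(current)
    return batches


print(split_into_batches([b"aaaa", b"bbbb", b"cc"], 8))
```

With a limit of 8 bytes, the first two 4-byte spans share a batch and the third starts a new one.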

NOTE: older versions of py_zipkin suggested implementing the transport handler as a function with a single argument. That's still supported and should work with the current py_zipkin version, but it's deprecated.

The simplest way to get spans to the collector is via HTTP POST. Here's an example of a simple HTTP transport using the requests library. This assumes your Zipkin collector is running at localhost:9411.

import requests

from py_zipkin.transport import BaseTransportHandler


class HttpTransport(BaseTransportHandler):

    def get_max_payload_bytes(self):
        return None

    def send(self, encoded_span):
        # The collector expects a thrift-encoded list of spans.
        requests.post(
            'http://localhost:9411/api/v1/spans',
            data=encoded_span,
            headers={'Content-Type': 'application/x-thrift'},
        )

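If you can send spans over Kafka (closer to what you might do in production), you could do something like the following with the kafka-python package. Note that SimpleProducer and KafkaClient as used here belong to kafka-python's legacy API, so this example may require an older kafka-python release.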

from kafka import SimpleProducer, KafkaClient

from py_zipkin.transport import BaseTransportHandler


class KafkaTransport(BaseTransportHandler):

    def get_max_payload_bytes(self):
        # By default Kafka rejects messages bigger than 1000012 bytes.
        return 1000012

    def send(self, message):
        kafka_client = KafkaClient('{}:{}'.format('localhost', 9092))
        producer = SimpleProducer(kafka_client)
        producer.send_messages('kafka_topic_name', message)

Using in multithreading environments

If you want to use py_zipkin in a cooperative multithreading environment, e.g. asyncio, you need to explicitly pass an instance of py_zipkin.storage.Stack as the context_stack parameter to zipkin_span and create_http_headers_for_new_span. By default, py_zipkin keeps these attributes in thread-local storage, which is defined in py_zipkin.storage.ThreadLocalStack.

You also need to explicitly pass an instance of py_zipkin.storage.SpanStorage as the span_storage parameter to zipkin_span.

from py_zipkin.zipkin import zipkin_span
from py_zipkin.storage import Stack
from py_zipkin.storage import SpanStorage


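# The caller must itself be a coroutine to use `await`, and it needs a
# different name from the traced function below (run_traced is arbitrary).
async def run_traced():
    context_stack = Stack()
    span_storage = SpanStorage()
    await my_function(context_stack, span_storage)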

async def my_function(context_stack, span_storage):
    with zipkin_span(
        service_name='my_service',
        span_name='some_function',
        transport_handler=some_handler,
        port=42,
        sample_rate=0.05,
        context_stack=context_stack,
        span_storage=span_storage,
    ):
        result = do_stuff(a, b)
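
The reason thread-local storage is not enough under asyncio is that all coroutines on an event loop share one thread. The stdlib-only sketch below demonstrates the problem without py_zipkin: worker_shared keeps its "current span" in threading.local, while worker_explicit receives its own stack object, here a plain list standing in for py_zipkin.storage.Stack. All function names are hypothetical.

```python
import asyncio
import threading

_local = threading.local()


async def worker_shared(name, seen):
    _local.span = name          # store the "current span" in thread-local state
    await asyncio.sleep(0)      # yield control so the other task can run
    seen[name] = _local.span    # may now hold the other task's value


async def worker_explicit(name, stack, seen):
    stack.append(name)          # each task owns the stack it was given
    await asyncio.sleep(0)
    seen[name] = stack[-1]


async def main():
    shared, explicit = {}, {}
    await asyncio.gather(
        worker_shared("a", shared),
        worker_shared("b", shared),
    )
    await asyncio.gather(
        worker_explicit("a", [], explicit),
        worker_explicit("b", [], explicit),
    )
    return shared, explicit


shared, explicit = asyncio.run(main())
print(shared)    # task "a" sees the value written by task "b"
print(explicit)  # each task sees only its own value
```

Passing the stack explicitly ties the trace context to the task that created it, which is what the context_stack and span_storage parameters achieve.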

Firehose mode [EXPERIMENTAL]

"Firehose mode" records 100% of the spans, regardless of the sampling rate. This is useful if you want to treat these spans differently, e.g. send them to a different backend that has limited retention. It works in tandem with normal operation, though it may add overhead. To use it, pass a firehose_handler just as you pass a transport_handler.

This feature should be considered experimental and may be removed at any time without warning. If you do use this, be sure to send asynchronously to avoid excess overhead for every request.
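
One way to keep sending off the request path is to hand payloads to a background thread. The sketch below is a stdlib-only illustration under stated assumptions: BackgroundSender is a hypothetical helper, and deliver stands for whatever actually ships the bytes (for example, the body of a transport's send method). It drops payloads when the queue is full so that it never blocks the caller.

```python
import queue
import threading


class BackgroundSender:
    """Hypothetical helper that delivers payloads on a worker thread."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, deliver, max_pending=1000):
        self._deliver = deliver
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def __call__(self, payload):
        try:
            self._queue.put_nowait(payload)
        except queue.Full:
            pass  # drop the payload instead of slowing down the request

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                return
            self._deliver(item)

    def close(self):
        """Flush pending payloads, then stop the worker thread."""
        self._queue.put(self._STOP)
        self._thread.join()


delivered = []
sender = BackgroundSender(delivered.append)
sender(b"encoded-spans")
sender.close()
print(delivered)
```

Dropping on overflow trades completeness for latency; whether that is acceptable depends on how the firehose data is used.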

License

Copyright (c) 2018, Yelp, Inc. All Rights reserved. Apache v2
