  • Stars: 911
  • Rank: 50,145 (Top 1.0 %)
  • Language: Go
  • License: MIT License
  • Created almost 5 years ago
  • Updated over 1 year ago

Repository Details

Go package containing implementations of efficient encoding, decoding, and validation APIs.

encoding

Go package containing implementations of encoders and decoders for various data formats.

Motivation

At Segment, we do a lot of marshaling and unmarshaling of data when sending, queuing, or storing messages. The resources we need to provision on the infrastructure are directly related to the type and amount of data that we are processing. At the scale we operate at, the tools we choose to build programs can have a large impact on the efficiency of our systems. It is important to explore alternative approaches when we reach the limits of the code we use.

This repository includes experiments for Go packages for marshaling and unmarshaling data in various formats. While the focus is on providing a high performance library, we also aim for very low development and maintenance overhead by implementing APIs that can be used as drop-in replacements for the default solutions.

Requirements and Maintenance Schedule

This package has no dependencies outside of the core runtime of Go. It requires a recent version of Go.

This package follows the same maintenance schedule as the Go project, meaning that issues relating to versions of Go which aren't supported by the Go team, or versions of this package which are older than 1 year, are unlikely to be considered.

Additionally, we have fuzz tests which aren't a required runtime dependency but will be pulled in when running go mod tidy. Please don't include these go.mod updates in change requests.

encoding/json

The json sub-package provides a re-implementation of the functionality offered by the standard library's encoding/json package, with a focus on lowering the CPU and memory footprint of the code.

More details about how this package achieves a lower CPU and memory footprint can be found in the package README.

The exported API of this package mirrors the standard library's encoding/json package. The only change needed to take advantage of the performance improvements is the import path of the json package, from:

import (
    "encoding/json"
)

to

import (
    "github.com/segmentio/encoding/json"
)
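
As a minimal sketch of what "drop-in" means in practice, the program below round-trips a struct through Marshal and Unmarshal. The message type and the roundTrip helper are hypothetical names chosen for illustration. The sketch imports the standard library so that it runs without extra dependencies; under the claim above, swapping that one import line for the segmentio path should leave the rest of the code unchanged.

```go
package main

import (
	// Assumption for illustration: replacing this line with
	// "github.com/segmentio/encoding/json" is the only edit needed.
	"encoding/json"
	"fmt"
)

// message is a hypothetical payload type, not part of either package.
type message struct {
	UserID string            `json:"userId"`
	Event  string            `json:"event"`
	Props  map[string]string `json:"properties,omitempty"`
}

// roundTrip encodes m to JSON and decodes it back into a new value.
func roundTrip(m message) (message, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return message{}, err
	}
	var out message
	if err := json.Unmarshal(b, &out); err != nil {
		return message{}, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(message{UserID: "u1", Event: "signup"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Because only the import path differs, a project can try the replacement in one file or package at a time and compare behavior before switching everywhere.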

The improvement can be significant for code that heavily relies on serializing and deserializing JSON payloads. The CI pipeline runs benchmarks to compare the performance of the package with the standard library and other popular alternatives; here's an overview of the results:

Comparing to encoding/json (v1.16.2)

name                           old time/op    new time/op     delta
Marshal/*json.codeResponse2      6.40ms ± 2%     3.82ms ± 1%   -40.29%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2    28.1ms ± 3%      5.6ms ± 3%   -80.21%  (p=0.008 n=5+5)

name                           old speed      new speed       delta
Marshal/*json.codeResponse2     303MB/s ± 2%    507MB/s ± 1%   +67.47%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2  69.2MB/s ± 3%  349.6MB/s ± 3%  +405.42%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op    delta
Marshal/*json.codeResponse2       0.00B           0.00B           ~     (all equal)
Unmarshal/*json.codeResponse2    1.80MB ± 1%     0.02MB ± 0%   -99.14%  (p=0.016 n=5+4)

name                           old allocs/op  new allocs/op   delta
Marshal/*json.codeResponse2        0.00            0.00           ~     (all equal)
Unmarshal/*json.codeResponse2     76.6k ± 0%       0.1k ± 3%   -99.92%  (p=0.008 n=5+5)

Benchmarks were run on a Core i9-8950HK CPU @ 2.90GHz.

Comparing to github.com/json-iterator/go (v1.1.10)

name                           old time/op    new time/op    delta
Marshal/*json.codeResponse2      6.19ms ± 3%    3.82ms ± 1%   -38.26%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2    8.52ms ± 3%    5.55ms ± 3%   -34.84%  (p=0.008 n=5+5)

name                           old speed      new speed      delta
Marshal/*json.codeResponse2     313MB/s ± 3%   507MB/s ± 1%   +61.91%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2   228MB/s ± 3%   350MB/s ± 3%   +53.50%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Marshal/*json.codeResponse2       8.00B ± 0%     0.00B       -100.00%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2    1.05MB ± 0%    0.02MB ± 0%   -98.53%  (p=0.000 n=5+4)

name                           old allocs/op  new allocs/op  delta
Marshal/*json.codeResponse2        1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2     37.2k ± 0%      0.1k ± 3%   -99.83%  (p=0.008 n=5+5)
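
The columns in the tables above (time/op, speed, alloc/op, allocs/op) correspond to what Go's testing package can report for a benchmark. The sketch below shows one way such numbers could be collected; it is not the benchmark used by this repository. The payload type, the input document, and the function names are hypothetical, and the code measures the standard library so that it runs without extra dependencies.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// payload is a hypothetical stand-in for a real benchmark fixture.
type payload struct {
	ID   int      `json:"id"`
	Tags []string `json:"tags"`
}

// benchUnmarshal decodes the same small document b.N times.
func benchUnmarshal(b *testing.B) {
	data := []byte(`{"id":1,"tags":["a","b","c"]}`)
	b.SetBytes(int64(len(data))) // lets the framework compute throughput (MB/s)
	b.ReportAllocs()             // requests allocation statistics in the report
	b.ResetTimer()               // exclude the setup above from the timing
	for i := 0; i < b.N; i++ {
		var p payload
		if err := json.Unmarshal(data, &p); err != nil {
			b.Fatal(err)
		}
	}
}

// measure runs the benchmark outside of "go test" and returns the number of
// iterations executed and the allocations per operation.
func measure() (int, int64) {
	r := testing.Benchmark(benchUnmarshal)
	return r.N, r.AllocsPerOp()
}

func main() {
	n, allocs := measure()
	fmt.Printf("iterations=%d allocs/op=%d\n", n, allocs)
}
```

In a real comparison the same benchmark body would be run once per JSON package, several times each, and the two result sets compared statistically, which is what the n=5+5 and p-value annotations in the tables suggest.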

Although this package aims to be a drop-in replacement for encoding/json, it does not guarantee the same error messages. It will error in the same cases as the standard library, but the exact error message may be different.
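
One practical consequence is that calling code should depend on whether an error occurred rather than on its text. The sketch below illustrates that habit with a hypothetical helper, isValidCount; it uses the standard library import so that it runs as written, and the same logic would apply after swapping the import path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidCount is a hypothetical helper that reports whether data decodes
// into an int. It checks only for the presence of an error and never
// inspects the error's message, so it is insensitive to wording changes.
func isValidCount(data []byte) bool {
	var n int
	return json.Unmarshal(data, &n) == nil
}

func main() {
	fmt.Println(isValidCount([]byte(`42`)))   // true
	fmt.Println(isValidCount([]byte(`"42"`))) // false: a JSON string is not an int
	fmt.Println(isValidCount([]byte(`{`)))    // false: malformed JSON
}
```

Tests that assert on exact error strings are the most likely place for a difference to surface when switching packages.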

encoding/iso8601

The iso8601 sub-package exposes APIs to efficiently deal with string representations of ISO 8601 dates.

Data formats like JSON have no syntax to represent dates, so they are usually serialized and represented as string values. In our experience, we often have to check whether a string value looks like a date, and either construct a time.Time by parsing it or simply treat it as a string. This check can be done by attempting to parse the value and, if that fails, falling back to the raw string. Unfortunately, while the happy path for time.Parse is fairly efficient, constructing errors is much slower and has a much bigger memory footprint.

To remedy this problem, we've developed fast ISO 8601 validation functions that cause no heap allocations. Running them as a validation step before parsing determines whether the value is a date representation or a simple string. This reduced CPU and memory usage by 5% in some programs that were doing time.Parse calls on very hot code paths.
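
The validate-then-parse pattern can be sketched with the standard library alone. The code below is not the iso8601 sub-package's implementation or API: looksLikeTimestamp and parseValue are hypothetical names, and the check is deliberately narrow, accepting only the fixed shape "YYYY-MM-DDTHH:MM:SSZ". It is meant to show why a cheap shape check lets ordinary strings skip time.Parse and its error construction.

```go
package main

import (
	"fmt"
	"time"
)

// looksLikeTimestamp is a hypothetical shape check. It accepts only the
// 20-byte form "YYYY-MM-DDTHH:MM:SSZ" and inspects the bytes in place, so it
// neither allocates nor builds an error value.
func looksLikeTimestamp(s string) bool {
	if len(s) != 20 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch i {
		case 4, 7:
			if c != '-' {
				return false
			}
		case 10:
			if c != 'T' {
				return false
			}
		case 13, 16:
			if c != ':' {
				return false
			}
		case 19:
			if c != 'Z' {
				return false
			}
		default:
			if c < '0' || c > '9' {
				return false
			}
		}
	}
	return true
}

// parseValue returns a time.Time when s has the timestamp shape and parses
// cleanly; otherwise it returns the raw string unchanged.
func parseValue(s string) interface{} {
	if !looksLikeTimestamp(s) {
		return s // cheap rejection: time.Parse is never called
	}
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return s // right shape but not a real date, for example month 13
	}
	return t
}

func main() {
	fmt.Printf("%T\n", parseValue("2021-03-04T05:06:07Z")) // time.Time
	fmt.Printf("%T\n", parseValue("hello world"))          // string
}
```

The shape check only filters candidates; time.Parse still has the final say on values such as an out-of-range month, so the fallback to the raw string is kept on both paths.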
