• This repository has been archived on 16/May/2023
  • Language
    C++
  • License
    MIT License
  • Created almost 11 years ago
  • Updated over 1 year ago

Repository Details

Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.

Bistro: A fast, flexible toolkit for scheduling and running distributed tasks

This README is a very abbreviated introduction to Bistro. Visit http://facebook.github.io/bistro for a more structured introduction, and for the docs.

Bistro is a toolkit for making distributed computation systems. It can schedule and run distributed tasks, including data-parallel jobs. It enforces resource constraints for worker hosts and data-access bottlenecks. It supports remote worker pools, low-latency batch scheduling, dynamic shards, and a variety of other possibilities. It has command-line and web UIs.

Some of the diverse problems that Bistro solved at Facebook:

  • Safely run map-only ETL tasks against live production databases (MySQL, HBase, Postgres).
  • Provide a resource-aware job queue for batch CPU/GPU compute jobs.
  • Replace Hadoop for a periodic online data compression task on HBase, improving time-to-completion and reliability by over 10x.

You can run Bistro "out of the box" to suit a variety of different applications, but even so, it is a tool for engineers. You should be able to get started just by reading the documentation, but when in doubt, look at the code --- it was written to be read.

Some applications of Bistro may involve writing small plugins to make it fit your needs. The code is built to be extensible. Ask for tips, and we'll do our best to help. In return, we hope that you will send a pull request to allow us to share your work with the community.

Early release

Although Bistro has been in production at Facebook for over 3 years, the present public release is partial, including just the server components.

Install the dependencies and build

Bistro needs 64-bit Linux, Folly, FBThrift, Proxygen, Boost, and libsqlite3. Building it requires 2-3 GB of RAM and GCC 4.9 or above.
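
As a rough preflight for the requirements above, the following sketch checks the word size, the GCC version, and the installed RAM. The helper name gcc_at_least_4_9 and the 2,000,000 kB threshold are assumptions made for illustration; they are not part of Bistro's build scripts, and the check does not look for Folly, FBThrift, Proxygen, Boost, or libsqlite3.

```shell
# Hypothetical preflight sketch; not part of Bistro's build tooling.

# Succeeds when the dot-separated version in $1 is 4.9 or newer.
gcc_at_least_4_9() (
  version="$1"
  major="${version%%.*}"
  case "$version" in
    *.*) rest="${version#*.}"; minor="${rest%%.*}" ;;
    *)   minor=0 ;;  # some GCC releases print only the major version
  esac
  [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }
)

# 64-bit Linux
[ "$(uname -s)" = "Linux" ] || echo "warning: this does not look like Linux"
[ "$(getconf LONG_BIT 2>/dev/null)" = "64" ] || echo "warning: could not confirm a 64-bit system"

# GCC 4.9 or above
if command -v gcc >/dev/null 2>&1; then
  gcc_at_least_4_9 "$(gcc -dumpversion)" || echo "warning: GCC is older than 4.9"
else
  echo "warning: gcc not found on PATH"
fi

# Roughly 2 GB of RAM (assumed threshold, in kB as reported by /proc/meminfo)
if [ -r /proc/meminfo ]; then
  mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  [ "$mem_kb" -ge 2000000 ] || echo "warning: less than about 2 GB of RAM"
fi
```

The script only prints warnings and never aborts, so it can be pasted into an interactive shell safely.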

build/README.md documents the usage of Docker-based scripts that build Bistro on Ubuntu 14.04, 16.04, and Debian 8.6. You should be able to follow very similar steps on most modern Linux distributions.

If you run into dependency problems, look at bistro/cmake/setup.cmake for a full list of Bistro's external dependencies (direct and indirect). We gratefully accept patches that improve Bistro's builds, or add support for various flavors of Linux and Mac OS.

The binaries will be in bistro/cmake/{Debug,Release}. See http://cmake.org/Wiki/CMake_Useful_Variables#Compilers_and_Tools for an explanation of the available build targets. You can run Bistro's unit tests by invoking ctest in those directories.

Your first Bistro run

This is just one simple demo, but Bistro is a very flexible tool. Refer to http://facebook.github.io/bistro/ for more in-depth information.

We are going to start a single Bistro scheduler talking to one 'remote' worker.

Aside: The scheduler tracks jobs and the data shards on which to execute them. It also makes sure to start new tasks only when the required resources are available. The remote worker is a module for executing centrally scheduled work on many machines. The UI can aggregate many schedulers at once, so using remote workers is optional --- a share-nothing, many-scheduler system is sometimes preferable.

Let's make a task to execute:

cat <<EOF > ~/demo_bistro_task.sh
#!/bin/bash
echo "I got these arguments: \$@"
echo "stderr is also logged" 1>&2
echo "done" > "\$2"  # Report the task status to Bistro via a named pipe
EOF
chmod u+x ~/demo_bistro_task.sh
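
Before handing the script to a worker, it can help to run it by hand. The sketch below is a dry run under stated assumptions: it writes its own copy of the demo script into a temporary directory so that the snippet is self-contained, uses a regular file in place of the named pipe, and passes placeholder arguments (placeholder_node, placeholder_config) that are not necessarily what Bistro supplies.

```shell
# Hypothetical dry run of the demo task, without Bistro.
tmpdir="$(mktemp -d)"

# Same script body as above; the quoted 'EOF' means no escaping is needed.
cat <<'EOF' > "$tmpdir/demo_bistro_task.sh"
#!/bin/bash
echo "I got these arguments: $@"
echo "stderr is also logged" 1>&2
echo "done" > "$2"  # Report the task status to Bistro via a named pipe
EOF

# A regular file stands in for the status pipe (assumption for this dry run).
status_file="$tmpdir/status"

# Capture stdout, send stderr to a file, then read back the reported status.
task_stdout="$(bash "$tmpdir/demo_bistro_task.sh" \
  placeholder_node "$status_file" placeholder_config 2>"$tmpdir/stderr.txt")"
task_status="$(cat "$status_file")"

echo "stdout: $task_stdout"
echo "stderr: $(cat "$tmpdir/stderr.txt")"
echo "status: $task_status"
```

If the last line does not read "status: done", the script is not writing its status to the path given as its second argument.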

Open two terminals, one for the scheduler, and one for the worker.

# In both terminals
cd bistro/bistro
# Start the scheduler in one terminal
./cmake/Debug/server/bistro_scheduler \
  --server_port=6789 --http_server_port=6790 \
  --config_file=scripts/test_configs/simple --clean_statuses \
  --CAUTION_startup_wait_for_workers=1 --instance_node_name=scheduler
# Start the worker in another
mkdir /tmp/bistro_worker
./cmake/Debug/worker/bistro_worker --server_port=27182 --scheduler_host=:: \
  --scheduler_port=6789 --worker_command="$HOME/demo_bistro_task.sh" \
  --data_dir=/tmp/bistro_worker

You should see lively log activity in both terminals. Within several seconds, the worker-scheduler negotiation should complete, and you should see messages like "Task ... quit with status" and "Got status".

Since we passed --clean_statuses, the scheduler will not persist any task completions that happened during this run. The worker, on the other hand, will keep a record of the task logs in /tmp/bistro_worker/task_logs.sql3.

If you want task completions to persist across runs, tell Bistro where to put the SQLite database, via --data_dir=/tmp/bistro_scheduler and --status_table=task_statuses:

mkdir /tmp/bistro_scheduler
./cmake/Debug/server/bistro_scheduler \
  --server_port=6789 --http_server_port=6790 \
  --config_file=scripts/test_configs/simple \
  --data_dir=/tmp/bistro_scheduler --status_table=task_statuses \
  --CAUTION_startup_wait_for_workers=1 --instance_node_name=scheduler

You can query the running scheduler via its REST API:

curl -d '{"a":{"handler":"jobs"},"b":{"handler":"running_tasks"}}' :::6790
curl -d '{"my subquery":{"handler":"task_logs","log_type":"stdout"}}' :::6790

Pro-tip: For ease of reading, pipe the output through either jq or json_pp (from a Perl package). For longer outputs, try | jq -C . | less -R.
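
Judging by the two curl examples above, a query body appears to be a JSON object that maps caller-chosen subquery names to objects with a "handler" field. The sketch below assembles such a body from name=handler pairs. The function name bistro_query is hypothetical, it uses only the handler names shown above, it performs no JSON escaping, and it does not cover extra fields such as "log_type".

```shell
# Hypothetical helper: build a query body from "name=handler" arguments.
# Names and handlers are inserted verbatim, so they must not contain
# double quotes or backslashes.
bistro_query() (
  body=""
  for pair in "$@"; do
    name="${pair%%=*}"     # text before the first "="
    handler="${pair#*=}"   # text after the first "="
    # Prepend a comma separator for every entry after the first.
    body="${body:+$body,}\"$name\":{\"handler\":\"$handler\"}"
  done
  printf '{%s}\n' "$body"
)

bistro_query a=jobs b=running_tasks
```

The printed body is the same one used in the first curl example, so it could be sent with curl -d "$(bistro_query a=jobs b=running_tasks)" :::6790 while the scheduler is running.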

You should also take a look at the scheduler configuration to see how its jobs, nodes, and resources were specified.

less scripts/test_configs/simple

For debugging, we typically invoke the binaries like this:

gdb cmake/Debug/worker/bistro_worker -ex "r ..." 2>&1 | tee WORKER.txt

When configuring a real deployment, be sure to carefully review the --help output of the scheduler and worker binaries, as well as the documentation at http://facebook.github.io/bistro. And don't hesitate to ask for help in the group: https://www.facebook.com/groups/bistro.scheduler

License

See LICENSE.
