• Stars
    star
    2,836
  • Rank 16,030 (Top 0.4 %)
  • Language
    C++
  • License
    Other
  • Created over 10 years ago
  • Updated 7 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Warp speed Data Transfer (WDT) is an embeddedable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.

WDT Warp speed Data Transfer

Join the chat at https://gitter.im/facebook/wdt

Build Status

Design philosophy/Overview

Goal: Lowest possible total transfer time - to be only hardware limited (disc or network bandwidth not latency) and as efficient as possible (low CPU/memory/resources utilization)

We keep dependencies minimal in order to maximize portability and ensure a small binary size. As a bonus, this also minimizes compile time.

We aren't using exceptions for performance reasons and because using exceptions would make it harder to reason about the control flow of the library. We also believe the WDT library is easier to integrate as a result. Our philosophy is to write moderately structured and encapsulated C code as opposed to using every feature of C++.

We try to minimize the number of system calls, which is one of the reasons we are using blocking thread IOs. We can maximize system throughput because at any given point some threads are reading while others are writing, and data is buffered on both paths - keeping each subsystem busy while minimizing kernel to userspace switches.

Terminology

WDT uses "Mbytes" everywhere in its output as 1024*1024 bytes = 1048576 bytes (technically this should be the new mebibyte (MiB) standard but it felt Mbytes is be more in line with what other tools are using, clearer, easier to read and matching what a traditional "megabyte" used to mean in historical memory units where the address lines are binary and thus power of two and not of ten)

Example

While WDT is primarily a library, we also have a small command line tool which we use for tests and which is useful by itself. Here is a quick example:

Receiver side: (starts the server indicating destination directory)

[ldemailly@devbig074]$ wdt -directory /data/users/ldemailly/transfer1

Sender side: (discover and sends all files in a directory tree to destination)

[root@dev443]$ wdt -directory /usr/bin -destination devbig074.prn2

[=================================================] 100% 588.8 Mbytes/s
I0720 21:48:08.446014 3245296 Sender.cpp:314] Total sender time = 2.68699
seconds (0.00640992 dirTime). Transfer summary : Transfer status = OK. Number
of files transferred = 1887. Data Mbytes = 1582.08. Header Kbytes = 62.083
(0.00383215% overhead). Total bytes = 1658999858. Wasted bytes due to
failure = 0 (0% overhead). Total sender throughput = 588.816 Mbytes/sec
(590.224 Mbytes/sec pure transf rate)

Note that in this simple example with lots of small files (/usr/bin from a linux distribution), but not much data (~1.5Gbyte), the maximum speed isn't as good as it would with more data (as there is still a TCP ramp up time even though it's faster because of parallelization) like when we use it in our production use cases.

Performance/Results

In an internal use at Facebook to transfer RocksDB snapshot between hosts we are able to transfer data at a throttled 600 Mbytes/sec even across long distance, high latency links (e.g. Sweden to Oregon). That's 3x the speed of the previous highly optimized HTTP-based solution and with less strain on the system. When not throttling, we are able to easily saturate a 40 Gbit/s NIC and get near theoretical link speed (above 4 Gbytes/sec).

We have so far optimized WDT for servers with fast IOs - in particular flash card or in-memory read/writes. If you use disks throughput won't be as good, but we do plan on optimizing for disks as well in the future.

Dependencies

CMake for building WDT - See build/BUILD.md

gflags (google flags library) but only for the command line, the library doesn't depend on that

gtest (google testing) but only for tests

glog (google logging library) - use W*LOG macros so everything logged by WDT is always prefixed by "wdt>" which helps when embedded in another service

Parts of Facebook's Folly open source library (as set in the CMakefile) Mostly conv, threadlocal and checksum support.

For encryption, the crypto lib part of openssl-1.x

You can build and embed wdt as a library with as little as a C++11 compiler and glog - and you could macro away glog or replace it by printing to stderr if needed.

Code layout

Directories

  • top level Main WDT classes and Wdt command line source, CMakeLists.txt

  • util/ Utilities used for implementing the main objects

  • test/ Tests files and scripts

  • build/ Build related scripts and files and utils

  • fbonly/ Stuff specific to facebook/ (not in open source version)

  • bench/ Benchmark generation tools

Main files

  • CMakeLists.txt, .travis.yml, build/BUILD.md,travis_linux.sh,travis_osx.sh Build definition file - use CMake to generate a Makefile or a project file for your favorite IDE - details in build/BUILD.md

  • wdtCmdline.cpp

Main program which allows to have a server or client process to exercise the library (for end 2 end test as well as a standalone utility)

  • wcp.sh

A script to use wdt like scp for single big files - pending splitting support inside wdt proper the script does the splitting for you. install as "wcp".

  • WdtOptions.{h|cpp}

To specify the behavior of wdt. If wdt is used as a library, then the caller get the mutable object of options and set different options accordingly. When wdt is run in a standalone mode, behavior is changed through gflags in wdtCmdLine.cpp

  • WdtThread.{h|cpp} Common functionality and settings between SenderThread and ReceiverThread. Both of these kind of threads inherit from this base class.

  • WdtBase.{h|cpp}

Common functionality and settings between Sender and Receiver

  • WdtResourceController.{h|cpp}

Optional factory for Sender/Receiver with limit on number being created.

Producing/Sending

  • ByteSource.h

Interface for a data element to be sent/transferred

  • FileByteSource.{h|cpp}

Implementation/concrete subclass of ByteSource for a file identified as a relative path from a root dir. The identifier (path) sent remotely is the relative path

  • SourceQueue.h

Interface for producing next ByteSource to be sent

  • DirectorySourceQueue.{h|cpp}

Concrete implementation of SourceQueue producing all the files in a given directory, sorted by decreasing size (as they are discovered, you can start pulling from the queue even before all the files are found, it will return the current largest file)

  • ThreadTransferHistory.{h|cpp}

Every thread maintains a transfer history so that when a connection breaks it can talk to the receiver to find out up to where in the history has been sent. This class encapsulates all the logic for that bookkeeping

  • SenderThread.{h|cpp}

Implements the functionality of one sender thread, which binds to a certain port and sends files over.

  • Sender.{h|cpp}

Spawns multiple SenderThread threads and sends the data across to receiver

Consuming / Receiving

  • FileCreator.{h|cpp}

Creates file and directories necessary for said file (mkdir -p like)

  • ReceiverThread.{h|cpp}

Implements the functionality of the receiver threads, responsible for listening on a port and receiving files over the network.

  • Receiver.{h|cpp}

Parent receiver class that spawns multiple ReceiverThread threads and receives data from a remote host

Low level building blocks

  • ServerSocket.{h|.cpp}

Encapsulate a server socket listening on a port and giving a file descriptor to be used to communicate with the client

  • ClientSocket.{h|cpp}

Client socket wrapper - connection to a server port -> fd

  • Protocol.{h|cpp}

Decodes/Encodes meta information needed to interpret the data stream: the id (file path) and size (byte length of the data)

  • SocketUtils.{h|cpp}

Common socket related utilities (both client/server, sender/receiver side use)

  • Throttler.{h|cpp}

Throttling code

  • ErrorCodes.h

Header file for error codes

  • Reporting.{h|cpp}

Class representing transfer stats and reports

Future development/extensibility

The current implementation works well and has high efficiency. It is also extensible by implementing different byte sources both in and out. But inserting processing units isn't as easy.

For that we plan on restructuring the code to use a Zero copy stream/buffer pipeline: To maintain efficiency, the best overall total transfer time and time to first byte we can see WDT's internal architecture as chainable units

[Disk/flash/Storage IO] -> [Compression] -> [Protocol handling] -> [Encryption] -> [Network IO]

And the reverse chain on the receiving/writing end The trick is the data is variable length input and some units can change length and we need to process things by blocks Constraints/Design:

  • No locking / contention when possible
  • (Hard) Limits on memory used
  • Minimal number of copies/moving memory around
  • Still works the same for simple read file fd -> control -> write socked fd current basic implementation

Possible Solution(?) API:

  • Double linked list of Units
  • read/pull from left (pull() ?)
  • push to the right (push() ?)
  • end of stream from left
  • propagate last bytes to right

Can still be fully synchronous / blocking, works thanks to eof handling (synchronous gives us lock free/single thread - internally a unit is free to use parallelization like the compression stage is likely to want/need)

Another thing we touched on is processing chunks out of order - by changing header to be ( fileid, offset, size ) instead of ( filename, size ) and assuming everything is following in 1 continuous block (will also help the use case of small number of large files/chunks) : mmap'in the target/destination file The issue then is who creates it in what order - similar to the directory creation problem - we could use a meta info channel to avoid locking/contention but that requires synchronization

We want things to work with even up to 1 second latency without incurring a 1 second delay before we send the first payload byte

Submitting diffs/making changes

See CONTRIBUTING.md

Please run the tests

CTEST_OUTPUT_ON_FAILURE=1 make test

And ideally also the manual tests (integration/porting upcoming)

wdt_e2e_test.sh wdt_download_resumption_test.sh wdt_network_test.sh wdt_max_send_test.sh

(facebook only:) Make sure to do the following, before "arc diff":

 (cd wdt ; ./build/clangformat.sh )
 # if you changed the minor version of the protocol (in CMakeLists.txt)
 # run (cd wdt ; ./build/version_update.tcl ) to sync with fbcode's WdtConfig.h

 fbconfig  --clang --sanitize=address -r  wdt

 fbmake runtests --run-disabled --extended-tests
 # Optionally: opt build
 fbmake runtests_opt
 fbmake opt
 # Sender max speed test
 wdt/test/wdt_max_send_test.sh
 # Check buck build
 buck build wdt/...
 # Debug a specific test with full output even on success:
 buck test wdt:xxx -- --run-disabled --extended-tests --print-passing-details\
   --print-long-results

and check the output of the last step to make sure one of the 3 runs is still above 20,000 Mbytes/sec (you may need to make sure you /dev/shm is mostly empty to get the best memory throughput, as well as not having a ton of random processes running during the test)

Also :

  • Update this file
  • Make sure your diff has a task
  • Put (relevant) log output of sender/receiver in the diff test plan or comment
  • Depending on the changes
    • Perf: wdt/wdt_e2e_test.sh has a mix of ~ > 700 files, > 8 Gbytes/sec
    • do run remote network tests (wdt/wdt_remote_test.sh)
    • do run profiler and check profile results (wdt/fbonly/wdt_prof.sh) 80k small files at > 1.6 Gbyte/sec

More Repositories

1

react

The library for web and native user interfaces.
JavaScript
227,971
star
2

react-native

A framework for building native applications using React
C++
118,682
star
3

create-react-app

Set up a modern web app by running one command.
JavaScript
101,913
star
4

docusaurus

Easy to maintain open source documentation websites.
TypeScript
56,059
star
5

jest

Delightful JavaScript Testing.
TypeScript
41,554
star
6

rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
C++
28,328
star
7

folly

An open-source C++ library developed and used at Facebook.
C++
27,122
star
8

zstd

Zstandard - Fast real-time compression algorithm
C
22,448
star
9

flow

Adds static typing to JavaScript to improve developer productivity and code quality.
OCaml
22,068
star
10

lexical

Lexical is an extensible text editor framework that provides excellent reliability, accessibility and performance.
TypeScript
19,616
star
11

relay

Relay is a JavaScript framework for building data-driven React applications.
Rust
18,191
star
12

hhvm

A virtual machine for executing programs written in Hack.
Hack
18,048
star
13

prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Python
17,943
star
14

fresco

An Android library for managing images and the memory they use.
Java
17,041
star
15

yoga

Yoga is an embeddable layout engine targeting web standards.
C++
16,928
star
16

infer

A static analyzer for Java, C, C++, and Objective-C
OCaml
14,715
star
17

flipper

A desktop debugging platform for mobile developers.
TypeScript
13,221
star
18

watchman

Watches files and records, or triggers actions, when they change.
C++
12,294
star
19

react-devtools

An extension that allows inspection of React component hierarchy in the Chrome and Firefox Developer Tools.
11,030
star
20

hermes

A JavaScript engine optimized for running React Native.
C++
9,388
star
21

jscodeshift

A JavaScript codemod toolkit.
JavaScript
9,270
star
22

chisel

Chisel is a collection of LLDB commands to assist debugging iOS apps.
Python
9,090
star
23

buck

A fast build system that encourages the creation of small, reusable modules over a variety of platforms and languages.
Java
8,568
star
24

stylex

StyleX is the styling system for ambitious user interfaces.
JavaScript
8,333
star
25

proxygen

A collection of C++ HTTP libraries including an easy to use HTTP server.
C++
8,026
star
26

facebook-ios-sdk

Used to integrate the Facebook Platform with your iOS & tvOS apps.
Swift
7,720
star
27

litho

A declarative framework for building efficient UIs on Android.
Java
7,646
star
28

pyre-check

Performant type-checking for python.
OCaml
6,696
star
29

facebook-android-sdk

Used to integrate Android apps with Facebook Platform.
Kotlin
6,066
star
30

redex

A bytecode optimizer for Android apps
C++
5,991
star
31

sapling

A Scalable, User-Friendly Source Control System.
Rust
5,815
star
32

componentkit

A React-inspired view framework for iOS.
Objective-C++
5,746
star
33

fishhook

A library that enables dynamically rebinding symbols in Mach-O binaries running on iOS.
C
5,117
star
34

PathPicker

PathPicker accepts a wide range of input -- output from git commands, grep results, searches -- pretty much anything. After parsing the input, PathPicker presents you with a nice UI to select which files you're interested in. After that you can open them in your favorite editor or execute arbitrary commands.
Python
5,075
star
35

metro

πŸš‡ The JavaScript bundler for React Native
JavaScript
5,061
star
36

prop-types

Runtime type checking for React props and similar objects
JavaScript
4,446
star
37

idb

idb is a flexible command line interface for automating iOS simulators and devices
Objective-C
4,431
star
38

Haxl

A Haskell library that simplifies access to remote data, such as databases or web-based services.
Haskell
4,227
star
39

FBRetainCycleDetector

iOS library to help detecting retain cycles in runtime.
Objective-C++
4,190
star
40

memlab

A framework for finding JavaScript memory leaks and analyzing heap snapshots
TypeScript
4,187
star
41

duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
Haskell
4,021
star
42

fbt

A JavaScript Internationalization Framework
JavaScript
3,849
star
43

regenerator

Source transformer enabling ECMAScript 6 generator functions in JavaScript-of-today.
JavaScript
3,817
star
44

buck2

Build system, successor to Buck
Rust
3,307
star
45

mcrouter

Mcrouter is a memcached protocol router for scaling memcached deployments.
C++
3,222
star
46

wangle

Wangle is a framework providing a set of common client/server abstractions for building services in a consistent, modular, and composable way.
C++
3,030
star
47

react-strict-dom

React Strict DOM (RSD) is a subset of React DOM, imperative DOM, and CSS that supports web and native targets
JavaScript
2,922
star
48

igl

Intermediate Graphics Library (IGL) is a cross-platform library that commands the GPU. It provides a single low-level cross-platform interface on top of various graphics APIs (e.g. OpenGL, Metal and Vulkan).
C++
2,719
star
49

fbthrift

Facebook's branch of Apache Thrift, including a new C++ server.
C++
2,535
star
50

mysql-5.6

Facebook's branch of the Oracle MySQL database. This includes MyRocks.
C++
2,446
star
51

Ax

Adaptive Experimentation Platform
Python
2,272
star
52

fbjs

A collection of utility libraries used by other Meta JS projects.
JavaScript
1,953
star
53

jsx

The JSX specification is a XML-like syntax extension to ECMAScript.
HTML
1,945
star
54

react-native-website

The React Native website and docs
JavaScript
1,899
star
55

screenshot-tests-for-android

Generate fast deterministic screenshots during Android instrumentation tests
Java
1,733
star
56

idx

Library for accessing arbitrarily nested, possibly nullable properties on a JavaScript object.
JavaScript
1,686
star
57

TextLayoutBuilder

An Android library that allows you to build text layouts more easily.
Java
1,470
star
58

mvfst

An implementation of the QUIC transport protocol.
C++
1,433
star
59

SoLoader

Native code loader for Android
Java
1,300
star
60

facebook-python-business-sdk

Python SDK for Meta Marketing APIs
Python
1,240
star
61

ThreatExchange

Trust & Safety tools for working together to fight digital harms.
C++
1,170
star
62

CacheLib

Pluggable in-process caching engine to build and scale high performance services
C++
1,097
star
63

mariana-trench

A security focused static analysis tool for Android and Java applications.
C++
1,041
star
64

fatal

Fatal is a library for fast prototyping software in modern C++. It provides facilities to enhance the expressive power of C++. The library is heavily based on template meta-programming, while keeping the complexity under-the-hood.
C++
1,000
star
65

transform360

Transform360 is an equirectangular to cubemap transform for 360 video.
C
996
star
66

openr

Distributed platform for building autonomic network functions.
C++
883
star
67

fboss

Facebook Open Switching System Software for controlling network switches.
C++
851
star
68

ktfmt

A program that reformats Kotlin source code to comply with the common community standard for Kotlin code conventions.
Kotlin
818
star
69

facebook-php-business-sdk

PHP SDK for Meta Marketing API
PHP
810
star
70

winterfell

A STARK prover and verifier for arbitrary computations
Rust
728
star
71

pyre2

Python wrapper for RE2
C++
631
star
72

starlark-rust

A Rust implementation of the Starlark language
Rust
623
star
73

openbmc

OpenBMC is an open software framework to build a complete Linux image for a Board Management Controller (BMC).
C
615
star
74

SPARTA

SPARTA is a library of software components specially designed for building high-performance static analyzers based on the theory of Abstract Interpretation.
C++
609
star
75

time

Meta's Time libraries
Go
570
star
76

chef-cookbooks

Open source chef cookbooks.
Ruby
565
star
77

IT-CPE

Meta's Client Platform Engineering tools. Some of the tools we have written to help manage our fleet of client systems.
Ruby
554
star
78

dotslash

Simplified executable deployment
Rust
523
star
79

Rapid

The OpenStreetMap editor driven by open data, AI, and supercharged features
JavaScript
515
star
80

lexical-ios

Lexical iOS is an extensible text editor framework that integrates the APIs and philosophies from Lexical Web with a Swift API built on top of TextKit.
Swift
477
star
81

facebook-sdk-for-unity

The facebook sdk for unity.
C#
474
star
82

facebook-nodejs-business-sdk

Node.js SDK for Meta Marketing APIs
JavaScript
469
star
83

FAI-PEP

Facebook AI Performance Evaluation Platform
Python
384
star
84

facebook-java-business-sdk

Java SDK for Meta Marketing APIs
Java
379
star
85

chef-utils

Utilities related to Chef
Ruby
290
star
86

opaque-ke

An implementation of the OPAQUE password-authenticated key exchange protocol
Rust
275
star
87

dns

Collection of Meta's DNS Libraries
Go
257
star
88

facebook360_dep

Facebook360 Depth Estimation Pipeline - https://facebook.github.io/facebook360_dep
HTML
241
star
89

akd

An implementation of an auditable key directory
Rust
219
star
90

tac_plus

A Tacacs+ Daemon tested on Linux (CentOS) to run AAA via TACACS+ Protocol via IPv4 and IPv6.
C
207
star
91

facebook-ruby-business-sdk

Ruby SDK for Meta Marketing API
Ruby
204
star
92

usort

Safe, minimal import sorting for Python projects.
Python
171
star
93

grocery-delivery

The Grocery Delivery utility for managing cookbook uploads to distributed Chef backends.
Ruby
153
star
94

taste-tester

Software to manage a chef-zero instance and use it to test changes on production servers.
Ruby
146
star
95

TestSlide

A Python test framework
Python
143
star
96

sapp

Post Processor for Facebook Static Analysis Tools.
Python
127
star
97

homebrew-fb

OS X Homebrew formulas to install Meta open source software
Ruby
124
star
98

threat-research

Welcome to the Meta Threat Research Indicator Repository, a dedicated resource for the sharing of Indicators of Compromise (IOCs) and other threat indicators with the external research community
Python
124
star
99

ocamlrep

Sets of libraries and tools to write applications and libraries mixing OCaml and Rust. These libraries will help keeping your types and data structures synchronized, and enable seamless exchange between OCaml and Rust
Rust
121
star
100

squangle

SQuangLe is a C++ API for accessing MySQL servers
C++
121
star