  • Stars: 1
  • Language: Python
  • Created almost 4 years ago
  • Updated almost 4 years ago

Repository Details

Workspaces 2.0 demo

More Repositories

1

kafka

A high-throughput, distributed, publish-subscribe messaging system
Java
58
star
2

flink

Scalable Batch and Stream Data Processing
Java
24
star
3

fbtftp

fbtftp is Facebook's implementation of a dynamic TFTP server framework
Python
15
star
4

beam

Unified programming model to create data processing pipelines for batch and streaming models
Python
9
star
5

caffe

Caffe: a fast open framework for deep learning
C++
7
star
6

mixer

Mixed Incremental Cross-Entropy REINFORCE ICLR 2016
Lua
7
star
7

fboss

Facebook Open Switching System Software for controlling network switches
C++
6
star
8

fasttext

Library for fast text representation and classification
HTML
6
star
9

felix

Project Calico core repository
Python
6
star
10

bistro

Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
C++
6
star
11

heapster

Compute Resource Usage Analysis and Monitoring of Container Clusters
Go
5
star
12

hudi

Upserts And Incremental Processing on Big Data
Java
5
star
13

helm

The Kubernetes Package Manager
Go
5
star
14

darkforestgo

DarkForest, the Facebook Go engine
C
5
star
15

charts

Curated applications for Kubernetes using Helm charts with integrated Deployment Manager templates
Shell
5
star
16

commai-env

A platform for developing AI systems as described in A Roadmap towards Machine Intelligence - http://arxiv.org/abs/1511.08130
Python
5
star
17

multipathnet

A Torch implementation of the object detection network from "A MultiPath Network for Object Detection" (https://arxiv.org/abs/1604.02135)
Lua
5
star
18

deepmask

Torch implementation of DeepMask and SharpMask
Lua
5
star
19

kubernetes

Production-Grade Container Scheduling and Management
Go
5
star
20

kubernetes-cluster-federation

Kubernetes cluster federation tutorial
Shell
5
star
21

drill

Schema-free SQL for Hadoop, NoSQL and Cloud Storage
Java
5
star
22

torch

A Scientific Computing Framework for LuaJIT
Jupyter Notebook
5
star
23

pysparnn

Approximate Nearest Neighbor Search for Sparse Data in Python
Python
4
star
24

avro

Apache Avro
Java
3
star
25

airflow

Apache Airflow
Python
3
star
26

horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
Python
3
star
27

data-platform-ai

Data Platform for Large Scale Data Processing and AI & Machine Learning/Deep Learning
2
star
28

hive

Apache Hive
Java
2
star
29

presto

Distributed SQL query engine for big data https://prestodb.io
Java
2
star
30

protobuf

Protocol Buffers - Google's data interchange format
C++
2
star
31

sereal

Fast, compact, schema-less, binary serialization and deserialization oriented towards dynamic languages
C
2
star
32

kuryr

Container and Orchestration remote driver for OpenStack Neutron
Python
2
star
33

kafka-rest-node

Node.js client for the Kafka REST proxy
JavaScript
2
star
34

kafka-examples

Applications, templates and code examples for Apache Kafka
Java
2
star
35

spark

Apache Spark
Scala
2
star
36

druid

Apache Druid (Incubating) - Column oriented distributed data store ideal for powering interactive applications
Java
2
star
37

rokku

Rokku project. This project acts as a proxy on top of any S3 storage solution, providing services like authentication, authorisation, short-term tokens and lineage.
Scala
2
star
38

kafka-ansible

Ansible playbooks for Kafka
Shell
1
star
39

docker-spark

Apache Spark docker image
Shell
1
star
40

message-backbone

Message queue backbone for event handling
1
star
41

docker-xserver

Docker Image with Xserver, OpenBLAS and correct user settings
Shell
1
star
42

calcite

Apache Calcite
Java
1
star
43

kafka-connect-jdbc

Kafka Connect connector for JDBC-compatible databases
Java
1
star
44

kafka-connect-storage-common

Shared software among connectors that target distributed filesystems and cloud storage
Java
1
star
45

marmaray

Generic Data Ingestion & Dispersal Library for Hadoop
Java
1
star
46

gitolly

Clone all of your GitHub repositories from the command line using Python
Python
1
star
47

parquet-mr

Apache Parquet
Java
1
star
48

kafka-python

Kafka Python client
C
1
star
49

docker-torch-jupyter

Docker image for deep learning with Torch and Jupyter
1
star
50

kafka-connect-elasticsearch

Kafka Connect Elasticsearch connector
Java
1
star
51

manifests

Deploy manifests for the single-container version of Netsil AOC
Shell
1
star
52

pinot

Apache Pinot - A realtime distributed OLAP datastore
Java
1
star
53

kafka-connect-hdfs

Kafka Connect HDFS connector
Java
1
star
54

baker

Orchestrate microservice-based process flows
Scala
1
star
55

docker-deeplearning

Information and scripts to run and develop Deep Learning Docker containers
Shell
1
star
56

caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework
Jupyter Notebook
1
star
57

clocks

Time, Clocks, and the Ordering of Events
Python
1
star
58

kafka-rest-utils

Utilities and a small framework for building REST services with Jersey, Jackson, and Jetty.
Java
1
star
59

kafka-schema-registry

Schema registry for Kafka
Java
1
star
60

kafka-rest

Confluent REST Proxy for Kafka
Java
1
star
61

pyhive

Python interface to Hive and Presto.
Python
1
star
62

torch-nn

Efficient, reusable RNNs and LSTMs for torch
Lua
1
star
63

kafka-go

Kafka Golang client
Go
1
star
64

utils

CLI, Scripts, etc.
Python
1
star
65

xgboost

Scalable, portable, and distributed Gradient Boosting (GBDT, GBRT or GBM) library, for Python, R, Java, Scala, C++ and more. Runs on single host, Hadoop, Spark, Flink and DataFlow
C++
1
star
66

peloton

Unified Resource Scheduler to co-schedule mixed types of workloads such as batch, stateless and stateful jobs in a single cluster for better resource utilization.
Go
1
star
67

petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
Python
1
star
68

arx

ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-identification risks and it supports well-known privacy models, such as k-anonymity, l-diversity, t-closeness and differential privacy.
Java
1
star
69

kafka-connect-storage-cloud

Kafka Connect suite of connectors for Cloud storage (currently including Amazon S3)
Java
1
star