kafka
A high-throughput, distributed, publish-subscribe messaging system
flink
Scalable Batch and Stream Data Processing
fbtftp
fbtftp is Facebook's implementation of a dynamic TFTP server framework
beam
Unified programming model to create data processing pipelines for batch and streaming models
caffe
Caffe: a fast open framework for deep learning
mixer
Mixed Incremental Cross-Entropy REINFORCE ICLR 2016
fboss
Facebook Open Switching System Software for controlling network switches
fasttext
Library for fast text representation and classification
felix
Project Calico core repository
bistro
Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
heapster
Compute Resource Usage Analysis and Monitoring of Container Clusters
hudi
Upserts And Incremental Processing on Big Data
helm
The Kubernetes Package Manager
darkforestgo
DarkForest, the Facebook Go engine
charts
Curated applications for Kubernetes using Helm charts with integrated Deployment Manager templates
commai-env
A platform for developing AI systems as described in A Roadmap towards Machine Intelligence - http://arxiv.org/abs/1511.08130
multipathnet
A Torch implementation of the object detection network from "A MultiPath Network for Object Detection" (https://arxiv.org/abs/1604.02135)
deepmask
Torch implementation of DeepMask and SharpMask
kubernetes
Production-Grade Container Scheduling and Management
kubernetes-cluster-federation
Kubernetes cluster federation tutorial
drill
Schema-free SQL for Hadoop, NoSQL and Cloud Storage
torch
A Scientific Computing Framework for LuaJIT
pysparnn
Approximate Nearest Neighbor Search for Sparse Data in Python
avro
Apache Avro
airflow
Apache Airflow
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
hive
Apache Hive
presto
Distributed SQL query engine for big data https://prestodb.io
protobuf
Protocol Buffers - Google's data interchange format
sereal
Fast, compact, schema-less, binary serialization and deserialization oriented towards dynamic languages
kuryr
Container and Orchestration remote driver for OpenStack Neutron
kafka-rest-node
Node.js client for the Kafka REST proxy
kafka-examples
Applications, templates and code examples for Apache Kafka
spark
Apache Spark
druid
Apache Druid (Incubating) - Column oriented distributed data store ideal for powering interactive applications
rokku
Rokku project. This project acts as a proxy on top of any S3 storage solution, providing services like authentication, authorisation, short-term tokens and lineage.
kafka-ansible
Ansible playbooks for Kafka
workspaces
Workspaces 2.0 demo
docker-spark
Apache Spark docker image
message-backbone
Message queue backbone for event handling
docker-xserver
Docker Image with Xserver, OpenBLAS and correct user settings
calcite
Apache Calcite
kafka-connect-jdbc
Kafka Connect connector for JDBC-compatible databases
kafka-connect-storage-common
Shared software among connectors that target distributed filesystems and cloud storage
marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
gitolly
Clone all of your Github repositories from the command line using Python
parquet-mr
Apache Parquet
kafka-python
Kafka Python client
docker-torch-jupyter
Docker image for deep learning with Torch and Jupyter
kafka-connect-elasticsearch
Kafka Connect Elasticsearch connector
manifests
Deploy manifests for the single-container version of Netsil AOC
pinot
Apache Pinot - A realtime distributed OLAP datastore
kafka-connect-hdfs
Kafka Connect HDFS connector
baker
Orchestrate microservice-based process flows
docker-deeplearning
Information and scripts to run and develop Deep Learning Docker containers
caffe2
Caffe2 is a lightweight, modular, and scalable deep learning framework
clocks
Time, Clocks, and the Ordering of Events
kafka-rest-utils
Utilities and a small framework for building REST services with Jersey, Jackson, and Jetty.
kafka-schema-registry
Schema registry for Kafka
kafka-rest
Confluent REST Proxy for Kafka
pyhive
Python interface to Hive and Presto.
torch-nn
Efficient, reusable RNNs and LSTMs for torch
kafka-go
Kafka Golang client
utils
CLI, Scripts, etc.
xgboost
Scalable, portable, and distributed Gradient Boosting (GBDT, GBRT or GBM) library, for Python, R, Java, Scala, C++ and more. Runs on single host, Hadoop, Spark, Flink and DataFlow
peloton
Unified Resource Scheduler to co-schedule mixed types of workloads such as batch, stateless and stateful jobs in a single cluster for better resource utilization.
petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
arx
ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-identification risks and it supports well-known privacy models, such as k-anonymity, l-diversity, t-closeness and differential privacy.
kafka-connect-storage-cloud
Kafka Connect suite of connectors for Cloud storage (currently including Amazon S3)