Discover the United States' Leading Open Source Projects: Explore top-notch open source initiatives from the vibrant tech community of the United States.
twitter/the-algorithm
Source code for Twitter's Recommendation Algorithm
prisma/prisma1
Database Tools incl. ORM, Migrations and Admin UI (Postgres, MySQL & MongoDB) [deprecated]
twitter/finagle
A fault tolerant, protocol-agnostic RPC system
twitter-archive/snowflake
Snowflake is a network service for generating unique ID numbers at high scale with some simple guarantees.
enso-org/enso
Hybrid visual and textual functional programming.
microsoft/SynapseML
Simple and Distributed Machine Learning
airbnb/aerosolve
A machine learning package built for humans.
mesos/chronos
Fault tolerant job scheduler for Mesos which handles dependencies and ISO8601 based schedules
mesosphere/marathon
Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
twitter-archive/diffy
Find potential bugs in your services with Diffy
twitter/scalding
A Scala API for Cascading
twitter-archive/flockdb
A distributed, fault-tolerant graph database
Netflix/atlas
In-memory dimensional time series database.
awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
twitter-archive/kestrel
simple, distributed message queue system (inactive)
twitter/util
Wonderful reusable code from Twitter
databricks/Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
twitter/algebird
Abstract Algebra for Scala
twitter/finatra
Fast, testable, Scala services built on TwitterServer and Finagle
twitter-archive/gizzard
[Archived] A flexible sharding framework for creating eventually-consistent distributed datastores
twitter/summingbird
Streaming MapReduce with Scalding and Storm
metarank/metarank
A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagement. A friendly Learn-to-Rank engine
feathr-ai/feathr
Feathr - A scalable, unified data and AI engineering platform for enterprise
sangria-graphql/sangria
Scala GraphQL implementation
riscv-boom/riscv-boom
SonicBOOM: The Berkeley Out-of-Order Machine
ucb-bar/chipyard
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
ThoughtWorksInc/Binding.scala
Reactive data-binding for Scala
twitter/twitter-server
Twitter-Server defines a template from which services at Twitter are built
GravityLabs/goose
Html Content / Article Extractor in Scala - open sourced from Gravity Labs
sryza/aas
Code to accompany Advanced Analytics with Spark from O'Reilly Media
holdenk/spark-testing-base
Base classes to use when writing tests with Spark
combust/mleap
MLeap: Deploy ML Pipelines to Production
pathikrit/better-files
Simple, safe and intuitive Scala I/O
vkostyukov/scalacaster
Purely Functional Algorithms and Data Structures in Scala
paypal/squbs
Akka Streams & Akka HTTP for Large-Scale Production Deployments
mauricio/postgresql-async
Async, Netty based, database drivers for PostgreSQL and MySQL written in Scala
locationtech/geomesa
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
mesos/spark
Lightning-fast cluster computing in Java, Scala and Python.
twitter-archive/iago
A load generator, built for engineers
locationtech/geotrellis
GeoTrellis is a geographic data processing engine for high performance applications.
twitter/rsc
Experimental Scala compiler focused on compilation speed
sryza/spark-timeseries
A library for time series analysis on Apache Spark
wavesplatform/Waves
Reference Waves Blockchain Node (client) implementation on Scala
databricks/LearningSparkV2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
tumblr/colossus
I/O and Microservice library for Scala
pauljamescleary/scala-pet-store
An implementation of the java pet store using FP techniques in scala
sifive/freedom
Source files for SiFive's Freedom platforms
databricks/spark-csv
CSV Data Source for Apache Spark 1.x
twitter/cassovary
Cassovary is a simple big graph processing library for the JVM
TIBCOSoftware/snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
lensesio/stream-reactor
A collection of open source Apache 2.0 Kafka Connector maintained by Lenses.io.
twosigma/flint
A Time Series Library for Apache Spark
cloudera/livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
amplab/shark
Development in Shark has been ended.
broadinstitute/cromwell
Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
h2oai/sparkling-water
Sparkling Water provides H2O functionality inside Spark cluster
eaplatanios/tensorflow_scala
TensorFlow API for the Scala Programming Language
wzhe06/SparkCTR
CTR prediction model based on spark(LR, GBDT, DNN)
twitter/twitter-korean-text
Korean tokenizer
precog/matryoshka
Generalized recursion schemes and traversals for Scala.
NVIDIA/spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
ucb-bar/gemmini
Berkeley's Spatial Array Generator
twitter/scrooge
A Thrift parser/generator
twitter-archive/ostrich
A stats collector & reporter for Scala servers (deprecated)
ThoughtWorksInc/DeepLearning.scala
A simple library for creating complex neural networks
databricks/tensorframes
[DEPRECATED] Tensorflow wrapper for DataFrames on Apache Spark
MrPowers/spark-daria
Essential Spark extensions and helper methods
gregdurrett/berkeley-doc-summarizer
The Berkeley Document Summarizer is a learning-based, single-document summarization system that extracts source document content, exploits syntactic information to compress it, and uses coreference constraints to ensure clarity.
jdegoes/blueeyes
A lightweight Web 3.0 framework for Scala, featuring a purely asynchronous architecture, extremely high-performance, massive scalability, high usability, and a functional, composable design.
p2t2/figaro
Figaro Programming Language and Core Libraries
airbnb/chronon
Chronon is a data platform for serving for AI/ML applications.
ezhulenev/orderbook-dynamics
Modeling high-frequency limit order book dynamics with support vector machines
sksamuel/avro4s
Avro schema generation and serialization / deserialization for Scala
rchain/rchain
Blockchain (smart contract) platform using CBC-Casper proof of stake + Rholang for concurrent execution.
ucb-bar/riscv-sodor
educational microarchitectures for risc-v isa
actionml/universal-recommender
Highly configurable recommender based on PredictionIO and Mahout's Correlated Cross-Occurrence algorithm
sameeragarwal/blinkdb
BlinkDB: Sub-Second Approximate Queries on Very Large Data.
twitter/bijection
Reversible conversions between types
databricks/reference-apps
Spark reference applications
deanwampler/programming-scala-book-code-examples
The code examples used in Programming Scala, 2nd and 3rd Editions (O'Reilly)
ucb-bar/chisel-tutorial
chisel tutorial exercises and answers
shadaj/slinky
Write Scala.js React apps just like you would in ES6
open-korean-text/open-korean-text
Open Korean Text Processor - An Open-source Korean Text Processor
twitter/chill
Scala extensions for the Kryo serialization library
amplab/SparkNet
Distributed Neural Networks for Spark
databricks/spark-redshift
Redshift data source for Apache Spark
allenai/pdffigures2
Given a scholarly PDF, extract figures, tables, captions, and section titles.
tpolecat/tut
doc/tutorial generator for scala
tumblr/collins
groovy kind of love
enragedginger/akka-quartz-scheduler
Quartz Extension and utilities for cron-style scheduling in Akka
Netflix/edda
AWS API Read Cache
jsuereth/scala-arm
This project aims to be the Scala Incubator project for Automatic-Resource-Management in the scala library
hyperledger-labs/Scorex
Scorex 2.0 Core
databricks/spark-sql-perf
ucb-bar/riscv-mini
Simple RISC-V 3-stage Pipeline in Chisel
databricks/spark-avro
Avro Data Source for Apache Spark
Stratio/sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
outr/scribe
The fastest logging library in the world. Built from scratch in Scala and programmatically configurable.
guardrail-dev/guardrail
Principled code generation from OpenAPI specifications
orbeon/orbeon-forms
Orbeon Forms is an open source web forms solution. It includes an XForms engine, the Form Builder web-based form editor, and the Form Runner runtime.