CPU and GPU-accelerated Machine Learning Library
Spark library for easy MongoDB access
The missing MatPlotLib for Scala + Spark
A Scala kernel for Jupyter
Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
Breeze is a numerical processing library for Scala.
Lightweight real-time big data streaming engine over Akka
Scala library for accessing various file systems, batch systems, job schedulers, and grid middleware.
Cloud-native genomic dataframes and batch computing
A simplified, lightweight ETL Framework based on Apache Spark
Spark DataFrames for earth observation data
A Scala API for Cascading
Schema registry for CSV, TSV, JSON, Avro, and Parquet schemas. Supports schema inference and a GraphQL API.
A Scala productivity framework for Hadoop.
Scala DSL on top of Oozie XML
Apache Spark - A unified analytics engine for large-scale data processing
Deploy a Spark cluster the easy way.
Executable Apache Spark Tools: Format Converter & SQL Processor
Basic framework utilities to quickly start writing production-ready Apache Spark applications
Spark package to "plug" holes in data using SQL-based rules
Real Time Analytics and Data Pipelines based on Spark Streaming
Streaming MapReduce with Scalding and Storm