• Stars
    12
  • Rank 1,597,372 (Top 32 %)
  • Language
    Jupyter Notebook
  • License
    Apache License 2.0
  • Created almost 5 years ago
  • Updated over 1 year ago

Repository Details

Provides functionality for building statistical models that repair dirty tabular data in Spark

More Repositories

1

spark-tpcds-datagen

All the things about TPC-DS in Apache Spark
Scala
104
star
2

spark-sql-flow-plugin

Visualize column-level data lineage in Spark SQL
Scala
86
star
3

lljvm-translator

A lightweight library to inject LLVM bitcode into JVMs
C++
81
star
4

spark-sql-server

Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
Scala
34
star
5

integer_encoding_library

An encoder/decoder collection for a sequence of integers
C++
32
star
6

hivemall-spark

A Hivemall wrapper for Spark
Scala
31
star
7

datasketches-spark

Data Sketches for Apache Spark
Scala
21
star
8

vpacker

A simple integer compression library for C/C++/Java
C
14
star
9

dbitv

An x86/64-optimized rank/select dictionary for dense bit-arrays
C++
12
star
10

spark-kinesis-sql-asl

Amazon Kinesis Source for Structured Streaming
Scala
12
star
11

spark-query-log-plugin

A simple toolkit to analyze Spark query logs
Scala
10
star
12

fuzz-testing-for-spark

[WIP] Run SQL-aware fuzz tests for the Catalyst optimizer in Apache Spark
C++
6
star
13

lockbench

A benchmark for a variety of spin-lock implementations using x86 primitives
C
4
star
14

bitonic_sort

Hardware-conscious, high-speed sorting code
Shell
4
star
15

spark-graphx-pregel-personalized-pagerank

Personalized PageRank on Pregel/GraphX
Scala
4
star
16

mlflow-example

Example code for MLflow
Python
3
star
17

spark-executor-dict-plugin

Fast Read-only Data Dictionary Attached to Each Spark Executor
Scala
3
star
18

jupyterlab-dockerfile

A Dockerfile for JupyterLab with PySpark included
Python
2
star
19

membench

A micro-benchmark for memory components such as caches and the TLB
C++
2
star
20

lljvm-example

Example code for lljvm-translator
Shell
2
star
21

jvmci-test

A toy box to test JVMCI in JDK11
C++
1
star