  • Stars: 238
  • Rank: 168,834 (Top 4 %)
  • Language: Scala
  • License: Apache License 2.0
  • Created about 9 years ago
  • Updated over 1 year ago

Repository Details

Benchmark Suite for Apache Spark

spark-bench

READ OUR DOCS

All of the documentation for Spark-Bench is on our shiny new docs site: https://codait.github.io/spark-bench/

Versions And Compatibility

Spark Version

Spark-Bench is currently compiled against the Spark 2.1.1 jars and should work with Spark 2.x. If you experience compatibility issues between Spark-Bench and any 2.x version of Spark, please let us know!

Scala Version

Spark-Bench is written in Scala 2.11.8. It is incompatible with Spark builds that run on Scala 2.10.x.
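
As a rough illustration of the version constraints above, here is a minimal sketch of a pre-flight check written in Python. It is not part of Spark-Bench: `parse_version` and `check_compatibility` are hypothetical helper names, and treating every Scala release other than 2.11.x as a mismatch is a conservative assumption (the text above only rules out 2.10.x explicitly).

```python
import re

# Assumptions taken from the README text: Spark-Bench targets Spark 2.x
# and is written against Scala 2.11; Scala 2.10.x is stated to be incompatible.
SUPPORTED_SPARK_MAJOR = 2
SUPPORTED_SCALA_BINARY = (2, 11)


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def check_compatibility(spark_version, scala_version):
    """Hypothetical helper: list the ways a cluster differs from the stated versions."""
    problems = []
    if parse_version(spark_version)[0] != SUPPORTED_SPARK_MAJOR:
        problems.append(f"Spark {spark_version} is not a 2.x release")
    # Conservative assumption: anything other than Scala 2.11.x is flagged.
    if parse_version(scala_version)[:2] != SUPPORTED_SCALA_BINARY:
        problems.append(f"Scala {scala_version} is not a 2.11.x release")
    return problems


if __name__ == "__main__":
    for issue in check_compatibility("2.3.0", "2.10.6"):
        print("WARNING:", issue)
```

The two version strings have to come from your own cluster; running `spark-submit --version` is one common way to see both the Spark and Scala versions of an installation. An empty list means the cluster matches the versions stated above, not that every workload is guaranteed to run.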

Installation

Follow the Quickstart guide from our docs site. For more details, see the Installation page.

Legacy Version

spark-bench has recently gone through an extensive rewrite. While we think you'll like the new capabilities, it does not yet have feature parity with the previous version of spark-bench. Many of the workloads that were available in the legacy version have not yet been ported over, but they will be!

In the meantime, if you would like to see the old version of spark-bench, it's preserved in the legacy branch.

You can also grab the last official release of the legacy version from here.

More Repositories

1. text-extensions-for-pandas (Jupyter Notebook, 217 stars): Natural language processing support for Pandas dataframes.
2. deep-histopath (Jupyter Notebook, 203 stars): A deep learning approach to predicting breast tumor proliferation scores for the TUPAC16 challenge
3. stocator (Java, 111 stars): Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
4. covid-notebooks (Jupyter Notebook, 104 stars): Jupyter notebooks that analyze COVID-19 time series data
5. max-central-repo (78 stars): Central Repository of Model Asset Exchange project. This repository contains information about the available models, current project status, contribution guidelines and supporting assets.
6. aardpfark (Scala, 54 stars): A library for exporting Spark ML models and pipelines to PFA
7. presentations (Jupyter Notebook, 52 stars): Talks & Workshops by the CODAIT team
8. r4ml (R, 42 stars): Scalable R for Machine Learning
9. spark-ref-architecture (Scala, 38 stars): Reference Architectures for Apache Spark
10. graph_def_editor (Python, 31 stars): GraphDef Editor: A port of the TensorFlow contrib.graph_editor package that operates over serialized graphs
11. magicat (JavaScript, 26 stars): 🧙😺 magicat - Deep learning magic.. with the convenience of cat!
12. node-red-contrib-model-asset-exchange (JavaScript, 20 stars): Node-RED nodes for the Model Asset Exchange on IBM Developer
13. max-tfjs-models (JavaScript, 18 stars): Pre-trained TensorFlow.js models for the Model Asset Exchange
14. pardata (Python, 17 stars)
15. nlp-editor (JavaScript, 15 stars): Visual Editor for Natural Language Processing pipelines
16. flight-delay-notebooks (Jupyter Notebook, 15 stars): Analyzing flight delay and weather data using Elyra, IBM Data Asset Exchange, Kubeflow Pipelines and KFServing
17. spark-db2 (Scala, 14 stars): DB2/DashDB Connector for Apache Spark
18. redrock (Scala, 14 stars): RedRock - Mobile Application prototype using Apache Spark, Twitter and Elasticsearch
19. spark-netezza (Scala, 13 stars): Netezza Connector for Apache Spark
20. Identifying-Incorrect-Labels-In-CoNLL-2003 (Jupyter Notebook, 12 stars): Research into identifying and correcting incorrect labels in the CoNLL-2003 corpus.
21. max-vis (JavaScript, 9 stars): Image annotation library and command-line utility for MAX image models
22. fae-tfjs (JavaScript, 9 stars)
23. WELCOME-TO-CODAIT (8 stars): Welcome to the Center for Open-Source Data & AI Technologies (CODAIT) organization on GitHub! Learn more about our projects ...
24. spark-tracing (Scala, 8 stars): A flexible instrumentation package for visualizing the internal operation of Apache Spark and related tools
25. redrock-v2 (Jupyter Notebook, 8 stars): RedRock v2 Repository
26. max-node-red-docker-image (Dockerfile, 8 stars): Demo Docker image for the Model Asset Exchange Node-RED module
27. max-workshop-oscon-2019 (7 stars)
28. notebook-exporter (Scala, 6 stars): One Click deployment of Notebooks - Bringing Notebooks to Production
29. redrock-ios (JavaScript, 4 stars): RedRock - Mobile Application prototype
30. max-base (Python, 4 stars): This repo has been moved
31. max-status (4 stars): Current status of the Model Asset Exchange ecosystem
32. project-codenet-notebooks (Jupyter Notebook, 3 stars)
33. MAX-Web-App-skeleton (JavaScript, 3 stars): A fully functioning skeleton for MAX model web apps
34. development-guidelines (3 stars): Development Guidelines and related resources for IBM Spark Technology Center
35. codait.github.io (HTML, 3 stars): CODAIT Homepage
36. dax-schemata (Python, 2 stars)
37. redrock-v2-ios (JavaScript, 2 stars): RedRock v2 iPad Application
38. max-pytorch-mnist (Jupyter Notebook, 2 stars)
39. teach-nao-robot-a-new-skill (2 stars): Teach your NAO robot a new skill using deep learning microservices
40. max-fashion-mnist-tutorial-app (Python, 1 star)
41. MAX-cloud-deployment-cheatsheets (1 star): Work in progress
42. ddc-data-and-ai-2021-automate-using-open-source (Jupyter Notebook, 1 star)
43. exchange-metadata-converter (Python, 1 star): Basic conversion utility for YAML-based metadata descriptors
44. streaming-integration-sample (Scala, 1 star)
45. covid-trusted-ai-pipeline (Jupyter Notebook, 1 star)