• This repository has been archived on 24/Apr/2023
  • Stars: 338
  • Rank: 124,931 (Top 3 %)
  • Language: Clojure
  • License: Apache License 2.0
  • Created about 9 years ago
  • Updated over 1 year ago

Repository Details

Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark

⚠️ Cook Scheduler Development Has Ceased

After seven years of developing Cook Scheduler we have made the decision to archive the project. Cook will remain available on GitHub in archive mode but no further development will occur.

When Cook was open sourced it solved difficult problems in on-premises, capacity-constrained data centers. Today, however, the embrace of the public cloud has changed the problems that need to be solved. This shift is also reflected in slowing community contribution to Cook and the emergence of many other open source projects in this space. Given this, it no longer makes sense for us to maintain Cook as an open source project.

We are thankful for the opportunity to have shared Cook with the community and grateful for your contributions. Two Sigma remains committed to supporting open source software. You can find out more about our other projects and contributions here: https://www.twosigma.com/open-source/.

Cook Scheduler

Welcome to Two Sigma's Cook Scheduler!

What is Cook?

  • Cook is a powerful batch scheduler, specifically designed to provide a great user experience when there are more jobs to run than your cluster has capacity for.
  • Cook is able to intelligently preempt jobs to ensure that no user ever needs to wait long to get quick answers, while simultaneously helping you to achieve 90%+ utilization for massive workloads.
  • Cook has been battle-hardened to automatically recover after dozens of classes of cluster failures.
  • Cook can act as a Spark scheduler, and it comes with a REST API, Java client, Python client, and CLI.

Core concepts is a good place to start to learn more.

Releases

Check the changelog for release info.

Subproject Summary

In this repository, you'll find several subprojects, each of which has its own documentation.

  • scheduler - This is the actual Mesos framework, Cook. It comes with a JSON REST API.
  • jobclient - This includes the Java and Python APIs for Cook, both of which use the REST API under the hood.
  • spark - This contains the patch to Spark to enable Cook as a backend.
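As a rough illustration of what a client sends to the scheduler's JSON REST API, the sketch below builds a job-submission body in Python. The field names (uuid, command, cpus, mem, max_retries) and the top-level "jobs" list are assumptions made for illustration, and build_job_request is a hypothetical helper; check the scheduler subproject's REST API documentation for the authoritative schema and endpoint before relying on any of it.

```python
import json
import uuid
from typing import Any, Dict


def build_job_request(command: str, cpus: float, mem: int, max_retries: int = 1) -> Dict[str, Any]:
    """Build a JSON-serializable job submission body.

    Hypothetical helper: the field names below are assumptions,
    not a confirmed description of Cook's schema.
    """
    job = {
        "uuid": str(uuid.uuid4()),   # client-generated job identifier
        "command": command,          # shell command the job runs
        "cpus": cpus,                # requested CPU share
        "mem": mem,                  # requested memory
        "max_retries": max_retries,  # attempts before the job is considered failed
    }
    return {"jobs": [job]}


body = build_job_request("ls", cpus=0.5, mem=32)
print(json.dumps(body, indent=2))
```

The sketch stops at building the body on purpose: the endpoint path and any authentication are not described here, so sending the request is left to the Java client, Python client, or CLI.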

Please visit the scheduler subproject first to get started.

Quickstart

Using Google Kubernetes Engine (GKE)

The quickest way to get Cook running locally against GKE is with Vagrant.

  1. Install Vagrant
  2. Install VirtualBox
  3. Clone down this repo
  4. Run GCP_PROJECT_NAME=<gcp_project_name> PGPASSWORD=<random_string> vagrant up --provider=virtualbox to create the dev environment
  5. Run vagrant ssh to ssh into the dev environment

In your Vagrant dev environment

  1. Run gcloud auth login to log in to Google Cloud
  2. Run bin/make-gke-test-clusters to create GKE clusters
  3. Run bin/start-datomic.sh to start Datomic (Cook database) (Wait until "System started datomic:free://0.0.0.0:4334/, storing data in: data")
  4. Run lein exec -p datomic/data/seed_k8s_pools.clj $COOK_DATOMIC_URI to seed some Cook pools in the database
  5. Run bin/run-local-kubernetes.sh to start the Cook scheduler
  6. Cook should now be listening locally on port 12321
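One way to confirm the last step is to poll the port until it accepts a TCP connection. The following is a minimal sketch using only the Python standard library; wait_for_port is a hypothetical helper, and the timeout and interval values are arbitrary choices. A successful connection only shows that something is listening on the port, not that the scheduler has finished starting up.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connecting proves that a process is listening on the port;
            # it says nothing about whether that process is fully initialized.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 12321) returns True once the port from step 6 is open, or False after roughly 60 seconds.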

To test a simple job submission:

  1. Run cs submit --pool k8s-alpha --cpu 0.5 --mem 32 --docker-image gcr.io/google-containers/alpine-with-bash:1.0 ls to submit a simple job
  2. Run cs show <job_uuid> to show the status of your job (it should eventually show Success)
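When submissions are scripted, it can help to assemble the cs submit arguments programmatically rather than by string concatenation. The sketch below uses only the flags shown in step 1 (--pool, --cpu, --mem, --docker-image); build_submit_command is a hypothetical helper, not part of the Cook CLI or its clients.

```python
import shlex
from typing import List


def build_submit_command(pool: str, cpu: float, mem: int, docker_image: str, command: str) -> List[str]:
    """Build the argv for `cs submit` from the flags used in the step above."""
    return [
        "cs", "submit",
        "--pool", pool,
        "--cpu", str(cpu),
        "--mem", str(mem),
        "--docker-image", docker_image,
        command,  # the command the job runs, passed as the final argument
    ]


argv = build_submit_command(
    pool="k8s-alpha",
    cpu=0.5,
    mem=32,
    docker_image="gcr.io/google-containers/alpine-with-bash:1.0",
    command="ls",
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing argv as a list to subprocess.run(argv, check=True) avoids shell quoting problems when the job command contains spaces or special characters.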

To run automated tests:

  1. Run lein test :all-but-benchmark to run unit tests
  2. Run cd ../integration && pytest -m 'not cli' to run integration tests
  3. Run cd ../integration && pytest tests/cook/test_basic.py -k test_basic_submit -n 0 -s to run a particular integration test

Using Mesos

The quickest way to get Mesos and Cook running locally is with Docker and minimesos.

  1. Install Docker
  2. Clone down this repo
  3. cd scheduler
  4. Run bin/build-docker-image.sh to build the Cook scheduler image
  5. Run ../travis/minimesos up to start Mesos and ZooKeeper using minimesos
  6. Run bin/run-docker.sh to start the Cook scheduler
  7. Cook should now be listening locally on port 12321

Contributing

In order to accept your code contributions, please fill out the appropriate Contributor License Agreement in the cla folder and submit it to [email protected].

Disclaimer

Apache Mesos is a trademark of The Apache Software Foundation. The Apache Software Foundation is not affiliated with, endorsed by, connected to, sponsored by, or otherwise associated in any way with Two Sigma, Cook, or this website.

© Two Sigma Open Source, LLC
