  • Stars: 114
  • Rank: 308,031 (Top 7%)
  • Language: Scala
  • License: Apache License 2.0
  • Created: over 7 years ago
  • Updated: 4 months ago


Repository Details

An embedded job scheduler.

Cuttle

An embedded job scheduler/executor for your Scala projects.

Concepts

Embedded means that cuttle is not a hosted service to which you submit jobs to schedule and execute. Instead, it is a Scala library that you embed into your own project to schedule and execute a DAG of jobs. The DAG and the job definitions are all written using the cuttle Scala API, and the scheduling mechanism can be customized.

Jobs

A cuttle project is composed of many Jobs to execute.

Each job is defined by a set of metadata (such as the job identifier, name, etc.) and, most importantly, by a side effect function. This function handles the actual job execution; its Scala signature is something like Context => Future[Completed] (which can be read as “execute the job for this input parameter and signal completion or failure via the returned Future value”).
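To make the shape of that contract concrete, here is a minimal sketch in plain Scala. The `Context` and `Completed` types below are illustrative stand-ins, not cuttle's actual API: the point is only that a job's side effect is a function from a scheduling context to a `Future` that resolves on success and fails on error.

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object SideEffectSketch {
  // Stand-ins for cuttle's scheduling context and completion marker.
  final case class Context(partition: String)
  sealed trait Completed
  case object Completed extends Completed

  // The side effect function: Context => Future[Completed].
  val copyPartition: Context => Future[Completed] = ctx =>
    Future {
      // ...the actual work for this partition would happen here...
      println(s"processing ${ctx.partition}")
      Completed
    }

  def main(args: Array[String]): Unit = {
    val result = Await.result(copyPartition(Context("2024-01-01T00")), 5.seconds)
    println(result)
  }
}
```

The executor only ever sees the returned `Future`; everything inside the function body is opaque to it.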

The side effect function is opaque to cuttle, so cuttle cannot know exactly what will happen inside (it can be any Scala code), but it assumes that the function:

  • Is asynchronous and non-blocking: it immediately returns a Future value that is resolved upon execution success or failure.
  • Produces a side effect: calling it actually performs some work and mutates state somewhere.
  • Is idempotent: calling it twice with the same input (context) is not a problem.

Being idempotent is important because cuttle is an at-least-once executor: it ensures that a job has been successfully executed at least once for a given input. In case of failure or crash it may have to execute the job again, so the side effect function may end up succeeding more than once; a non-idempotent function would make this very brittle.
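The at-least-once behavior can be sketched as a driver that re-invokes a side effect until it succeeds. This is an illustrative model, not cuttle's implementation; it shows why the effect may run more than once for the same input and therefore must be idempotent.

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object AtLeastOnce {
  // Re-invokes the side effect on failure, up to maxAttempts times in total.
  // The effect may therefore execute several times for one logical input.
  def runUntilSuccess[A](sideEffect: () => Future[A], maxAttempts: Int): Future[A] =
    sideEffect().recoverWith {
      case _ if maxAttempts > 1 => runUntilSuccess(sideEffect, maxAttempts - 1)
    }

  def main(args: Array[String]): Unit = {
    var calls = 0
    // A flaky effect: fails twice, then succeeds on the third call.
    val flaky = () => Future {
      calls += 1
      if (calls < 3) sys.error("transient failure") else "done"
    }
    println(Await.result(runUntilSuccess(flaky, 5), 5.seconds))
  }
}
```

If the effect wrote to an external system non-idempotently (say, appending rows), the two failed attempts could leave partial state behind; an idempotent effect makes the retries harmless.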

Scheduler

Executions of these jobs are planned by a Scheduler. A job is always configured for a specific Scheduling, which is the type S you usually see in the Scala API. This scheduling information tells the scheduler how the job must be triggered.

The scheduler takes the list of jobs (a scheduling-specific Workload) as input and starts producing Executions. A basic scheduler could, for example, run a single execution of each job.

But of course more sophisticated schedulers can exist. Cuttle comes with a TimeSeries scheduler that executes a whole job workflow (a Directed Acyclic Graph of jobs) across time partitions. For example it can execute the graph hourly or daily. And it can even execute it across different time partitions such as a daily job depending on several executions of an hourly job.

The input context given to the side effect function depends on the scheduling. For example, the input for a time series job is a TimeSeriesContext, which basically contains the start and end time of the partition for which the job is being executed.
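As a sketch, a time-series context boils down to a [start, end) interval, and the scheduler produces one such context per partition. The types and the partitioning helper below are illustrative, not cuttle's actual TimeSeriesContext.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

object TimeSeriesSketch {
  // Illustrative model: the core of a time-series context is the partition's
  // start and end instants, handed to the job's side effect function.
  final case class TimeSeriesContext(start: Instant, end: Instant)

  // Split a time range into hourly partitions, one context per execution.
  def hourlyPartitions(from: Instant, to: Instant): List[TimeSeriesContext] =
    Iterator
      .iterate(from)(_.plus(1, ChronoUnit.HOURS))
      .takeWhile(_.isBefore(to))
      .map(s => TimeSeriesContext(s, s.plus(1, ChronoUnit.HOURS)))
      .toList
}
```

A daily job depending on an hourly one would, in this model, wait for the 24 hourly contexts covering its own partition to complete.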

Executor

The cuttle Executor handles the job executions triggered by the scheduler. When it has to execute a job for a given SchedulingContext, it creates an execution and then invokes the job's side effect function for it.

As soon as the execution starts, it is in the Started state. Started executions are displayed in the UI with a status indicating whether they are Running or Waiting; this reflects whether the Scala code currently executing is waiting for some external resource (the permit to fork an external process, for example). But as soon as an execution is Started, the underlying Scala lambda is running!

An execution can also be in the Stuck state, which happens when a given execution keeps failing. Say the scheduler wants to execute job a for context X: it asks the executor, which eventually invokes the job's side effect function. If the function fails, the returned Future fails and the scheduler is notified of the failure. Because the scheduler still wants that job executed for context X, it submits it again. When the executor sees this new execution coming back after a failure, it applies a RetryStrategy; the default strategy uses an exponential backoff to delay the retries of failing executions. While in this state, Stuck executions are displayed in a special tab of the UI, and they are something you should take care of.
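An exponential backoff strategy can be sketched as follows. This is an illustrative formula (delay doubles with each consecutive failure, up to a cap), not cuttle's exact RetryStrategy implementation; the base delay and cap are assumed parameters.

```scala
import scala.concurrent.duration._

object BackoffSketch {
  // Delay before retrying an execution that has failed `failures` times in a
  // row: base, 2*base, 4*base, ... capped at `cap`.
  def retryDelay(
      failures: Int,
      base: FiniteDuration = 1.minute,
      cap: FiniteDuration = 1.hour
  ): FiniteDuration = {
    val d = base * math.pow(2, (failures - 1).max(0)).toLong
    if (d > cap) cap else d
  }
}
```

So after the first failure the retry comes 1 minute later, after the third failure 4 minutes later, and after many failures the delay stays pinned at the 1-hour cap.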

An execution can also be in the Paused state, which happens when the job itself has been paused. Note that this is a temporary state: eventually the job has to be unpaused so that the executions are triggered; otherwise, more and more paused executions will pile up forever.

Finally, executions can be Finished, in either the Success or Failed state. You can retrieve these past executions in the log of finished executions.

Execution Platforms

External resources are managed in cuttle via ExecutionPlatforms. An execution platform defines the contract for how a resource can be used. Platforms are configured at project bootstrap and usually set limits on resource usage (for example, allowing only 10 external processes to be forked at the same time).

This is necessary because potentially thousands of concurrent executions can happen in cuttle. These executions compete for shared resources via the execution platforms. Usually a platform uses a priority queue to arbitrate access to these shared resources, with the priority based on the SchedulingContext of each execution (so the executions with the highest priority get access to the shared resources first). For example, the TimeSeriesContext defines its Ordering in such a way that the oldest partitions take priority.
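The oldest-partitions-first policy can be sketched with a plain Scala priority queue. The `Ctx` type below is an illustrative stand-in for a scheduling context, not cuttle's TimeSeriesContext.

```scala
import java.time.Instant
import scala.collection.mutable

object PrioritySketch {
  // Stand-in for a scheduling context: just the partition's start instant.
  final case class Ctx(start: Instant)

  // mutable.PriorityQueue dequeues the *largest* element first, so "older is
  // higher priority" means older must compare as larger: reverse natural order.
  implicit val oldestFirst: Ordering[Ctx] =
    Ordering.by[Ctx, Instant](_.start).reverse

  def main(args: Array[String]): Unit = {
    val q = mutable.PriorityQueue(
      Ctx(Instant.parse("2024-01-02T00:00:00Z")),
      Ctx(Instant.parse("2024-01-01T00:00:00Z"))
    )
    // The 2024-01-01 partition comes out first: oldest wins the resource.
    println(q.dequeue())
  }
}
```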

Time series scheduling

The built-in TimeSeriesScheduler executes a workflow of jobs for the time partitions defined in a calendar. Each job defines how it maps to the calendar (for example Hourly or Daily CEST), and the scheduler ensures that at least one execution is created and successfully run for each defined (Job, Period) pair.

In case of failure, the time series scheduler will submit the execution again and again until the partition completes successfully (the delay between retries depends on the retry strategy you have configured).

It is also possible to Backfill successfully completed past partitions, meaning that you want to recompute them anyway. The whole graph, or only part of it, can be backfilled depending on what you need. A priority can be given to a backfill so that the executions it triggers get more or less priority than the day-to-day workload.

Documentation

The API documentation is the main reference for Scala programmers.

For a project example, you can also follow the hands-on introductions.

To run the example application, check out the repository, launch the sbt console in the project (you will also need yarn to compile the UI part), and run the example HelloTimeSeries command.

Usage

The library is cross-built for Scala 2.11 and Scala 2.12.

The core module to use is "com.criteo.cuttle" %% "cuttle" % "0.12.4".

You also need to fetch one Scheduler implementation:

  • TimeSeries: "com.criteo.cuttle" %% "timeseries" % "0.12.4".
  • Cron: "com.criteo.cuttle" %% "cron" % "0.12.4".
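Putting the coordinates above together, a build.sbt fragment might look like this (a sketch assuming sbt and the TimeSeries scheduler; adjust the version to the latest release as needed):

```scala
// build.sbt -- cuttle core plus the TimeSeries scheduler
libraryDependencies ++= Seq(
  "com.criteo.cuttle" %% "cuttle"     % "0.12.4",
  "com.criteo.cuttle" %% "timeseries" % "0.12.4"
)
```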

License

This project is licensed under the Apache 2.0 license.

Copyright

Copyright © Criteo, 2021.
