  • Language: Java
  • License: Apache License 2.0

Workload Automation System

Digdag

Documentation

Please check digdag.io and docs.digdag.io for the installation guide and user manual.

The REST API documentation is available at docs.digdag.io/api.

Release Notes

The list of release notes is here.

Development

Prerequisites

  • JDK 8
  • Node.js 12.x
  • Python 3
    • sphinx
    • sphinx_rtd_theme
    • recommonmark

Installing Node.js using nodebrew:

$ curl -L git.io/nodebrew | perl - setup
$ echo 'export PATH=$HOME/.nodebrew/current/bin:$PATH' >> ~/.bashrc
$ source ~/.bashrc
$ nodebrew install-binary v12.x
$ nodebrew use v12.x

Installing Node.js using Homebrew on Mac OS X:

$ brew install node

Running tests

$ ./gradlew check

A test coverage report is generated at digdag-*/build/reports/jacoco/test/html/index.html. A FindBugs report is generated at digdag-*/build/reports/findbugs/main.html.

$ CI_ACCEPTANCE_TEST=true ./gradlew digdag-tests:test --info --tests acceptance.BuiltInVariablesIT

To run individual tests in the digdag-tests subproject locally, use Gradle's --tests option as shown above. The environment variable CI_ACCEPTANCE_TEST=true is required to run digdag-tests.

Testing with PostgreSQL

Tests use an in-memory H2 database by default. To use PostgreSQL, set the following environment variable:

$ export DIGDAG_TEST_POSTGRESQL="$(cat config/test_postgresql.properties)"

Building CLI executables

$ ./gradlew cli
$ ./gradlew cli -PwithoutUi  # build without integrated UI

If the command fails while building the UI because of errors from the node command, try adding the -PwithoutUi argument to exclude the UI from the package.

The command creates an executable in pkg/, e.g. pkg/digdag-$VERSION.jar.

Develop digdag-ui

The Node.js development server is useful because it automatically reloads changes to the digdag-ui source code.

First, add the following lines to ~/.config/digdag/config and start the Digdag server:

server.http.headers.access-control-allow-origin = http://localhost:9000
server.http.headers.access-control-allow-headers = origin, content-type, accept, authorization, x-td-account-override, x-xsrf-token, cookie
server.http.headers.access-control-allow-credentials = true
server.http.headers.access-control-allow-methods = GET, POST, PUT, DELETE, OPTIONS, HEAD
server.http.headers.access-control-max-age = 1209600

Then, start the digdag-ui development server:

$ cd digdag-ui/
$ npm install
$ npm run dev    # starts dev server on http://localhost:9000/

Updating REST API document

Run this command to update the REST API document file at digdag-docs/src/api/swagger.yaml.

$ ./gradlew swaggerYaml  # dump swagger.yaml file

Use the --enable-swagger option to check the current Digdag REST API.

$ ./gradlew cli
$ ./pkg/digdag-<current version>.jar server --memory --enable-swagger # Run server with --enable-swagger option

$ docker run -dp 8080:8080 swaggerapi/swagger-ui # Run Swagger-UI on different console
$ open http://localhost:8080/?url=http://localhost:65432/api/swagger.json # Open api/swagger.json on Swagger-UI

Updating documents

Documents are in the digdag-docs/src directory and are built using Sphinx.

The website is hosted on www.digdag.io using GitHub Pages. Pages are built by the deployment step of circle.yml and automatically pushed to the gh-pages branch of the digdag-docs repository.

To build the pages and check them locally, follow these instructions.

Create a Python virtual environment and install the required Python libraries, including Sphinx.

$ python3 -m venv .venv
$ source .venv/bin/activate
(.venv)$ pip install -r digdag-docs/requirements.txt -c digdag-docs/constraints.txt

After installing the Python libraries, build the pages by running the following command:

(.venv)$ ./gradlew site

This might not always update all necessary files (Sphinx doesn't manage update dependencies well). In this case, run ./gradlew clean first.

It builds index.html at digdag-docs/build/html/index.html.

Development on IDEs

IntelliJ IDEA

Digdag uses the Java annotation processor org.immutables:value. The combination of Java annotation processing and Gradle sometimes causes trouble in IntelliJ IDEA. In Digdag's case, you may run into compile errors such as cannot find symbol: class ImmutableRestWorkflowDefinitionCollection. We recommend the following to avoid those compile errors if you want to develop Digdag on the IDE.

  1. To have IntelliJ fully integrated with the existing Gradle build configuration, enable the option Delegate IDE build/run actions to gradle.

Releasing a new version

This is for committers only.

Prerequisite: Sonatype OSSRH

You need a Sonatype OSSRH account. Configure it in your ~/.gradle/gradle.properties.

ossrhUsername=(your Sonatype OSSRH username)
ossrhPassword=(your Sonatype OSSRH password)

Prerequisite: PGP signatures

You need a PGP key to sign the artifacts released to Maven Central, and Gradle must be configured to sign with that key. Configure it in your ~/.gradle/gradle.properties.

signing.gnupg.executable=gpg
signing.gnupg.useLegacyGpg=false
signing.gnupg.keyName=(the last 8 symbols of your keyId)
signing.gnupg.passphrase=(the passphrase used to protect your private key)
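
The keyName value above is described as the last 8 symbols of your key ID. As a minimal sketch, the helper below trims a longer key ID down to that form. The function name short_key_id and the sample key ID are hypothetical, used only for illustration; substitute the ID of your own signing key.

```shell
#!/bin/sh
# Hypothetical helper (not part of Digdag): print the last 8 characters
# of a key ID, the form expected by signing.gnupg.keyName.
short_key_id() {
  # printf '%s' emits the ID without a trailing newline, so tail -c 8
  # returns exactly the last 8 characters of the ID itself.
  printf '%s\n' "$(printf '%s' "$1" | tail -c 8)"
}

# Sample 16-character key ID (made up for this example).
short_key_id 3AA5C34371567BD2   # prints 71567BD2
```

If you are unsure of your key ID, gpg --list-secret-keys --keyid-format long should list it.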

Release procedure

As mentioned in the prerequisites, this procedure must be run with JDK 8.

  1. run git pull upstream master --tags.
  2. run ./gradlew setVersion -Pto=<version> command.
  3. write release notes to releases/release-<version>.rst file. It must include at least version (the first line) and release date (the last line).
  4. run ./gradlew clean cli site check releaseCheck.
  5. make a release branch with git checkout -b release_v<version> and commit.
  6. push the release branch to origin and create a PR.
  7. after the PR is merged to master, check out master and pull the latest upstream/master.
  8. run ./gradlew clean cli site check releaseCheck again.
  9. if it succeeded, run ./gradlew release.
  10. create a tag git tag -a v<version> and push git push upstream v<version>
  11. create a release in GitHub releases.
  12. upload pkg/digdag-<version>.jar to the release
  13. a few minutes later, run digdag selfupdate and confirm the version.

If the major version is incremented, also update version = and release = in digdag-docs/src/conf.py.

If you are an expert, you can skip steps 5 to 7 and update the master branch directly.
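
The command-line parts of the procedure above can be collected into a dry-run sketch such as the following. It only prints the commands for a given version and executes nothing; the manual steps (writing release notes, the PR, the GitHub release and the jar upload) are left out. The function name release_commands and the version 0.10.5 are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the commands from the release procedure in order.
# Nothing is executed here; run the printed lines by hand at each step.
release_commands() {
  version="$1"
  cat <<EOF
git pull upstream master --tags
./gradlew setVersion -Pto=${version}
./gradlew clean cli site check releaseCheck
git checkout -b release_v${version}
./gradlew clean cli site check releaseCheck
./gradlew release
git tag -a v${version}
git push upstream v${version}
digdag selfupdate
EOF
}

# Example with a made-up version number.
release_commands 0.10.5
```

Printing rather than executing keeps the manual checkpoints in place, such as waiting for the PR to be merged before the second releaseCheck run.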

Post-process of new release

The following steps are also needed after a new version has been released.

  1. create the next snapshot version by running ./gradlew setVersion -Pto=<next-version>-SNAPSHOT.
  2. push to master.
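
As a small sketch of step 1, the helper below derives a snapshot version string from the version just released. It assumes the next version is obtained by incrementing the last numeric component, which this document does not specify; choose <next-version> according to the project's actual policy. The function name next_snapshot_version and the version 0.10.5 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build "<next-version>-SNAPSHOT" from a released
# version, ASSUMING the next version bumps the last numeric component.
next_snapshot_version() {
  released="$1"
  prefix="${released%.*}"    # everything before the last dot, e.g. 0.10
  patch="${released##*.}"    # last component, e.g. 5 (plain decimal expected)
  printf '%s.%s-SNAPSHOT\n' "$prefix" "$((patch + 1))"
}

# Print the setVersion command for a made-up released version.
echo "./gradlew setVersion -Pto=$(next_snapshot_version 0.10.5)"
# prints ./gradlew setVersion -Pto=0.10.6-SNAPSHOT
```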

Releasing a SNAPSHOT version

$ ./gradlew releaseSnapshot

Note: Snapshot releases are not currently supported.
