Datadog monitors/dashboards/slos as code, avoid chaotic management via UI

Manage Datadog Monitors / Dashboards / Slos as code

  • DRY, searchable, audited, documented
  • Changes are PR reviewed and applied on merge
  • Updating shows diff before applying
  • Automated import of existing resources
  • Resources are grouped into projects that belong to teams and inherit tags
  • No copy-pasting of ids to create new resources
  • Automated cleanup when removing code
  • Helpers for automating common tasks

Applying changes

Example code

# teams/foo.rb
module Teams
  class Foo < Kennel::Models::Team
    defaults(mention: -> { "@slack-my-team" })
  end
end

# projects/bar.rb
class Bar < Kennel::Models::Project
  defaults(
    team: -> { Teams::Foo.new }, # use mention and tags from the team
    parts: -> {
      [
        Kennel::Models::Monitor.new(
          self, # the current project
          type: -> { "query alert" },
          kennel_id: -> { "load-too-high" }, # pick a unique name
          name: -> { "Foobar Load too high" }, # nice descriptive name that will show up in alerts and emails
          message: -> {
            <<~TEXT
              This is bad!
              #{super()} # inserts mention from team
            TEXT
          },
          query: -> { "avg(last_5m):avg:system.load.5{hostgroup:api} by {pod} > #{critical}" },
          critical: -> { 20 }
        )
      ]
    }
  )
end

Installation

  • create a new private kennel repo for your organization (do not fork this repo)
  • use the template folder as starting point:
    git clone git@github.com:your-org/kennel.git
    git clone git@github.com:grosser/kennel.git seed
    mv seed/template/* kennel/
    cd kennel && git add . && git commit -m 'initial'
  • add basic projects and teams so others can copy-paste to get started
  • set up a CI build for your repo (Travis and GitHub Actions supported)
  • uncomment the .travis.yml section for Datadog updates on merge (TODO: example setup for GitHub Actions)
  • follow Setup in your repo's Readme.md

Structure

  • projects/ monitors/dashboards/etc scoped by project
  • teams/ team definitions
  • parts/ monitors/dashboards/etc that are used by multiple projects
  • generated/ projects as json, to show current state and proposed changes in PRs

About the models

Kennel provides several classes which act as models for different purposes:

  • Kennel::Models::Dashboard, Kennel::Models::Monitor, Kennel::Models::Slo, Kennel::Models::SyntheticTest; these models represent the various Datadog objects
  • Kennel::Models::Project; a container for a collection of Datadog objects
  • Kennel::Models::Team; provides defaults and values (e.g. tags, mentions) for the other models.

After loading all the *.rb files under projects/, Kennel's starting point is to find all the subclasses of Kennel::Models::Project, and for each one, create an instance of that subclass (via .new) and then call #parts on that instance. parts should return a collection of the Datadog-objects (Dashboard / Monitor / etc).
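The discovery step described above can be sketched in plain Ruby. This is only an illustration of the idea, not Kennel's actual implementation: SketchProject, Billing, Search and the inherited hook are hypothetical stand-ins for Kennel::Models::Project and its subclasses, and the parts are plain strings instead of Datadog objects.

```ruby
# Hypothetical stand-in for Kennel::Models::Project; not Kennel's real code.
class SketchProject
  @known = []

  class << self
    attr_reader :known

    # Ruby calls this hook whenever a class inherits from SketchProject,
    # so each subclass is remembered as its file gets loaded.
    def inherited(subclass)
      super
      SketchProject.known << subclass
    end
  end

  # Subclasses override this to return their monitors/dashboards/etc.
  def parts
    []
  end
end

class Billing < SketchProject
  def parts
    ["billing-monitor", "billing-dashboard"]
  end
end

class Search < SketchProject
  def parts
    ["search-monitor"]
  end
end

# Instantiate every discovered project and collect what #parts returns.
ALL_PARTS = SketchProject.known.flat_map { |klass| klass.new.parts }
puts ALL_PARTS.inspect
```

The point of the sketch is that a project only needs to exist as a subclass in a loaded file; nothing has to register it by hand.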

Model Settings

Each of the models defines various settings; for example, a Monitor has name, message, type, query, tags, and many more.

When defining a subclass of a model, one can use defaults to provide default values for those settings:

class MyMonitor < Kennel::Models::Monitor
  defaults(
    name: "Error rate",
    type: "query alert",
    critical: 5.0,
    query: -> {
      "some datadog metric expression > #{critical}"
    },
    # ...
  )
end

This is equivalent to defining instance methods of those names, which return those values:

class MyMonitor < Kennel::Models::Monitor
  def name
    "Error rate"
  end

  def type
    "query alert"
  end

  def critical
    5.0
  end

  def query
    "some datadog metric expression > #{critical}"
  end
end

except that defaults will complain if you try to use a setting name which doesn't exist. Note also that you can use either plain values (critical: 5.0), or procs (query: -> { ... }). Using a plain value is equivalent to using a proc which returns that same value; use whichever suits you best.
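To make the behaviour of defaults more concrete, here is a simplified sketch of how such a macro could work. It is an assumption-laden illustration, not Kennel's source: SketchModel, SketchErrorRate and the SETTINGS list are hypothetical names, and the real setting names and error messages may differ.

```ruby
# Simplified illustration of a `defaults` macro; Kennel's real code may differ.
class SketchModel
  # Hypothetical list of allowed setting names.
  SETTINGS = [:name, :type, :critical, :query].freeze

  def self.defaults(options)
    options.each do |setting, value|
      unless SETTINGS.include?(setting)
        raise ArgumentError, "Unsupported setting #{setting.inspect}"
      end

      if value.respond_to?(:call)
        # Procs become the method body, evaluated with self = the instance.
        define_method(setting, &value)
      else
        # Plain values are wrapped in a method that returns them.
        define_method(setting) { value }
      end
    end
  end
end

class SketchErrorRate < SketchModel
  defaults(
    name: "Error rate",
    critical: 5.0,
    query: -> { "some datadog metric expression > #{critical}" }
  )
end

monitor = SketchErrorRate.new
puts monitor.query # the proc can call the plain-value setting `critical`

begin
  SketchErrorRate.defaults(colour: "red") # not a known setting
rescue ArgumentError => e
  puts e.message
end
```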

When you instantiate a model class, you can pass settings in the constructor, after the project:

project = Kennel::Models::Project.new
my_monitor = MyMonitor.new(
  project,
  critical: 10.0,
  message: -> {
    <<~MESSAGE
      Something bad is happening and you should be worried.

      #{super()}
    MESSAGE
  },
)

This works just like defaults (it checks the setting names, and it accepts either plain values or procs), but it applies just to this instance of the class, rather than to the class as a whole (i.e. it defines singleton methods, rather than instance methods).
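The difference between class-wide defaults and per-instance settings can be sketched with singleton methods in plain Ruby. SketchAlert is a hypothetical class used only for illustration; the setting-name check is left out to keep the sketch short.

```ruby
# Hypothetical stand-in for a model class; not Kennel's real code.
class SketchAlert
  def initialize(settings = {})
    settings.each do |setting, value|
      # Accept plain values or procs, as described above.
      body = value.respond_to?(:call) ? value : -> { value }
      # Singleton methods live on this one object, in front of the class.
      define_singleton_method(setting, &body)
    end
  end

  def critical
    5.0
  end

  def message
    "@slack-my-team"
  end
end

default_alert = SketchAlert.new
custom_alert = SketchAlert.new(
  critical: 10.0,
  # super() reaches the class-level definition of #message.
  message: -> { "Something bad is happening.\n#{super()}" }
)

puts default_alert.critical # the class-level value is untouched
puts custom_alert.critical
puts custom_alert.message
```

Because the override sits on a single object, other instances of the same class keep the class-level values, and super() inside the proc still finds them.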

Most of the examples in this Readme use the proc syntax (critical: -> { 5.0 }) but for simple constants you may prefer to use the plain syntax (critical: 5.0).

Workflows

Adding a team

  • mention is used for all team monitors via super()
  • renotify_interval is used for all team monitors (defaults to 0 / off)
  • tags is used for all team monitors/dashboards (defaults to team:<team-name>)
# teams/my_team.rb
module Teams
  class MyTeam < Kennel::Models::Team
    defaults(
      mention: -> { "@slack-my-team" }
    )
  end
end

Adding a new monitor

Updating an existing monitor

  • use datadog monitor UI to find a monitor
  • get the id from the url
  • run URL='https://app.datadoghq.com/monitors/123' bundle exec rake kennel:import and copy the output
  • import task also works with SLO alerts, e.g. URL='https://app.datadoghq.com/slo/edit/123abc456def123/alerts/789' bundle exec rake kennel:import
  • find or create a project in projects/
  • add the monitor to parts: [ list, for example:
# projects/my_project.rb
class MyProject < Kennel::Models::Project
  defaults(
    team: -> { Teams::MyTeam.new }, # use existing team or create new one in teams/
    parts: -> {
      [
        Kennel::Models::Monitor.new(
          self,
          id: -> { 123456 }, # id from datadog url, not necessary when creating a new monitor
          type: -> { "query alert" },
          kennel_id: -> { "load-too-high" }, # make up a unique name
          name: -> { "Foobar Load too high" }, # nice descriptive name that will show up in alerts and emails
          message: -> {
            # Explain what behavior to expect and how to fix the cause
            # Use #{super()} to add team notifications.
            <<~TEXT
              Foobar will be slow and that could cause Barfoo to go down.
              Add capacity or debug why it is suddenly slow.
              #{super()}
            TEXT
          },
          query: -> { "avg(last_5m):avg:system.load.5{hostgroup:api} by {pod} > #{critical}" }, # replace actual value with #{critical} to keep them in sync
          critical: -> { 20 }
        )
      ]
    }
  )
end
  • run PROJECT=my_project bundle exec rake plan; an Update to the existing monitor should be shown (not Create / Delete)
  • alternatively: bundle exec rake generate to only locally update the generated json files
  • review changes then git commit
  • make a PR ... get reviewed ... merge
  • datadog is updated by CI

Deleting

Remove the code that created the resource. The next update will delete it (see above for PR workflow).

Adding a new dashboard

Updating an existing dashboard

  • go to the Datadog dashboard UI and find the dashboard to update (or click on New Dashboard to create one first)
  • get the id from the url
  • run URL='https://app.datadoghq.com/dashboard/bet-foo-bar' bundle exec rake kennel:import and copy the output
  • find or create a project in projects/
  • add a dashboard to parts: [ list, for example:
class MyProject < Kennel::Models::Project
  defaults(
    team: -> { Teams::MyTeam.new }, # use existing team or create new one in teams/
    parts: -> {
      [
        Kennel::Models::Dashboard.new(
          self,
          id: -> { "abc-def-ghi" }, # id from datadog url, not needed when creating a new dashboard
          title: -> { "My Dashboard" },
          description: -> { "Overview of foobar" },
          template_variables: -> { ["environment"] }, # see https://docs.datadoghq.com/api/?lang=ruby#timeboards
          kennel_id: -> { "overview-dashboard" }, # make up a unique name
          layout_type: -> { "ordered" },
          definitions: -> {
            [ # An array of arrays, each one is a graph in the dashboard, alternatively a hash for finer control
              [
                # title, viz, type, query, edit an existing graph and see the json definition
                "Graph name", "timeseries", "area", "sum:mystats.foobar{$environment}"
              ],
              [
                # queries can be an Array as well, this will generate multiple requests
                # for a single graph
                "Graph name", "timeseries", "area", ["sum:mystats.foobar{$environment}", "sum:mystats.success{$environment}"],
                # add events too ...
                events: [{q: "tags:foobar,deploy", tags_execution: "and"}]
              ]
            ]
          }
        )
      ]
    }
  )
end
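The array shorthand used in definitions above can be read as positional arguments: title, viz, type, queries, and an optional trailing hash of extra options. The sketch below shows one plausible way such an entry could be expanded into a widget hash. The helper name sketch_expand_definition and the output shape (definition / requests / q / display_type) are assumptions for illustration, not something this Readme specifies.

```ruby
# Expand [title, viz, display type, queries, options] into a widget hash.
# The output shape is an assumption for illustration only.
def sketch_expand_definition(title, viz, display_type, queries, options = {})
  # A single query string and an array of queries are treated the same way.
  requests = Array(queries).map { |q| { q: q, display_type: display_type } }
  { definition: { title: title, type: viz, requests: requests }.merge(options) }
end

definitions = [
  ["Graph name", "timeseries", "area", "sum:mystats.foobar{$environment}"],
  ["Graph name", "timeseries", "area",
   ["sum:mystats.foobar{$environment}", "sum:mystats.success{$environment}"],
   { events: [{ q: "tags:foobar,deploy", tags_execution: "and" }] }]
]

WIDGETS = definitions.map { |definition| sketch_expand_definition(*definition) }
WIDGETS.each { |widget| puts widget.inspect }
```

The second entry produces two requests for a single graph, matching the comment in the example above about queries given as an Array.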

Updating existing resources with id

Setting id makes kennel take over a manually created Datadog resource. When a resource was created manually only to be imported, it is best to remove the id and delete the manually created resource.

When an id is set and the original resource is deleted, kennel will fail to update; removing the id will cause kennel to create a new resource in Datadog.

Organizing projects with many resources

When project files get too long, this structure can keep things bite-sized.

# projects/project_a/base.rb
module ProjectA
  class Base < Kennel::Models::Project
    defaults(
      kennel_id: -> { "project_a" },
      parts: -> {
        [
          Monitors::FooAlert.new(self),
          ...
        ]
      }
      ...

# projects/project_a/monitors/foo_alert.rb
module ProjectA
  module Monitors
    class FooAlert < Kennel::Models::Monitor
      ...

Updating a single project or resource

  • Use PROJECT=<kennel_id> for single project:

    Use the project's kennel_id (or, if none is set, the snake_case of the class name including modules) to refer to the project. For example, for class ProjectA use PROJECT=project_a, but for Foo::ProjectA use foo_project_a.

  • Use TRACKING_ID=<project-kennel_id>:<resource-kennel_id> for single resource:

    Use the project's kennel_id and the resource's kennel_id; for example, class ProjectA and FooAlert would give project_a:foo_alert.
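The naming rule above can be sketched as a small conversion helper. The helper names are hypothetical and the regular expression is a simplified assumption that covers the examples in this section; Kennel's own conversion may handle more cases.

```ruby
# Sketch of the naming rule described above; the helper names are hypothetical.
def sketch_snake_case(class_name)
  class_name
    .gsub("::", "_")                   # Foo::ProjectA -> Foo_ProjectA
    .gsub(/([a-z\d])([A-Z])/, '\1_\2') # ProjectA -> Project_A
    .downcase
end

# Combine project and resource names into <project>:<resource>.
def sketch_tracking_id(project_class_name, resource_class_name)
  "#{sketch_snake_case(project_class_name)}:#{sketch_snake_case(resource_class_name)}"
end

puts sketch_snake_case("ProjectA")
puts sketch_snake_case("Foo::ProjectA")
puts sketch_tracking_id("ProjectA", "FooAlert")
```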

Skipping validations

Some validations might be too strict for your use case or just wrong. Please open an issue, and use the validate: -> { false } option to unblock yourself in the meantime.

Linking resources with kennel_id

Link resources with their kennel_id in the format project kennel_id + : + resource kennel_id. Use this to create dependent resources like monitor + slos, so they can be created in a single update and can be re-created if any of them is deleted.

Resource  | Type        | Syntax
Dashboard | uptime      | monitor: {id: "foo:bar"}
Dashboard | alert_graph | alert_id: "foo:bar"
Dashboard | slo         | slo_id: "foo:bar"
Monitor   | composite   | query: -> { "%{foo:bar} && %{foo:baz}" }
Monitor   | slo alert   | query: -> { "error_budget(\"%{foo:bar}\").over(\"7d\") > 123.0" }
Slo       | monitor     | monitor_ids: -> { ["foo:bar"] }
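To illustrate what the %{project:resource} placeholders stand for, the following sketch replaces them with ids from a lookup table. It is only a model of the idea: KNOWN_IDS, its values, and sketch_resolve_links are hypothetical, and how Kennel actually resolves links during an update is not described here.

```ruby
# Hypothetical lookup table from tracking id to Datadog id; values are made up.
KNOWN_IDS = { "foo:bar" => 123, "foo:baz" => 456 }.freeze

# Replace every %{project:resource} placeholder with the matching id.
def sketch_resolve_links(query, ids)
  query.gsub(/%\{(.+?)\}/) do
    tracking_id = Regexp.last_match(1)
    # Fail loudly when a linked resource is unknown instead of guessing.
    ids.fetch(tracking_id) { raise KeyError, "Unable to find #{tracking_id}" }.to_s
  end
end

puts sketch_resolve_links("%{foo:bar} && %{foo:baz}", KNOWN_IDS)
```

Writing the tracking id instead of a numeric id is what allows dependent resources to be defined before either of them exists in Datadog.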

Debugging changes locally

  • rebase on updated master to not undo other changes
  • figure out project name by converting the class name to snake_case
  • run PROJECT=foo bundle exec rake kennel:update_datadog to test changes for a single project (monitors: remove mentions while debugging to avoid alert spam)
    • use PROJECT=foo,bar,... for multiple projects

Reuse

Add to parts/<folder>.

module Monitors
  class LoadTooHigh < Kennel::Models::Monitor
    defaults(
      name: -> { "#{project.name} load too high" },
      message: -> { "Shut it down!" },
      type: -> { "query alert" },
      query: -> { "avg(last_5m):avg:system.load.5{hostgroup:#{project.kennel_id}} by {pod} > #{critical}" }
    )
  end
end

Reuse it in multiple projects.

class Database < Kennel::Models::Project
  defaults(
    team: -> { Kennel::Models::Team.new(mention: -> { '@slack-foo' }, kennel_id: -> { 'foo' }) },
    parts: -> { [Monitors::LoadTooHigh.new(self, critical: -> { 13 })] }
  )
end

Helpers

Listing un-muted alerts

Run rake kennel:alerts TAG=service:my-service to see all un-muted alerts for a given datadog monitor tag.

Validating mentions work

rake kennel:validate_mentions should run as part of CI

Grepping through all of datadog

rake kennel:dump > tmp/dump
cat tmp/dump | grep foo

focus on a single type: TYPE=monitors

Show full resources or just their urls by pattern:

rake kennel:dump_grep DUMP=tmp/dump PATTERN=foo URLS=true
https://foo.datadog.com/dashboard/123
https://foo.datadog.com/monitor/123

Find all monitors with No-Data

rake kennel:nodata TAG=team:foo

Finding the tracking id of a resource

When trying to link resources together, this avoids having to go through datadog UI.

rake kennel:tracking_id ID=123 RESOURCE=monitor

Development

Benchmarking

  • Setting FORCE_GET_CACHE=true will cache all get requests, which makes benchmarking improvements more reliable.
  • Setting STORE=false will make rake plan not update the files on disk and save a bit of time

Integration testing

rake play
cd template
rake plan

Then make changes to play around. Do not commit the changes, and make sure to revert by running rake kennel:update_datadog after deleting everything.

To make changes via the UI, make a new free Datadog account and use its credentials instead.

Author

Michael Grosser
[email protected]
License: MIT
