Open source chef cookbooks.

Facebook Cookbooks Suite

This repo contains attribute-driven-API cookbooks maintained by Facebook. It's a large chunk of what we refer to as our "core cookbooks."

Read our Philosophy.md doc first. It covers how we use Chef and provides important context for everything below.

It is important to note that these cookbooks are built using a very specific model that's very different from most other cookbooks in the community. Specifically:

  • It assumes an environment in which you want to delegate. The cookbooks are designed to be organized in a "least specific to most specific" order in the run-list. The run-list starts with the "core cookbooks" that set up APIs and enforce a base standard, which can be adjusted by the service owners using cookbooks later in the run-list.
  • It assumes a "run from main" type environment. At Facebook we use Grocery Delivery to sync the main branch of our source control repo with all of our Chef servers. Grocery Delivery is not necessary to use these cookbooks, but since they were built with this model in mind, the versions never change (relatedly: we do not use environments).
  • It assumes you have a testing toolset that allows anyone modifying later cookbooks to ensure that their use of the API worked as expected on a live node before committing. For this, we use Taste Tester.

Cookbooks in this repo all begin with fb_ to denote that not only do they use the Facebook Cookbook Model, but that they are maintained in this repo.

Local cookbooks or cookbooks in other repositories that implement this model should not use this prefix, but should reference this document in their docs.

APIs

Unlike other cookbook models, we do not use resources as APIs; we use the node object. Configuration is modeled in arrays and hashes as closely and thinly as possible to the service we are configuring. Ideally, you should only have to read the docs for the service to configure it, not the docs for the cookbook.

For example, if the service we are configuring has a key-value pair configuration file, we will provide a simple hash where keys and values will be directly put into the necessary configuration file.
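As a rough illustration of that idea, here is a plain-Ruby sketch (not code from this repo) of a hash being written out as a key-value file. The attribute names and the `key = value` format are assumptions made for the example; a real cookbook would do this in a template.

```ruby
# Hypothetical attribute hash, standing in for something like
# node['fb_thingy']['config'] (the name is illustrative only).
config = {
  'listen_port' => 8080,
  'log_level' => 'info',
  'max_clients' => 256,
}

# Render each key/value pair as one "key = value" line, sorted by key so
# the output is stable from run to run.
def render_key_value(config)
  config.sort.map { |key, value| "#{key} = #{value}\n" }.join
end

puts render_key_value(config)
```

Because the hash maps one-to-one onto the file, the service's own documentation is enough to know which keys to set.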

There are two reasons we use attribute-driven APIs:

  1. Cascading configuration. Since our cookbooks are ordered least specific (core team that owns Chef) to most specific (the team that owns this machine or service), the team that cares about a specific instance can always override anything. This enables stacking that is not possible in many other models. For example, you can have a run-list that looks like:

    • Core cookbooks (the ones in this repo)
    • Site/Company cookbooks (site-specific settings)
    • Region cookbooks (overrides for a given region/cluster)
    • Application Category cookbooks (webserver, mail server, etc.)
    • Specific Application cookbook ("internal app1 server")

    So let's say that you want a specific amount of shared memory by default, but in some region you know you have different size machines, so you shrink it, but web servers need a further different setting, and then finally some specific internal webserver needs an even more specific setting... this all just works.

    Further, a cookbook can see the value that was set before it modifies things, so the 'webserver' cookbook could look to see what the value was (small or large) before modifying it and adjust it accordingly (so it could be relative to the size of memory that the 'region' cookbook set).

    Using resources for this does not allow this "cascading"; it instead creates "fighting". If you use the cron resource to set up an hourly job, and then someone else creates a cron for that same job but only twice a day, then during each Chef run the cron job gets modified to hourly, then re-modified to twice a day.

  2. Allows for what we refer to as "idempotent systems" instead of "idempotent settings." In other words, if you only manage a specific item in a larger config, and then you stop managing it, it should either revert to a less-specific setting (see #1) or be removed, as necessary.

    For example, let's say you want to set a cron job. If you use the built-in cron resource, and then delete the recipe code that adds that cronjob, that cron isn't removed from your production environment: it stays on all existing nodes, but is absent from any new nodes.

    For this reason we use templates to take over a whole configuration wherever possible. All cron jobs in our fb_cron API are written to /etc/cron.d/fb_crontab. If you delete the lines adding a cronjob, since they are just entries in a hash, when the template is generated on the next Chef run, those crons go away.

    Alternatively, consider a sysctl set by the "site" cookbook, then overwritten by a later cookbook. When that later code is removed, the entry in the hash falls back to being set again by the next-most-specific value (i.e. the "site" cookbook in this case).
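Both properties can be modeled in a few lines of plain Ruby. This is a toy sketch rather than how Chef or these cookbooks are implemented: each "cookbook" is a lambda writing to a shared hash, and the layer names, the sysctl key, and the values are all made up for the example.

```ruby
# Each "cookbook" is modelled as a lambda that writes to a shared node hash.
# Layer names, the sysctl key, and the values are illustrative only.
LAYERS = {
  site: ->(node) { node['sysctl']['kernel.shmmax'] = 1024 },
  region: ->(node) { node['sysctl']['kernel.shmmax'] = 768 },
  # A later layer can read the value set before it and adjust relative to it.
  webserver: ->(node) { node['sysctl']['kernel.shmmax'] *= 2 },
}.freeze

# Run the given layers in order, least specific first, and return the
# resulting settings.
def converge(layer_names)
  node = { 'sysctl' => {} }
  layer_names.each { |name| LAYERS.fetch(name).call(node) }
  node['sysctl']
end

# Render the whole file from the hash, so entries that are no longer set
# simply do not appear.
def render_sysctl(settings)
  settings.sort.map { |key, value| "#{key} = #{value}\n" }.join
end

puts render_sysctl(converge(%i[site region webserver]))
# Dropping the region layer falls back to the site value as the base.
puts render_sysctl(converge(%i[site webserver]))
```

Removing a layer needs no cleanup code: the next run rebuilds the hash from the remaining layers and rewrites the whole file.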

Run-lists

How you formulate your run-lists is up to your site, as long as you follow the basic rule that core cookbooks come first and you order least-specific to most-specific. At Facebook, all of our run-lists are:

recipe[fb_init], recipe[$SERVICE]

Where fb_init is similar to the sample provided in this repo, but with extra "core cookbooks."

We generally think of it this way: fb_init should make you a "Facebook server" and the rest should make you a whatever-kind-of-server-you-are.

Getting started

Grab a copy of the repo, rename fb_init_sample to fb_init, and follow the instructions in its README.md (coordinating guidance is in comments in the default recipe).

Other Guidelines

Modules and classes in cookbooks

It is often useful to factor out logic into a library - especially logic that doesn't create resources. Doing so makes this logic easier to unit test and makes the recipe or resource cleaner.

Our standard is that all cookbooks use the top-level container of module FB, and then create a class for their cookbook under that. For example, fb_fstab creates a class Fstab inside of the module FB. We will refer to this as the cookbook class from here.

We require all cookbooks use this model for consistency.

Since we don't put anything other than other classes inside the top-level object, it's clear that a module is the right choice.

While there is no reason that a cookbook class can't be one designed to be instantiated, more often than not it is simply a collection of class methods and constants (i.e. static data and methods that can then be called both from this cookbook and others).

Below the cookbook class, the author is free to make whatever class or methods they desire.
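A minimal sketch of this layout follows, for a hypothetical fb_thingy cookbook. The constant and the body of determine_packages are assumptions made for illustration, and the node is modeled as a plain Hash so the snippet runs outside Chef.

```ruby
# Top-level container shared by all cookbooks.
module FB
  # Cookbook class for a hypothetical fb_thingy cookbook.
  class Thingy
    # Static data that this cookbook and others can reference.
    BASE_PACKAGES = %w[thingy].freeze

    # Class method: takes the node (here just a Hash) as an argument and
    # returns the list of packages to install.
    def self.determine_packages(node)
      pkgs = BASE_PACKAGES.dup
      pkgs << 'thingy-devel' if node['fb_thingy']['want_devel']
      pkgs
    end
  end
end

p FB::Thingy.determine_packages({ 'fb_thingy' => { 'want_devel' => true } })
```

Because the method is a plain class method with no Chef dependencies, it can be unit tested without loading a recipe.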

When building a complicated Custom Resource, the recommended pattern is to factor out the majority of the logic into a module, inside of the cookbook class, that can be included in the action_class. This allows the logic to be easily unit tested using standard rspec. It is preferred for this module to be in its own library file, and for its name to end in Provider, as in FB::Thing::ThingProvider.

When more than 1 or 2 methods from this module are called from the custom resource itself, it is highly recommended you include it in a Helper class for clarity, as in:

action_class do
  class ThingHelper
    include FB::Thing::ThingProvider
  end
end

In this way, it is clear where methods come from.
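To show why this pattern is easy to test, here is a sketch of what such a provider module might look like when exercised outside Chef. The config_path method is a made-up example of factored-out logic, and ThingHelper is defined at the top level here only because there is no action_class to hold it.

```ruby
module FB
  class Thing
    # Logic factored out of a hypothetical custom resource. It has no Chef
    # dependencies, so it can be exercised with rspec or plain Ruby.
    module ThingProvider
      # Build the config file path for one instance (illustrative logic).
      def config_path(instance)
        "/etc/thing/#{instance}.conf"
      end
    end
  end
end

# Stand-in for the helper class that the action_class would define.
class ThingHelper
  include FB::Thing::ThingProvider
end

puts ThingHelper.new.config_path('main')
```

Inside the resource, calls then read as ThingHelper.new.config_path(...), which makes the origin of the method obvious to the reader.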

Extending the node vs self-contained classes

You may have noticed that some of our cookbooks will extend the node object, while others have self-contained classes that sometimes require the node be passed as a parameter to some methods.

In general, the only time when extending the node is acceptable is when you are simply making a convenience function around using the node object. So, for example, instead of making people do node['platform_family'] == 'debian', there's a node.debian?. This is simply syntactic sugar on top of data entirely in the node.

In all other cases, one should simply have the node be an argument passed on, so as to not pollute the node namespace. For example, a method that looks at the node attributes, but also does a variety of other logic, should be in a cookbook class and take the node as an argument (per standard programming paradigms about clear dependencies).
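The distinction can be sketched as follows. FakeNode is a stand-in written for this example (the real object is Chef's node), and the service names returned by service_name are invented.

```ruby
# Minimal stand-in for Chef's node object, for illustration only.
class FakeNode
  def initialize(attrs)
    @attrs = attrs
  end

  def [](key)
    @attrs[key]
  end

  # Acceptable node extension: pure syntactic sugar over node data.
  def debian?
    self['platform_family'] == 'debian'
  end
end

module FB
  # Hypothetical cookbook class: anything beyond simple sugar lives here
  # and takes the node as an explicit argument.
  class Thingy
    def self.service_name(node)
      node.debian? ? 'thingy-daemon' : 'thingyd'
    end
  end
end

node = FakeNode.new('platform_family' => 'debian')
puts FB::Thingy.service_name(node)
```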

Methods in recipes

Sometimes it is convenient to put a method directly in a recipe. It is strongly preferred to put these methods in the cookbook class; however, there are some cases where methods directly in recipes make sense. The primary example is a small method which creates a resource based on some input to make a set of loops more readable.

Methods in templates

Methods should not be put into templates. In general, as little logic as possible should be in templates. The easiest way to do this is to put the complex logic into methods in your cookbook class and call them from the templates.
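A sketch of that split, using Ruby's standard ERB library directly instead of Chef's template resource. The config_lines method, the `key=value` format, and the template text are assumptions for the example.

```ruby
require 'erb'

module FB
  # Hypothetical cookbook class holding the logic the template needs.
  class Thingy
    # Turn a hash of options into sorted "key=value" strings.
    def self.config_lines(options)
      options.sort.map { |key, value| "#{key}=#{value}" }
    end
  end
end

# The template only loops and prints; it defines no methods of its own.
TEMPLATE = <<~'ERB_SOURCE'
  # This file is managed by Chef
  <% FB::Thingy.config_lines(options).each do |line| -%>
  <%= line %>
  <% end -%>
ERB_SOURCE

# Render the template with the given options hash bound as `options`.
def render_thingy_config(options)
  ERB.new(TEMPLATE, trim_mode: '-').result_with_hash(options: options)
end

puts render_thingy_config({ 'workers' => 4, 'debug' => false })
```

The sorting and formatting can be unit tested through FB::Thingy.config_lines alone, without rendering anything.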

Err on the side of fail

Chef is an ordered system and thus is designed to fail a run if a resource cannot be converged. The reason for this is that if one step in an ordered list cannot be completed, it's likely not safe to do at least some of the following steps. For example, if you were not able to write the correct configuration for a service, then starting it may open up a security vulnerability.

Likewise, the Facebook cookbooks will err on the side of failing if something seems wrong. This is in line with the Chef philosophy we just outlined, and it also reflects this model's assumptions: that code is tested on real systems before being released, using something like taste-tester, and that monitoring is in place to know if your machines are successfully running Chef.

Here are some examples of this philosophy in practice:

  • If a cookbook is included on a platform it does not support, we fail. It might seem like returning in this case is reasonable but there is a good indication the run-list isn't as-expected, so it's a great idea to bail out before this machine is mis-configured.
  • If a configuration was passed in that we don't support, rather than ignore it we fail.
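In plain Ruby, the two checks above might look like the sketch below. The supported platform families, the mode attribute, and the error messages are invented for the example; the node is modeled as a Hash.

```ruby
# Illustrative lists; a real cookbook would define its own.
SUPPORTED_FAMILIES = %w[debian rhel].freeze
SUPPORTED_MODES = %w[client server].freeze

# Raise instead of silently returning when something looks wrong.
def check_supported!(node)
  family = node['platform_family']
  unless SUPPORTED_FAMILIES.include?(family)
    raise "fb_thingy: platform family #{family.inspect} is not supported"
  end

  mode = node['fb_thingy']['mode']
  unless SUPPORTED_MODES.include?(mode)
    raise "fb_thingy: unsupported mode #{mode.inspect}"
  end

  true
end

check_supported!({ 'platform_family' => 'debian', 'fb_thingy' => { 'mode' => 'server' } })
```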

Validation of inputs and whyrun_safe_ruby_blocks

Many cookbooks rely on the service underneath and the testing of the user to be the primary validator of inputs. Is the software we just configured behaving as expected?

However, sometimes it's useful to do our own validation because there are certain configurations we don't want to support, because the software may accept dangerous configurations we want to catch, or because the user could pass us a combination of configurations that is conflicting or impossible to implement.

In this model, however, this must be done at runtime. If your implementation is done primarily inside of an internally-called resource, then this validation can also be done there. However, if your implementation is primarily a recipe and templates, doing the validation in templates is obviously not desirable. This is where whyrun_safe_ruby_blocks come in.

Using an ordinary ruby_block would suffice to have ruby code run at runtime to validate the attributes, however that means that the error would not be caught in whyrun mode. Since this validation does not change the system, it is safe to execute in whyrun mode, and that's why we use whyrun_safe_ruby_blocks: they are run in whyrun mode.

It is worth noting that this is also where you can take input that perhaps was in a structure convenient for users and build out a different data structure that's more convenient to use in your template.
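The kind of logic that would sit inside such a block can be sketched as an ordinary method. The instances structure, the port rules, and the error messages are assumptions for the example; in a cookbook the call would be made from within a whyrun_safe_ruby_block, reading from the node.

```ruby
# Validate a hypothetical node['fb_thingy']['instances'] hash and build a
# flat, sorted list that is easier to loop over in a template.
def validate_and_flatten(instances)
  seen_ports = {}
  instances.sort.map do |name, config|
    port = config['port']
    unless port.is_a?(Integer) && port.between?(1, 65_535)
      raise "fb_thingy: instance #{name} has invalid port #{port.inspect}"
    end
    if seen_ports.key?(port)
      raise "fb_thingy: instances #{seen_ports[port]} and #{name} both use port #{port}"
    end

    seen_ports[port] = name
    { 'name' => name, 'port' => port }
  end
end

p validate_and_flatten({ 'web' => { 'port' => 8080 }, 'admin' => { 'port' => 9090 } })
```

The method changes nothing on the system, which is what makes it safe to run in whyrun mode.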

Implementing runtime-safe APIs

This model intentionally draws the complexity of Chef into the "core cookbooks" (those implementing APIs) so that the user experience of maintaining systems is simple and (usually) requires little more than writing to the node object. However, the trade-off for that simplicity is that implementing the API properly can be quite tricky.

How to do this is a large enough topic that it gets its own document. However, some style guidance is also useful. This section assumes you have read the aforementioned document.

The three main ways that runtime-safety is achieved are lazy, templates, and custom resources. When should you use which?

The template case is fairly straightforward: if you have a template, read the node object from within the template source instead of using variables on the template resource, and all data read is inherently runtime safe since templates run at runtime.

But what about lazy vs custom resources? For example, in a recipe you might do:

package 'thingy packages' do
  package_name lazy {
    pkgs = ['thingy']
    if node['fb_thingy']['want_devel']
      pkgs << 'thingy-devel'
    end
    pkgs
  }
  action :upgrade
end

Whereas inside of a custom resource you could instead do:

pkgs = ['thingy']
if node['fb_thingy']['want_devel']
  pkgs << 'thingy-devel'
end

package pkgs do
  action :upgrade
end

Which one is better? There's no exact answer. Both work, so it's a style consideration. In general, there are two times when we suggest a custom resource:

The first is when you need to loop over the node in order to even know what resources to create. Since this isn't possible in a recipe (well, technically it's possible with some ugliness, but by and large not using the standard DSL), this must go into a custom resource. An example might be:

# This MUST be inside of a custom resource!
node['fb_thingy']['instances'].each do |name, config|
  template "/etc/thingy/#{name}.conf" do
    owner 'root'
    group 'root'
    mode '0644'
    variables({:config => config})
  end
end

The second is when you're using lazy on the majority of the resources in your recipe. If your recipe has 15 resources and you've had to pepper all of them with lazy, it's a bit cleaner to make a custom resource that you call in your recipe.

It's important here to reiterate: we're not referring to using a Custom Resource as an API, but simply making an internal custom resource, called only by your own recipe, as a way to simplify runtime safety.

Outside of these two cases, you should default to implementations inside of recipes. This is for a few reasons.

The first reason is that dropping entire implementations in custom resources leads to confusion and sets a bad precedent for how runtime-safety works. For example, consider the custom resource code we saw earlier where you assemble the package list in "naked" ruby:

pkgs = ['thingy']
if node['fb_thingy']['want_devel']
  pkgs << 'thingy-devel'
end

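This code works fine in a resource, but serves as a bad reference for others, since it absolutely won't work in a recipe (even though it'll run): in a recipe it executes at compile time, before later cookbooks have had a chance to set the attribute.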

The second reason is that quite often implementations need both compile-time and runtime code, and by blindly dropping the implementation into a custom resource, you can often miss this and create bugs like this:

# only safe because we're in a custom resource
packages = FB::Thingy.determine_packages(node)

package packages do
  action :upgrade
end

if node['fb_thingy']['want_cron']
  node.default['fb_cron']['jobs']['thingy_runner'] = {
    'time' => '* * * * *',
    'command' => '/usr/bin/thingy --quiet',
  }
end

service 'thingy' do
  action [:enable, :start]
end

Note here that while this code all seems reasonable in a custom resource (if statements are runtime safe when inside of a custom resource), that cronjob will never get picked up, because you're using an API at runtime, but APIs must be called at compile time and consumed at runtime. In reality, this needs to be in the recipe in order to work, and should look like this, in a recipe:

package 'thingy packages' do
  package_name lazy { FB::Thingy.determine_packages(node) }
  action :upgrade
end

node.default['fb_cron']['jobs']['thingy_runner'] = {
  'only_if' => proc { node['fb_thingy']['want_cron'] },
  'time' => '* * * * *',
  'command' => '/usr/bin/thingy --quiet',
}

service 'thingy' do
  action [:enable, :start]
end

In general, always start your implementation as a recipe and then escalate to Custom Resources where necessary.
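The cron bug described above comes down to ordering, which a toy model can make concrete. The sketch below is a deliberate simplification and not Chef's actual implementation: "compile" builds an ordered list of resources as lambdas, and "converge" runs them in order. The job name mirrors the example above; everything else is invented.

```ruby
# Toy model of a Chef run. write_at selects whether the later cookbook
# writes to the node at :compile time or at :runtime.
def chef_run(write_at)
  node = { 'jobs' => {} }
  resources = []
  crontab = nil

  # fb_cron comes first in the run-list. Its template reads the node only
  # when it is converged.
  resources << -> { crontab = node['jobs'].keys.sort }

  # A later cookbook registers a job, either in recipe code (compile time)
  # or inside a resource that executes during converge (runtime).
  if write_at == :compile
    node['jobs']['thingy_runner'] = '* * * * *'
  else
    resources << -> { node['jobs']['thingy_runner'] = '* * * * *' }
  end

  resources.each(&:call) # converge phase
  crontab
end

p chef_run(:compile)
p chef_run(:runtime)
```

In the runtime case the write lands after the crontab has already been rendered, so the job is missing from the file even though the node contains it by the end of the run.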

Debugging kitchen runs

You can set up kitchen using the same commands as in .github/workflows/ci.yml, but once Chef runs you won't have access to connect, so modify fb_sudo/attributes/default.rb and uncomment the kitchen block.

Then you can do bundle exec kitchen login <INSTANCE> after a failed run, and sudo will be passwordless so you can debug.

License

See the LICENSE file in this directory
