  • Stars: 11,302
  • Rank: 2,795 (Top 0.06 %)
  • Language: Go
  • License: MIT License
  • Created about 8 years ago
  • Updated 9 months ago

GitHub's Online Schema-migration Tool for MySQL

gh-ost

GitHub's online schema migration for MySQL

gh-ost is a triggerless online schema migration solution for MySQL. It is testable and provides pausability, dynamic control/reconfiguration, auditing, and many operational perks.

gh-ost produces a light workload on the master throughout the migration, decoupled from the existing workload on the migrated table.

It has been designed based on years of experience with existing solutions, and changes the paradigm of table migrations.

How?

All existing online-schema-change tools operate in a similar manner: they create a ghost table in the likeness of your original table, migrate that table while it is empty, slowly and incrementally copy data from your original table to the ghost table, and meanwhile propagate ongoing changes (any INSERT, DELETE, UPDATE applied to your table) to the ghost table. Finally, at the right time, they replace your original table with the ghost table.

gh-ost uses the same pattern. However, it differs from all existing tools by not using triggers. We have found triggers to be the source of many limitations and risks.

Instead, gh-ost uses the binary log stream to capture table changes, and asynchronously applies them onto the ghost table. gh-ost takes upon itself some tasks that other tools leave for the database to perform. As a result, gh-ost has greater control over the migration process: it can truly suspend the migration, and it can truly decouple the migration's write load from the master's workload.

In addition, it offers many operational perks that make it safer, trustworthy and fun to use.
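
The pattern described above (an incremental row copy combined with asynchronous replay of ongoing changes) can be sketched as a toy in-memory model. This is not gh-ost's implementation: the Table, Event, copyChunk, and migrate names are hypothetical, a slice stands in for the binary log stream, keys are assumed to be positive integers, and the copy skips rows already present in the ghost table so that replayed changes take precedence.

```go
package main

import (
	"fmt"
	"sort"
)

// Op is the kind of change captured from the (simulated) change stream.
type Op int

const (
	Upsert Op = iota // stands in for INSERT and UPDATE
	Delete
)

// Event is one row change. All names here are hypothetical.
type Event struct {
	Op    Op
	Key   int
	Value string
}

// Table is a toy table keyed by a positive integer primary key.
type Table map[int]string

// applyEvent replays a single change onto a table.
func applyEvent(t Table, e Event) {
	switch e.Op {
	case Upsert:
		t[e.Key] = e.Value
	case Delete:
		delete(t, e.Key)
	}
}

// copyChunk copies up to size rows with key > after from src to dst,
// skipping rows dst already has. It returns the last key visited and
// whether more rows may remain.
func copyChunk(src, dst Table, after, size int) (int, bool) {
	keys := make([]int, 0, len(src))
	for k := range src {
		if k > after {
			keys = append(keys, k)
		}
	}
	sort.Ints(keys)
	if len(keys) > size {
		keys = keys[:size]
	}
	for _, k := range keys {
		if _, exists := dst[k]; !exists {
			dst[k] = src[k]
		}
		after = k
	}
	return after, len(keys) == size
}

// migrate interleaves the ongoing workload, the chunked row copy, and a
// lagging replay of the change log, then drains the remaining backlog.
func migrate(original Table, workload []Event, chunkSize int) Table {
	ghost := Table{}
	var changeLog []Event // stands in for the binary log stream
	applied := 0
	lastKey, copying := 0, true
	for step := 0; copying || step < len(workload); step++ {
		if step < len(workload) {
			applyEvent(original, workload[step]) // application traffic
			changeLog = append(changeLog, workload[step])
		}
		if copying {
			lastKey, copying = copyChunk(original, ghost, lastKey, chunkSize)
		}
		if step%2 == 1 && applied < len(changeLog) {
			applyEvent(ghost, changeLog[applied]) // asynchronous, lagging replay
			applied++
		}
	}
	for ; applied < len(changeLog); applied++ {
		applyEvent(ghost, changeLog[applied])
	}
	return ghost
}

func main() {
	original := Table{1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
	workload := []Event{{Upsert, 2, "B"}, {Delete, 4, ""}, {Upsert, 6, "f"}, {Upsert, 1, "A"}}
	ghost := migrate(original, workload, 2)
	fmt.Println("original:", original)
	fmt.Println("ghost:   ", ghost)
}
```

Because every change is eventually replayed in order, the ghost table in this model converges to the original's content once the backlog is drained, which corresponds to the point at which the tables can be swapped.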

[Figure: gh-ost general flow]

Highlights

  • Build your trust in gh-ost by testing it on replicas. gh-ost will issue the same flow as it would have on the master to migrate a table on a replica, without actually replacing the original table. This leaves the replica with two tables that you can then compare to satisfy yourself that the tool operates correctly. This is how we continuously test gh-ost in production.
  • True pause: when gh-ost throttles, it truly ceases writes on the master: no row copies and no ongoing events processing. By throttling, you return your master to its original workload.
  • Dynamic control: you can interactively reconfigure gh-ost, even as migration still runs. You may forcibly initiate throttling.
  • Auditing: you may query gh-ost for status. gh-ost listens on a unix socket or TCP.
  • Control over cut-over phase: gh-ost can be instructed to postpone what is probably the most critical step: the swap of tables, until such time that you're comfortably available. No need to worry about ETA being outside office hours.
  • External hooks can couple gh-ost with your particular environment.
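
As a minimal sketch of the dynamic control and auditing items above, the following Go program sends a single one-line command over a unix socket and reads the reply. Several things here are assumptions: the sendCommand, fakeServer, and roundTrip names are hypothetical, the stand-in server is not gh-ost and only echoes what it receives, and the sketch assumes the peer answers one command per connection and then closes it. To our knowledge, status and throttle are among gh-ost's documented interactive commands; check the interactive commands documentation for the real list and for the socket path your invocation uses.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
)

// sendCommand writes one command line to a unix socket and returns
// everything the peer sends back before closing the connection.
func sendCommand(socketPath, command string) (string, error) {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintln(conn, command); err != nil {
		return "", err
	}
	reply, err := io.ReadAll(conn)
	return string(reply), err
}

// fakeServer is a stand-in listener (NOT gh-ost): it accepts one
// connection, reads one line, echoes it back, and closes.
func fakeServer(l net.Listener) {
	conn, err := l.Accept()
	if err != nil {
		return
	}
	defer conn.Close()
	line, _ := bufio.NewReader(conn).ReadString('\n')
	fmt.Fprintf(conn, "received: %s", line)
}

// roundTrip starts the stand-in server on a temporary socket and sends
// it one command, so the sketch runs without a live migration.
func roundTrip(command string) (string, error) {
	dir, err := os.MkdirTemp("", "ghost-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "demo.sock")
	l, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	defer l.Close()
	go fakeServer(l)
	return sendCommand(sock, command)
}

func main() {
	reply, err := roundTrip("status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Print(reply)
}
```

Against a running migration, only sendCommand would be needed, pointed at the socket file that gh-ost reports at startup.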

Please refer to the docs for more information. No, really, read the docs.

Usage

The cheatsheet has it all. You may be interested in invoking gh-ost in various modes:

  • a noop migration (merely testing that the migration is valid and good to go)
  • a real migration, utilizing a replica (the migration runs on the master; gh-ost figures out the identities of the servers involved. This mode is required if your master uses Statement Based Replication)
  • a real migration, run directly on the master (but gh-ost prefers the former)
  • a real migration on a replica (master untouched)
  • a test migration on a replica, the way for you to build trust with gh-ost's operation.
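
The modes above differ mainly in a handful of command-line flags. The sketch below assembles an argument list for each mode. The Mode type and buildArgs function are hypothetical helpers, and the host, schema, table, and ALTER values are placeholders. The flag names (--host, --database, --table, --alter, --allow-on-master, --migrate-on-replica, --test-on-replica, --execute) are the ones we believe the cheatsheet documents; verify them against your gh-ost version before relying on this.

```go
package main

import "fmt"

// Mode selects one of the invocation modes listed above (hypothetical type).
type Mode int

const (
	ViaReplica       Mode = iota // connect to a replica; the migration runs on the master
	OnMaster                     // connect to and migrate directly on the master
	MigrateOnReplica             // migrate a replica only; master untouched
	TestOnReplica                // test run on a replica; original table is kept
)

// buildArgs returns the gh-ost argument list for a mode. Without
// execute, the invocation is a noop that only validates the migration.
func buildArgs(host, database, table, alter string, mode Mode, execute bool) []string {
	args := []string{
		"--host=" + host,
		"--database=" + database,
		"--table=" + table,
		"--alter=" + alter,
	}
	switch mode {
	case OnMaster:
		args = append(args, "--allow-on-master")
	case MigrateOnReplica:
		args = append(args, "--migrate-on-replica")
	case TestOnReplica:
		args = append(args, "--test-on-replica")
	}
	if execute {
		args = append(args, "--execute")
	}
	return args
}

func main() {
	modes := []Mode{ViaReplica, OnMaster, MigrateOnReplica, TestOnReplica}
	for _, mode := range modes {
		// Placeholder values; for OnMaster the host would be the master itself.
		args := buildArgs("db.example.com", "my_schema", "my_table", "ENGINE=InnoDB", mode, false)
		fmt.Printf("gh-ost %q\n", args)
	}
}
```

Passing the returned slice to exec.Command("gh-ost", args...) from os/exec avoids shell quoting issues with ALTER statements that contain spaces.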

Our tips:

  • Testing above all: try out --test-on-replica the first few times. Better yet, make it continuous. We have multiple replicas where we iterate over our entire fleet of production tables, migrating them one by one, checksumming the results, and verifying that the migration is good.
  • For each master migration, first issue a noop.
  • Then issue the real thing via --execute.

More tips:

  • Use --exact-rowcount for accurate progress indication
  • Use --postpone-cut-over-flag-file to gain control over cut-over timing
  • Get familiar with the interactive commands
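
To illustrate the idea behind --postpone-cut-over-flag-file, here is a toy sketch of flag-file gating: the final step is held for as long as a file exists and proceeds once an operator deletes it. This is not gh-ost's code. The postponed, waitForCutOver, and demo names are hypothetical, and the polling interval is arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// postponed reports whether the flag file still exists.
func postponed(flagFile string) (bool, error) {
	_, err := os.Stat(flagFile)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return false, err
}

// waitForCutOver polls until the flag file disappears and returns how
// many polls found the cut-over still postponed.
func waitForCutOver(flagFile string, poll time.Duration) (int, error) {
	polls := 0
	for {
		held, err := postponed(flagFile)
		if err != nil {
			return polls, err
		}
		if !held {
			return polls, nil
		}
		polls++
		time.Sleep(poll)
	}
}

// demo creates a flag file, removes it shortly afterwards (playing the
// operator), and waits for the gate to open.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "cutover-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	flag := filepath.Join(dir, "postpone.flag")
	if err := os.WriteFile(flag, nil, 0o644); err != nil {
		return 0, err
	}
	go func() {
		time.Sleep(50 * time.Millisecond)
		os.Remove(flag)
	}()
	return waitForCutOver(flag, 10*time.Millisecond)
}

func main() {
	polls, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Printf("cut-over released after %d postponed polls\n", polls)
}
```

A file on disk is a deliberately simple control: removing it needs no client or credentials, so the swap can be released at a moment of your choosing.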

What's in a name?

Originally this was named gh-osc: GitHub Online Schema Change, in the likes of Facebook online schema change and pt-online-schema-change.

But then a rare genetic mutation happened, and the c transformed into t. And that sent us down the path of trying to figure out a new acronym. gh-ost (pronounced: Ghost) stands for GitHub's Online Schema Transmogrifier/Translator/Transformer/Transfigurator.

License

gh-ost is licensed under the MIT license.

gh-ost uses 3rd party libraries, each with their own license. These are found here.

Community

gh-ost is released at a stable state, but with mileage to go. We are open to pull requests. Please first discuss your intentions via Issues.

We develop gh-ost at GitHub and for the community. We may have different priorities than others. From time to time we may suggest a contribution that is not on our immediate roadmap but which may appeal to others.

Please see Coding gh-ost for a guide to getting started developing with gh-ost.

Download/binaries/source

gh-ost is now GA and stable.

gh-ost is available in binary format for Linux and Mac OS/X.

Download latest release here

gh-ost is a Go project; it is built with Go 1.15 and above. To build on your own, use either:

  • script/build - this is the same build script used by CI and is therefore authoritative; the artifact is the ./bin/gh-ost binary.
  • build.sh for building tar.gz artifacts in /tmp/gh-ost

Generally speaking, master branch is stable, but only releases are to be used in production.

Authors

gh-ost is designed, authored, reviewed and tested by the database infrastructure team at GitHub.
