• Stars
    star
    2,209
  • Rank 20,887 (Top 0.5 %)
  • Language
    Shell
  • License
    Apache License 2.0
  • Created over 8 years ago
  • Updated 3 months ago

Repository Details

A sidecar app which clones a git repo and keeps it in sync with the upstream.

NOTE: THIS IS THE DEVELOPMENT BRANCH

This document is the "master" branch, which is under active development. If you are looking for docs on released versions of git-sync, you probably want to use the v3.x branch.

git-sync

git-sync is a simple command that pulls a git repository into a local directory, waits for a while, then repeats. As the remote repository changes, those changes will be synced locally. It is a perfect "sidecar" container in Kubernetes - it can pull files down from a repository so that an application can consume them.

git-sync can pull one time, or on a regular interval. It can pull from the HEAD of a branch, from a git tag, or from a specific git hash. It will only re-pull if the referenced target has changed in the upstream repository (e.g. a new commit on a branch). It "publishes" each sync through a worktree and a named symlink. This ensures an atomic update - consumers will not see a partially constructed view of the local repository.
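
The symlink-based publishing described above can be illustrated with a minimal sketch. This is not git-sync's actual implementation: the `publish` function, the directory names, and the fake hashes are hypothetical, and the sketch assumes GNU coreutils (`mv -T`), which renames a temporary symlink over the old one in a single step.

```sh
#!/bin/sh
# Sketch of publishing a new checkout by atomically flipping a symlink.
# Hypothetical helper; not git-sync's real code. Assumes GNU `mv -T`.
set -eu

# publish ROOT TARGET LINK: make ROOT/LINK point at TARGET (relative to ROOT).
publish() {
    root_dir=$1
    target=$2
    link=$3
    # Create the new symlink under a temporary name first...
    ln -s "$target" "$root_dir/.tmp-$link"
    # ...then rename it over the old link. The rename replaces the
    # destination in one step, so readers see either the old target or
    # the new one, never a missing link.
    mv -T "$root_dir/.tmp-$link" "$root_dir/$link"
}

root=$(mktemp -d)
mkdir -p "$root/worktrees/aaa111" "$root/worktrees/bbb222"
echo "old" > "$root/worktrees/aaa111/file"
echo "new" > "$root/worktrees/bbb222/file"

publish "$root" worktrees/aaa111 current
cat "$root/current/file"    # content of the first simulated sync
publish "$root" worktrees/bbb222 current
cat "$root/current/file"    # content after the flip
```

A consumer that always reads through `current` never has to know which worktree directory is live.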

git-sync can pull over HTTP(S) (with authentication or not) or SSH.

git-sync can also be configured to make a webhook call or exec a command upon successful git repo synchronization. The call is made after the symlink is updated.

Major update: v3.x -> v4.x

git-sync has undergone many significant changes between v3.x and v4.x. See here for more details.

Building it

We use docker buildx to build images.

# build the container
make container REGISTRY=registry VERSION=tag
# build the container behind a proxy
make container REGISTRY=registry VERSION=tag \
    HTTP_PROXY=http://<proxy_address>:<proxy_port> \
    HTTPS_PROXY=https://<proxy_address>:<proxy_port>
# build the container for an OS/arch other than the current (e.g. you are on
# MacOS and want to run on Linux)
make container REGISTRY=registry VERSION=tag \
    GOOS=linux GOARCH=amd64

Usage

# make a directory (owned by you) for the volume
export DIR="/tmp/git-data"
mkdir -p $DIR

# run the container (as your own UID)
docker run -d \
    -v $DIR:/tmp/git \
    -u$(id -u):$(id -g) \
    registry/git-sync:tag \
        --repo=https://github.com/kubernetes/git-sync \
        --root=/tmp/git/root \
        --period=30s

# run an nginx container to serve the content
docker run -d \
    -p 8080:80 \
    -v $DIR:/usr/share/nginx/html \
    nginx

Flags

git-sync has many flags and optional features (see the manual below). Most of those flags can be configured through environment variables, but in most cases (with the obvious exception of passwords) flags are preferred: the program can abort if an invalid flag is specified, whereas a misspelled environment variable will just be ignored. We've tried to stay backwards-compatible across major versions (by accepting deprecated flags and environment variables), but some things have evolved, and users are encouraged to use the most recent flags for their major version.
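
The environment variable names listed in the manual follow a visible pattern: drop the leading dashes, upper-case the flag name, turn dashes into underscores, and prefix `GITSYNC_`. A minimal sketch of that mapping follows. `flag_to_env` is a hypothetical helper for illustration, not part of git-sync, and a few flags in the manual (`--help`, `--man`, `--version`, `--verbose`) list no environment variable at all.

```sh
#!/bin/sh
# Hypothetical helper: derive the environment variable name that the manual
# lists for a given long flag (e.g. --period -> GITSYNC_PERIOD).
flag_to_env() {
    name=${1#--}                                  # strip the leading "--"
    printf 'GITSYNC_%s\n' "$name" | tr 'a-z-' 'A-Z_'
}

flag_to_env --period
flag_to_env --exechook-backoff
```

Because a misspelled variable name is silently ignored, deriving the name mechanically like this can be less error-prone than typing it by hand.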

Volumes

The --root flag must indicate a directory that either a) does not exist (it will be created); or b) exists and is empty; or c) can be emptied by removing all of the contents.

Why? Git really wants an empty directory, to avoid any confusion. If the directory exists and is not empty, git-sync will try to empty it by removing everything in it (we can't just rm -rf the dir because it might be a mounted volume). If that fails, git-sync will abort.

With the above example or with a Kubernetes emptyDir, there is usually no problem. The problematic case is when the volume is the root of a filesystem, which sometimes contains metadata (e.g. ext{2,3,4} have a lost+found dir). The only real solution is to use a sub-directory of the volume as the --root.
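
The acceptable states of --root can be checked ahead of time with a small script. This is only a sketch of the rule as stated above: `root_state` is a hypothetical helper, and git-sync performs its own checks at startup.

```sh
#!/bin/sh
# Hypothetical pre-flight check for a directory intended as --root.
# Prints one of: missing, empty, non-empty, not-a-directory.
root_state() {
    dir=$1
    if [ ! -e "$dir" ]; then
        echo missing            # case a: git-sync will create it
    elif [ ! -d "$dir" ]; then
        echo not-a-directory
    elif [ -z "$(ls -A "$dir")" ]; then
        echo empty              # case b
    else
        echo non-empty          # case c applies only if the contents can be removed
    fi
}

vol=$(mktemp -d)
root_state "$vol/root"          # sub-directory that does not exist yet
mkdir "$vol/lost+found"         # simulate filesystem metadata at the volume root
root_state "$vol"
```

The simulated `lost+found` entry shows why the volume root itself reports as non-empty while a sub-directory of it is a safe choice.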

More docs

More documentation on specific topics can be found here.

Manual

GIT-SYNC

NAME
    git-sync - sync a remote git repository

SYNOPSIS
    git-sync --repo=<repo> --root=<path> [OPTIONS]...

DESCRIPTION

    Fetch a remote git repository to a local directory, poll the remote for
    changes, and update the local copy.

    This is a perfect "sidecar" container in Kubernetes.  For example, it can
    periodically pull files down from a repository so that an application can
    consume them.

    git-sync can pull one time, or on a regular interval.  It can read from the
    HEAD of a branch, from a git tag, or from a specific git hash.  It will only
    re-pull if the target has changed in the remote repository.  When it
    re-pulls, it updates the destination directory atomically.  In order to do
    this, it uses a git worktree in a subdirectory of the --root and flips a
    symlink.

    git-sync can pull over HTTP(S) (with authentication or not) or SSH.

    git-sync can also be configured to make a webhook call upon successful git
    repo synchronization.  The call is made after the symlink is updated.

OPTIONS

    Many options can be specified as either a commandline flag or an environment
    variable, but flags are preferred because a misspelled flag is a fatal
    error while a misspelled environment variable is silently ignored.

    --add-user, $GITSYNC_ADD_USER
            Add a record to /etc/passwd for the current UID/GID.  This is
            needed to use SSH with an arbitrary UID (see --ssh).  This assumes
            that /etc/passwd is writable by the current UID.

    --askpass-url <string>, $GITSYNC_ASKPASS_URL
            A URL to query for git credentials.  The query must return success
            (200) and produce a series of key=value lines, including
            "username=<value>" and "password=<value>".
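
As an illustration of the response format, the sketch below extracts the two required keys from a body made of key=value lines. The sample body and the `askpass_field` helper are hypothetical. They only mirror the format described above and are not git-sync code.

```sh
#!/bin/sh
# Hypothetical helper: print the value of KEY from key=value lines on stdin.
askpass_field() {
    key=$1
    # Strip only the leading "KEY=", so values may themselves contain "=".
    sed -n "s/^${key}=//p" | head -n 1
}

# Made-up response body in the documented key=value format.
body='username=alice
password=s3cr=et'

printf '%s\n' "$body" | askpass_field username
printf '%s\n' "$body" | askpass_field password
```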

    --cookie-file <string>, $GITSYNC_COOKIE_FILE
            Use a git cookiefile (/etc/git-secret/cookie_file) for
            authentication.

    --depth <int>, $GITSYNC_DEPTH
            Create a shallow clone with history truncated to the specified
            number of commits.  If not specified, this defaults to syncing a
            single commit.  Setting this to 0 will sync the full history of the
            repo.

    --error-file <string>, $GITSYNC_ERROR_FILE
            The path to an optional file into which errors will be written.
            This may be an absolute path or a relative path, in which case it
            is relative to --root.

    --exechook-backoff <duration>, $GITSYNC_EXECHOOK_BACKOFF
            The time to wait before retrying a failed --exechook-command.  If
            not specified, this defaults to 3 seconds ("3s").

    --exechook-command <string>, $GITSYNC_EXECHOOK_COMMAND
            An optional command to be executed after syncing a new hash of the
            remote repository.  This command does not take any arguments and
            executes with the synced repo as its working directory.  The
            environment variable $GITSYNC_HASH will be set to the git hash that
            was synced.  The execution is subject to the overall --sync-timeout
            flag and will extend the effective period between sync attempts.
            This flag obsoletes --sync-hook-command, but if --sync-hook-command
            is specified, it will take precedence.

    --exechook-timeout <duration>, $GITSYNC_EXECHOOK_TIMEOUT
            The timeout for the --exechook-command.  If not specified, this
            defaults to 30 seconds ("30s").

    --git <string>, $GITSYNC_GIT
            The git command to run (subject to PATH search, mostly for
            testing).  This defaults to "git".

    --git-config <string>, $GITSYNC_GIT_CONFIG
            Additional git config options in a comma-separated 'key:val'
            format.  The parsed keys and values are passed to 'git config' and
            must be valid syntax for that command.

            Both keys and values can be either quoted or unquoted strings.
            Within quoted keys and all values (quoted or not), the following
            escape sequences are supported:
                '\n' => [newline]
                '\t' => [tab]
                '\"' => '"'
                '\,' => ','
                '\\' => '\'
            To include a colon within a key (e.g. a URL) the key must be
            quoted.  Within unquoted values commas must be escaped.  Within
            quoted values commas may be escaped, but are not required to be.
            Any other escape sequence is an error.
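
The escaping rules above can be applied mechanically when building a --git-config value. The sketch below is an assumption-laden illustration: `gitconfig_escape` is a hypothetical helper, `my.setting` is a made-up key, and only the backslash, comma, and double-quote escapes from the list above are handled.

```sh
#!/bin/sh
# Hypothetical helper: escape a value for use in a --git-config entry,
# using only escape sequences listed above (backslash, comma, double quote).
gitconfig_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/,/\\,/g' -e 's/"/\\"/g'
}

# Build a two-entry, comma-separated key:val string.
# "my.setting" is a made-up key; its value contains a comma.
value=$(gitconfig_escape 'a,b')
config="my.setting:${value},gc.auto:0"
printf '%s\n' "$config"
```

The resulting string would then be passed as a single argument, for example `--git-config="$config"`.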

    --git-gc <string>, $GITSYNC_GIT_GC
            The git garbage collection behavior: one of "auto", "always",
            "aggressive", or "off".  If not specified, this defaults to
            "auto".

            - auto: Run "git gc --auto" once per successful sync.  This mode
              respects git's gc.* config params.
            - always: Run "git gc" once per successful sync.
            - aggressive: Run "git gc --aggressive" once per successful sync.
              This mode can be slow and may require a longer --sync-timeout value.
            - off: Disable explicit git garbage collection, which may be a good
              fit when also using --one-time.

    --group-write, $GITSYNC_GROUP_WRITE
            Ensure that data written to disk (including the git repo metadata,
            checked out files, worktrees, and symlink) are all group writable.
            This corresponds to git's notion of a "shared repository".  This is
            useful in cases where data produced by git-sync is used by a
            different UID.  This replaces the older --change-permissions flag.

    -h, --help
            Print help text and exit.

    --http-bind <string>, $GITSYNC_HTTP_BIND
            The bind address (including port) for git-sync's HTTP endpoint.  If
            not specified, the HTTP endpoint is not enabled.

            Examples:
              ":1234": listen on any IP, port 1234
              "127.0.0.1:1234": listen on localhost, port 1234

    --http-metrics, $GITSYNC_HTTP_METRICS
            Enable metrics on git-sync's HTTP endpoint.  Requires --http-bind
            to be specified.

    --http-pprof, $GITSYNC_HTTP_PPROF
            Enable the pprof debug endpoints on git-sync's HTTP endpoint.
            Requires --http-bind to be specified.

    --link <string>, $GITSYNC_LINK
            The path at which to create a symlink which points to the
            current git directory, at the currently synced hash.  This may be
            an absolute path or a relative path, in which case it is relative
            to --root.  Consumers of the synced files should always use this
            link - it is updated atomically and should always be valid.  The
            basename of the target of the link is the current hash.  If not
            specified, this defaults to the leaf dir of --repo.
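
Because the basename of the link target is the current hash, a consumer can read the synced hash from the link itself. The sketch below simulates that. `current_hash` is a hypothetical helper, and the directory layout and hash are made up for the demonstration rather than taken from git-sync.

```sh
#!/bin/sh
# Hypothetical consumer-side helper: report which hash a git-sync style
# link currently points at, based on the basename of the link target.
current_hash() {
    basename "$(readlink "$1")"
}

# Simulated layout; the directory names and hash are made up.
root=$(mktemp -d)
mkdir -p "$root/worktrees/0123abcd"
ln -s worktrees/0123abcd "$root/git-sync"

current_hash "$root/git-sync"
```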

    --man
            Print this manual and exit.

    --max-failures <int>, $GITSYNC_MAX_FAILURES
            The number of consecutive failures allowed before aborting (the
            first sync must succeed).  Setting this to a negative value will
            retry forever after the initial sync.  If not specified, this
            defaults to 0, meaning any sync failure will terminate git-sync.

    --one-time, $GITSYNC_ONE_TIME
            Exit after one sync.

    --password <string>, $GITSYNC_PASSWORD
            The password or personal access token (see github docs) to use for
            git authentication (see --username).  NOTE: for security reasons,
            users should prefer --password-file or $GITSYNC_PASSWORD_FILE for
            specifying the password.

    --password-file <string>, $GITSYNC_PASSWORD_FILE
            The file from which the password or personal access token (see
            github docs) to use for git authentication (see --username) will be
            read.

    --period <duration>, $GITSYNC_PERIOD
            How long to wait between sync attempts.  This must be at least
            10ms.  This flag obsoletes --wait, but if --wait is specified, it
            will take precedence.  If not specified, this defaults to 10
            seconds ("10s").

    --ref <string>, $GITSYNC_REF
            The git revision (branch, tag, or hash) to check out.  If not
            specified, this defaults to "HEAD" (of the upstream repo's default
            branch).

    --repo <string>, $GITSYNC_REPO
            The git repository to sync.  This flag is required.

    --root <string>, $GITSYNC_ROOT
            The root directory for git-sync operations, under which --link will
            be created.  This must be a path that either a) does not exist (it
            will be created); b) is an empty directory; or c) is a directory
            which can be emptied by removing all of the contents.  This flag is
            required.

    --sparse-checkout-file <string>, $GITSYNC_SPARSE_CHECKOUT_FILE
            The path to a git sparse-checkout file (see git documentation for
            details) which controls which files and directories will be checked
            out.  If not specified, the default is to check out the entire repo.

    --ssh, $GITSYNC_SSH
            Use SSH for git authentication and operations.

    --ssh-key-file <string>, $GITSYNC_SSH_KEY_FILE
            The SSH key to use when using --ssh.  If not specified, this
            defaults to "/etc/git-secret/ssh".

    --ssh-known-hosts, $GITSYNC_SSH_KNOWN_HOSTS
            Enable SSH known_hosts verification when using --ssh.  If not
            specified, this defaults to true.

    --ssh-known-hosts-file <string>, $GITSYNC_SSH_KNOWN_HOSTS_FILE
            The known_hosts file to use when --ssh-known-hosts is specified.
            If not specified, this defaults to "/etc/git-secret/known_hosts".

    --submodules <string>, $GITSYNC_SUBMODULES
            The git submodule behavior: one of "recursive", "shallow", or
            "off".  If not specified, this defaults to "recursive".

    --sync-on-signal <string>, $GITSYNC_SYNC_ON_SIGNAL
            Indicates that a sync attempt should occur upon receipt of the
            specified signal name (e.g. SIGHUP) or number (e.g. 1). If a sync
            is already in progress, another sync will be triggered as soon as
            the current one completes. If not specified, signals will not
            trigger syncs.

    --sync-timeout <duration>, $GITSYNC_SYNC_TIMEOUT
            The total time allowed for one complete sync.  This must be at least
            10ms.  This flag obsoletes --timeout, but if --timeout is specified,
            it will take precedence.  If not specified, this defaults to 120
            seconds ("120s").

    --touch-file <string>, $GITSYNC_TOUCH_FILE
            The path to an optional file which will be touched whenever a sync
            completes.  This may be an absolute path or a relative path, in
            which case it is relative to --root.

    --username <string>, $GITSYNC_USERNAME
            The username to use for git authentication (see --password-file or
            --password).

    -v, --verbose <int>
            Set the log verbosity level.  Logs at this level and lower will be
            printed.

    --version
            Print the version and exit.

    --webhook-backoff <duration>, $GITSYNC_WEBHOOK_BACKOFF
            The time to wait before retrying a failed --webhook-url.  If not
            specified, this defaults to 3 seconds ("3s").

    --webhook-method <string>, $GITSYNC_WEBHOOK_METHOD
            The HTTP method for the --webhook-url.  If not specified, this
            defaults to "POST".

    --webhook-success-status <int>, $GITSYNC_WEBHOOK_SUCCESS_STATUS
            The HTTP status code indicating a successful --webhook-url.  Setting
            this to 0 disables success checks, which makes webhooks
            "fire-and-forget".  If not specified, this defaults to 200.

    --webhook-timeout <duration>, $GITSYNC_WEBHOOK_TIMEOUT
            The timeout for the --webhook-url.  If not specified, this defaults
            to 1 second ("1s").

    --webhook-url <string>, $GITSYNC_WEBHOOK_URL
            A URL for optional webhook notifications when syncs complete.  The
            header 'Gitsync-Hash' will be set to the git hash that was synced.

EXAMPLE USAGE

    git-sync \
        --repo=https://github.com/kubernetes/git-sync \
        --ref=HEAD \
        --period=10s \
        --root=/mnt/git

AUTHENTICATION

    Git-sync offers several authentication options to choose from.  If none of
    the following are specified, git-sync will try to access the repo in the
    "natural" manner.  For example, "https://repo" will try to use plain HTTPS
    and "git@example.com:repo" will try to use SSH.

    username/password
            The --username (GITSYNC_USERNAME) and --password-file
            (GITSYNC_PASSWORD_FILE) or --password (GITSYNC_PASSWORD) flags
            will be used.  To prevent password leaks, the --password-file flag
            or GITSYNC_PASSWORD environment variable is almost always
            preferred to the --password flag.

            A variant of this is --askpass-url (GITSYNC_ASKPASS_URL), which
            consults a URL (e.g. http://metadata) to get credentials on each
            sync.

    SSH
            When --ssh (GITSYNC_SSH) is specified, the --ssh-key-file
            (GITSYNC_SSH_KEY_FILE) will be used.  Users are strongly advised
            to also use --ssh-known-hosts (GITSYNC_SSH_KNOWN_HOSTS) and
            --ssh-known-hosts-file (GITSYNC_SSH_KNOWN_HOSTS_FILE) when using
            SSH.

    cookies
            When --cookie-file (GITSYNC_COOKIE_FILE) is specified, the
            associated cookies can contain authentication information.

HOOKS

    Webhooks and exechooks are executed asynchronously from the main git-sync
    process.  If a --webhook-url or --exechook-command is configured, whenever
    a new hash is synced the hook(s) will be invoked.  For exechook, that means
    the command is exec()'ed, and for webhooks that means an HTTP request is
    sent using the method defined in --webhook-method.  Git-sync will retry
    both forms of hooks until they succeed (exit code 0 for exechooks, or
    --webhook-success-status for webhooks).  If unsuccessful, git-sync will
    wait --exechook-backoff or --webhook-backoff (as appropriate) before
    re-trying the hook.

    Hooks are not guaranteed to succeed on every single hash change.  For example,
    if a hook fails and a new hash is synced during the backoff period, the
    retried hook will fire for the newest hash.
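
A minimal sketch of what an exechook might look like follows. It relies only on the documented behavior ($GITSYNC_HASH is set, the working directory is the synced repo, and a non-zero exit triggers a retry). The `on_sync` function, the `HOOK_LOG` variable, and the log path are hypothetical.

```sh
#!/bin/sh
# Hypothetical exechook body: record the hash that was just synced.
# Only $GITSYNC_HASH and the working directory come from git-sync;
# HOOK_LOG and the log format are made up for this sketch.
on_sync() {
    if [ -z "${GITSYNC_HASH:-}" ]; then
        echo "GITSYNC_HASH is not set" >&2
        return 1        # non-zero status: git-sync would retry after the backoff
    fi
    printf 'synced %s in %s\n' "$GITSYNC_HASH" "$(pwd)" >> "${HOOK_LOG:-/tmp/hook.log}"
}

# Simulate one invocation outside of git-sync.
HOOK_LOG=$(mktemp)
GITSYNC_HASH=0123abcd
on_sync
cat "$HOOK_LOG"
```

In real use the body of `on_sync` would be saved as an executable script and named in --exechook-command. Since a retried hook fires for the newest hash, the script should be safe to run more than once and should not assume it sees every intermediate hash.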
