  • Stars: 561
  • Rank: 79,120 (Top 2 %)
  • Language: C
  • License: GNU General Publi...
  • Created about 9 years ago
  • Updated 4 months ago

Repository Details

kernel module for taking block-level snapshots and incremental backups of Linux block devices

Datto Block Driver

For build and install instructions, see INSTALL.md. For information about the on-disk .datto file this module creates, see STRUCTURE.md. For instructions regarding the dbdctl tool, see dbdctl.8.md. For details on the license of the software, see LICENSING.md.

The Linux Snapshot Problem

Linux has some basic tools for creating instant copy-on-write (COW) “snapshots” of filesystems. The most prominent of these are LVM and device mapper (on which LVM is built). Unfortunately, both have limitations that render them unsuitable for supporting live server snapshotting across disparate Linux environments. Both require an unused volume to be available on the machine to track COW data. Servers, and particularly production servers, may not be preconfigured with the required spare volume. In addition, these snapshotting systems only allow a read-only volume to be made read-write. Taking a live backup requires unmounting your data volume, setting up a snapshot of it, mounting the snapshot, and then using a tool like dd or rsync to copy the original volume to a safe location. Many production servers simply cannot be brought down for the time this takes, and afterwards all of the new data in the COW volume must eventually be merged back into the original volume (which requires even more downtime). This is impractical and extremely hacky, to say the least.

Datto Block Driver (Linux Kernel Module / Driver)

The Datto Block Driver (dattobd) solves the above problems and brings functionality similar to VSS on Windows to a broad range of Linux kernels. Dattobd is an open source Linux kernel module for point-in-time live snapshotting. Dattobd can be loaded onto a running Linux machine (without a reboot) and creates a COW file on the original volume representing any block device at the instant the snapshot is taken. After the first snapshot, the driver tracks incremental changes to the block device and therefore can be used to efficiently update existing backups by copying only the blocks that have changed. Dattobd is a true live-snapshotting system that will leave your root volume running and available, without requiring a reboot.

Dattobd is designed to run on any Linux device, from small test virtual machines to live production servers, with minimal impact on I/O or CPU performance. Since the driver works at the block layer, it supports most common filesystems, including ext2, ext3, ext4, and XFS (although filesystems with their own block device management systems, such as ZFS and BTRFS, cannot be supported). All COW data is tracked in a file on the source block device itself, eliminating the need for a spare volume in order to snapshot.
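The copy-on-write and change-tracking behaviour described above can be illustrated with a small in-memory toy model. This is only a sketch of the general idea and does not mirror dattobd's internals or its on-disk format: the class name `ToyCowDevice`, its methods, and the 4-byte block size are all hypothetical and chosen for readability.

```python
BLOCK_SIZE = 4  # tiny block size so the example is easy to follow (assumption)


class ToyCowDevice:
    """In-memory toy model of a block device with one COW snapshot.

    Illustration only; it does not reflect how dattobd is implemented.
    """

    def __init__(self, data: bytes):
        if len(data) % BLOCK_SIZE:
            raise ValueError("data must be a whole number of blocks")
        self.blocks = [data[i:i + BLOCK_SIZE]
                       for i in range(0, len(data), BLOCK_SIZE)]
        self.cow = {}         # block index -> contents at snapshot time
        self.changed = set()  # indexes of blocks written since the snapshot
        self.mode = "snapshot"

    def write(self, index: int, payload: bytes) -> None:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        # In snapshot mode, save the old contents before the first overwrite.
        if self.mode == "snapshot" and index not in self.cow:
            self.cow[index] = self.blocks[index]
        # In both modes, remember that this block has changed.
        self.changed.add(index)
        self.blocks[index] = payload

    def read_snapshot(self, index: int) -> bytes:
        """Return the block as it was when the snapshot was taken."""
        if self.mode != "snapshot":
            raise RuntimeError("no snapshot data is kept in incremental mode")
        return self.cow.get(index, self.blocks[index])

    def to_incremental(self) -> None:
        """Drop the saved data but keep the list of changed block indexes."""
        self.mode = "incremental"
        self.cow.clear()


if __name__ == "__main__":
    dev = ToyCowDevice(b"AAAABBBBCCCC")
    dev.write(1, b"bbbb")
    # The snapshot view still shows the original contents.
    print(b"".join(dev.read_snapshot(i) for i in range(3)))  # b'AAAABBBBCCCC'
    dev.to_incremental()
    dev.write(2, b"cccc")
    print(sorted(dev.changed))  # [1, 2]
```

In snapshot mode the model keeps the old data for every overwritten block, while in incremental mode it only remembers which blocks were touched, which is why the latter is cheaper.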

Performing Incremental Backups

The primary intended use case of Dattobd is backing up live Linux systems. The general flow is to take a snapshot, copy it, and move the snapshot into 'incremental' mode. Later, we can move the incremental back to snapshot mode and efficiently update the first backup we took. We can then repeat this process to continually update our backed-up image. What follows is an example of using the driver for this purpose on a simple Ubuntu 12.04 installation with a single root volume on /dev/sda1. In this case, we are copying to another (larger) volume mounted at /backups. Other Linux distros should work similarly, with minor tweaks.

  1. Install the driver and related tools. Instructions for doing this are explained in INSTALL.md.

  2. Create a snapshot:

    dbdctl setup-snapshot /dev/sda1 /.datto 0
    

This will create a snapshot of the root volume at /dev/datto0 with a backing COW file at /.datto. This file must exist on the volume that will be snapshotted.

  3. Copy the image off the block device:

    dd if=/dev/datto0 of=/backups/sda1-bkp bs=1M
    

dd is a standard image copying tool in Linux. Here it simply copies the contents of the /dev/datto0 device to an image. Be careful when running this command, as it can badly corrupt filesystems if used incorrectly. NEVER execute dd with the "of" parameter pointing to a volume that has important data on it. Copying the entire volume can take a while. See the man page on dd for more details.

  4. Put the snapshot into incremental mode:

    dbdctl transition-to-incremental 0
    

This command requests the driver to move the snapshot (/dev/datto0) to incremental mode. From this point on, the driver will only track the addresses of blocks that have changed (without the data itself). This mode is less system intensive, but is important for later when we wish to update the /backups/sda1-bkp to reflect a later snapshot of the filesystem.

  5. Continue using your system. After the initial backup, the driver will probably be left in incremental mode the vast majority of the time.

  6. Move the incremental back to snapshot mode:

    dbdctl transition-to-snapshot /.datto1 0
    

This command requires the name of a new COW file to begin tracking changes again (here we chose /.datto1). At this point the driver is finished with our /.datto file we created in step 2. The /.datto file now contains a list of the blocks that have changed since our initial snapshot. We will use this in the next step to update our backed up image. It is important to not use the same file name that we specified in step 2 for this command. Otherwise, we would overwrite our list of changed blocks.

  7. Copy the changes:

    update-img /dev/datto0 /.datto /backups/sda1-bkp

Here we can use the update-img tool included with the driver. It takes 3 parameters: a snapshot (/dev/datto0), the list of changed blocks (/.datto from step 2), and an original backup image (/backups/sda1-bkp created in step 3). It copies the blocks listed in the block list from the new snapshot to the existing image, effectively updating the image.

  8. Clean up the leftover file:

    rm /.datto
    
  9. Go back to step 4 and repeat. Keep in mind that it is important to specify a different COW file path for each use. If you use the same file name, you will overwrite the list of changed blocks. As a result, you will have to use dd to perform a full copy again instead of using the faster update-img tool (which only copies the changed blocks).
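The command sequence in the steps above can be sketched as a pair of small Python helpers that only build the command lines, without running them. The function names are hypothetical, and the sketch assumes that the snapshot for minor number N appears at /dev/dattoN, which the example above shows only for minor 0. The dbdctl, dd, update-img, and rm invocations are copied from the steps as written.

```python
import shlex


def initial_backup_commands(device, cow_file, minor, image):
    """Commands for the first full backup: snapshot, copy, go incremental."""
    snapshot_dev = f"/dev/datto{minor}"  # assumed naming, shown for minor 0
    return [
        ["dbdctl", "setup-snapshot", device, cow_file, str(minor)],
        ["dd", f"if={snapshot_dev}", f"of={image}", "bs=1M"],
        ["dbdctl", "transition-to-incremental", str(minor)],
    ]


def incremental_update_commands(minor, old_cow, new_cow, image):
    """Commands for one update cycle using the previous COW file's block list."""
    if old_cow == new_cow:
        # Reusing the name would overwrite the list of changed blocks.
        raise ValueError("the new COW file must differ from the previous one")
    snapshot_dev = f"/dev/datto{minor}"  # assumed naming, shown for minor 0
    return [
        ["dbdctl", "transition-to-snapshot", new_cow, str(minor)],
        ["update-img", snapshot_dev, old_cow, image],
        ["rm", old_cow],
        ["dbdctl", "transition-to-incremental", str(minor)],
    ]


if __name__ == "__main__":
    image = "/backups/sda1-bkp"
    for cmd in initial_backup_commands("/dev/sda1", "/.datto", 0, image):
        print(shlex.join(cmd))
    for cmd in incremental_update_commands(0, "/.datto", "/.datto1", image):
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps the sketch free of side effects. A real script would need to run them as root and check each exit status before continuing.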

If you wish to keep multiple versions of the image, we recommend that you copy your images to a snapshotting filesystem (such as BTRFS or ZFS). You can then snapshot the images after updating them (step 3 for the full backup or step 7 for the differential). This will allow you to keep a history of revisions to the image.
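To make the idea behind the image update in step 7 concrete, here is a minimal sketch of copying only the changed blocks from a snapshot into an existing image. It is not the update-img implementation: the real tool reads the changed-block list from the COW file (see STRUCTURE.md), whereas this hypothetical `apply_changed_blocks` function takes the block indexes as plain integers and assumes a 4096-byte block size.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size for this sketch


def apply_changed_blocks(snapshot_path, image_path, changed_blocks,
                         block_size=BLOCK_SIZE):
    """Copy only the listed blocks from the snapshot into the existing image.

    Returns the number of bytes copied.
    """
    copied = 0
    with open(snapshot_path, "rb") as snap, open(image_path, "r+b") as image:
        for index in sorted(set(changed_blocks)):
            offset = index * block_size
            snap.seek(offset)
            data = snap.read(block_size)
            image.seek(offset)
            image.write(data)
            copied += len(data)
    return copied


def demo():
    """Update a small image file in a temporary directory and return it."""
    old = b"a" * 8 + b"b" * 8 + b"c" * 8
    new = b"a" * 8 + b"X" * 8 + b"c" * 8
    with tempfile.TemporaryDirectory() as tmp:
        snapshot_path = os.path.join(tmp, "snapshot")
        image_path = os.path.join(tmp, "image")
        with open(snapshot_path, "wb") as f:
            f.write(new)
        with open(image_path, "wb") as f:
            f.write(old)
        # Only block 1 differs, so only 8 bytes are copied.
        apply_changed_blocks(snapshot_path, image_path, [1], block_size=8)
        with open(image_path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo() == b"a" * 8 + b"X" * 8 + b"c" * 8)  # prints True
```

The work done is proportional to the number of changed blocks rather than the size of the volume, which is what makes the update faster than a full dd copy.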

Driver Status

The current status of the dattobd driver can be read from the file /proc/datto-info. This is a JSON-formatted file with 2 fields: a version number "version" and an array of "devices". Each device has the following fields:

  • minor: The minor number of the snapshot (for identification purposes).
  • cow_file: The path to the COW file relative to the mountpoint of the block device. If the device is in an unverified state, the path is presented as it was given to the driver.
  • block_device: The block device being tracked by this device.
  • max_cache: The maximum amount of memory that may be used to cache metadata for this device (in bytes).
  • fallocate: The preallocated size of the COW file (in bytes). This will not be printed if the device is in the unverified state.
  • seq_id: The sequence id of the snapshot. This number starts at 1 for new snapshots and is incremented on each transition to snapshot.
  • uuid: Uniquely identifies a series of snapshots. It is not changed on state transition.
  • error: This field will only be present if the device has failed. It shows the standard Linux error code indicating what went wrong. More specific info is printed to dmesg.
  • state: An integer representing the current working state of the device. There are 6 possible states; for more info on these refer to STRUCTURE.md.
    • 0 = dormant incremental
    • 1 = dormant snapshot
    • 2 = active incremental
    • 3 = active snapshot
    • 4 = unverified incremental
    • 5 = unverified snapshot
  • nr_changed_blocks: The number of blocks that have changed since the last snapshot.
  • version: Version of the on-disk format of the COW header.
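As a rough sketch of how a script might consume this file, the following Python snippet parses a JSON document shaped like the description above and translates the numeric state into its name. The sample document and all of its values (version string, sizes, uuid, block count) are made up for illustration, and the helper names are hypothetical. Only the field names and the state table come from the list above.

```python
import json

# State codes as listed above.
STATE_NAMES = {
    0: "dormant incremental",
    1: "dormant snapshot",
    2: "active incremental",
    3: "active snapshot",
    4: "unverified incremental",
    5: "unverified snapshot",
}

# Hypothetical sample; real values will differ from system to system.
SAMPLE_INFO = """
{
  "version": "0.0.0",
  "devices": [
    {
      "minor": 0,
      "cow_file": "/.datto",
      "block_device": "/dev/sda1",
      "max_cache": 314572800,
      "fallocate": 1073741824,
      "seq_id": 1,
      "uuid": "0123456789abcdef0123456789abcdef",
      "state": 3,
      "nr_changed_blocks": 42,
      "version": 1
    }
  ]
}
"""


def describe_devices(info_text):
    """Return one human-readable line per device in the status document."""
    info = json.loads(info_text)
    lines = []
    for dev in info.get("devices", []):
        state = STATE_NAMES.get(dev.get("state"), "unknown")
        line = f"minor {dev.get('minor')}: {dev.get('block_device')} is {state}"
        # The error field is only present when the device has failed.
        if "error" in dev:
            line += f" (failed, error {dev['error']})"
        lines.append(line)
    return lines


def read_driver_status(path="/proc/datto-info"):
    """Read and describe the live status file (requires the loaded driver)."""
    with open(path) as f:
        return describe_devices(f.read())


if __name__ == "__main__":
    for line in describe_devices(SAMPLE_INFO):
        print(line)  # minor 0: /dev/sda1 is active snapshot
```

Using `dict.get` with defaults keeps the sketch tolerant of fields that are omitted in some states, such as fallocate for unverified devices.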

More Repositories

  1. php-json-rpc: Fully unit-tested JSON-RPC 2.0 for PHP (PHP, 179 stars)
  2. php-json-rpc-http: HTTP client and server for JSON-RPC 2.0 (PHP, 59 stars)
  3. RDPMux: RDP server multiplexer designed to work with virtual machines (C++, 23 stars)
  4. fireeye-red-team-countermeasure-scanner: A scanner to detect the use of stolen FireEye red team tools (YARA, 20 stars)
  5. log4shell-tool: Log4Shell Enumeration, Mitigation and Attack Detection Tool (15 stars)
  6. threat-management: This is where the Datto Threat Management team shares threat profiles, signatures, and information on threats that target the MSP community. (10 stars)
  7. throughputd: network traffic monitoring tool (C, 9 stars)
  8. zfs2ceph: Convert ZFS send streams to Ceph import streams (C, 9 stars)
  9. rhel-reposync-playbook: Ansible playbook to configure RHEL mirrors (Smarty, 8 stars)
  10. mysql-failover: Config files for all components used in our lossless MySQL replication and automated failover setup (Smarty, 8 stars)
  11. php-json-rpc-ssh: SSH client and server for JSON-RPC 2.0 (PHP, 7 stars)
  12. tls-config: Our best shot at doing it all with SSL (6 stars)
  13. git-river: Tools for working with upstream repositories (Python, 5 stars)
  14. libvirt: Mirror of upstream libvirt repo with additional Datto branches (C, 5 stars)
  15. copyondemand: user space block driver that provides a bootable disk image over NFS using NBD (Go, 5 stars)
  16. php-json-rpc-auth: Authentication & authorization extension for the JSON-RPC library (PHP, 4 stars)
  17. phpunit-entropy: A PHPUnit plugin to aid randomized unit testing (PHP, 4 stars)
  18. silver-sparrow-detection-and-prevention-tool: Datto Silver Sparrow Detection and Prevention Tool (Shell, 4 stars)
  19. dns-cert-checker: A tool to check and evaluate all TLS certificates served by hosts under a user-provided list of DNS zones. (Python, 4 stars)
  20. sisyphus: Kafka -> Influx 2.x forwarder in Go (Go, 4 stars)
  21. kmod-dblock: kernel module to implement the backing data source for a block device in userspace (C, 4 stars)
  22. php-json-rpc-simple: Request-to-class mapping extension for the JSON-RPC library (PHP, 4 stars)
  23. zfs-tests: An attempt at making a broadly applicable zfs test suite (Python, 3 stars)
  24. eol-tracker: Web service for monitoring distro package and dependency versions (C++, 3 stars)
  25. php-json-rpc-validator: Validation extension for the JSON-RPC library (PHP, 3 stars)
  26. php-parser: Define and parse a context-free grammar (PHP, 3 stars)
  27. librdpmux: Library to provide low-level QEMU guest interaction capabilities outside the hypervisor. DEAD and part of rdpmux now. (C, 3 stars)
  28. es-disk-rebalance (Python, 3 stars)
  29. parameter-auto-env (PHP, 2 stars)
  30. cass_schema: A gem for managing multiple Cassandra schemas across multiple clusters. (Ruby, 2 stars)
  31. product: Product (HTML, 1 star)
  32. zabbix-aggregate-reports: Simple tool that will aggregate data returned from Zabbix (Python, 1 star)
  33. cassava: An un-opinionated Cassandra client built on top of the Datastax Cassandra Driver. (Ruby, 1 star)
  34. datto.github.io: Datto's GitHub Pages site (1 star)
  35. hdfs-rs: Rust bindings to libhdfs (Rust, 1 star)
  36. php-container-stack: UNMAINTAINED: Reference of Dockerfile for a base container stack for PHP apps (Dockerfile, 1 star)
  37. pyper: Flexible pipelines for data storage and retrieval. (Ruby, 1 star)
  38. php-json-rpc-log: Logged server extension for the JSON-RPC library (PHP, 1 star)
  39. rebalance-core: Module implementing a simple resource rebalancing algorithm (Python, 1 star)