New version!

I released a new, simplified version of this tool which:

I'll maintain that version instead of this one and encourage you to switch. See

https://bitbucket.org/mmichele/zfssnap

The old page is maintained for the benefit of posterity.

— michele


zfsbackup – ZFS backup handling

zfsbackup is a tool for handling local and remote backups of ZFS filesystems.

zfsbackup exploits ZFS snapshots. Its main purpose is to automate the management of snapshots.

zfsbackup is written in Python (it requires Python ≥ 2.4) and released under the BSD license. Download zfsbackup-0.5.tar.bz2.

Using zfsbackup

zfsbackup builds on the concept of snapshot contexts. A context is a group of snapshots:

zfsbackup allows you to do things such as:

Use case: replication of FreeBSD jails

Formulation of the problem:

Standard full backup

Here's how to use zfsbackup to take a backup of such a jail:

    # zfsbackup -c mxbackup -d /usr/jails/mx.company.com -s -k -o ~/mybackups/
    Taking snapshot 'zroot@zbk-mxbackup-29_05_2011__14_24_26'
    Restricting to: ['/usr/jails/mx.company.com']
    Excluding: None
    Destroying snapshot 'zroot@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/nobackup@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/home@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/jails@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/jails/basejail@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/jails/www.company.com@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/jails/newjail@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/jails/test@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/jails/lan.company.com@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/ports@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/usr/src@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/var@zbk-mxbackup-29_05_2011__14_24_26'
    Destroying snapshot 'zroot/var/log@zbk-mxbackup-29_05_2011__14_24_26'
    Dumping datasets ['/usr/jails/mx.company.com']
    Back up 'zbk-mxbackup-29_05_2011__14_24_26'
    Exec 'zfs send -R zroot/usr/jails/mx.company.com@zbk-mxbackup-29_05_2011__14_24_26 | gzip -2 > ./backup-master_company_com-mxbackup-0-29_05_2011__14_24_36-_usr_jails_mx.company.com.zfsdump.gz'
    Run time: 114.7 sec
    Done: full dump of snapshot 'zbk-mxbackup-29_05_2011__14_24_26' into file ./backup-master_company_com-mxbackup-0-29_05_2011__14_24_36-_usr_jails_mx.company.com.zfsdump.gz
    Done.

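As the log shows, this takes a snapshot of the whole pool in the mxbackup context (-c), destroys the snapshots of every dataset other than /usr/jails/mx.company.com (-d), and dumps the remaining snapshot with zfs send -R (-s) through gzip (-k) into a file in the output directory (-o).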

And there you get the backup file:

    # ls backup-*
    backup-master_company_com-mxbackup-0-29_05_2011__14_24_36-_usr_jails_mx.company.com.zfsdump.gz

    # du -h backup-*
    775M	backup-master_company_com-mxbackup-0-29_05_2011__14_24_36-_usr_jails_mx.company.com.zfsdump.gz
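
The file name appears to encode the host, the context, a dump number, the dump time and the dataset mountpoint. Here is a minimal Python sketch for splitting it back into fields, for instance to sort a directory of dumps. The layout is only inferred from the single file name above, and the field names (`host`, `seq`, and so on) are made up for the example. It will not match contexts containing a hyphen or a dump number that is not a plain integer.

```python
import re
from datetime import datetime

# Assumed layout (inferred from one example, not from the zfsbackup source):
#   backup-<host>-<context>-<seq>-<DD_MM_YYYY__HH_MM_SS>-<dataset>.zfsdump[.gz]
DUMP_NAME = re.compile(
    r"^backup-(?P<host>[^-]+)-(?P<context>[^-]+)-(?P<seq>\d+)-"
    r"(?P<stamp>\d{2}_\d{2}_\d{4}__\d{2}_\d{2}_\d{2})-"
    r"(?P<dataset>.+?)\.zfsdump(?P<gz>\.gz)?$"
)


def parse_dump_name(filename):
    """Split a dump file name into its fields, or return None if it does not match."""
    match = DUMP_NAME.match(filename)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "context": match.group("context"),
        "seq": int(match.group("seq")),
        "taken": datetime.strptime(match.group("stamp"), "%d_%m_%Y__%H_%M_%S"),
        "dataset": match.group("dataset"),
        "compressed": match.group("gz") is not None,
    }


info = parse_dump_name(
    "backup-master_company_com-mxbackup-0-29_05_2011__14_24_36-"
    "_usr_jails_mx.company.com.zfsdump.gz"
)
print(info["context"], info["seq"], info["taken"].isoformat())
# prints: mxbackup 0 2011-05-29T14:24:36
```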

You can store this file as a backup. To replicate it on any host, you run:

    # ezjail-admin stop mx.company.com      # stop the jail (if you run jails with ezjail)

    # cat backup-file.zfsdump.gz | gunzip | zfs receive -v -d zroot

zfs recreates the dataset as it was at the origin: same dataset name, same data, same snapshots.

    # ezjail-admin start mx.company.com       # restart the jail with the new filesystem

If you need to restore the dump to a dataset with a different name, zfs does not let you do so directly. Follow the same process, then rename the dataset and update its mountpoint.
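
The rename step can be sketched in Python as follows. This is only an illustration: it builds the two standard commands `zfs rename` and `zfs set mountpoint=...` and prints them without running anything unless `dry_run` is turned off. The target name `mx-copy.company.com` and the helper names are made up for the example. If the mountpoint is inherited from the parent dataset, the second command may not be needed.

```python
import subprocess


def rename_commands(restored, target, mountpoint):
    """Return the zfs commands that move a restored dataset to a new name."""
    return [
        ["zfs", "rename", restored, target],
        ["zfs", "set", "mountpoint=" + mountpoint, target],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.check_call(argv)


# Hypothetical target name, chosen only for this example.
commands = rename_commands(
    "zroot/usr/jails/mx.company.com",
    "zroot/usr/jails/mx-copy.company.com",
    "/usr/jails/mx-copy.company.com",
)
run_all(commands)  # dry run: prints the two commands, executes nothing
```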

Incremental backups

Subsequent backups can be taken incrementally. This dramatically decreases both dump time and backup size.

Incremental backups are harder to dump and restore, but zfsbackup handles most of this complexity for you.

Repeat the same command; zfsbackup will detect the previous snapshot(s) of the context and dump incrementally from them.

    # zfsbackup -c mxbackup -d /usr/jails/mx.company.com -s -k -o ~/mybackups/
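
To give an idea of what "dump incrementally" means at the zfs level, here is a rough Python sketch that picks the two most recent snapshots of a context and builds a `zfs send -R -i` command from them. It is not taken from the zfsbackup source; it only assumes the `zbk-<context>-<DD_MM_YYYY__HH_MM_SS>` naming visible in the log above. The function names and the second and third snapshot names are made up for the example.

```python
import re
from datetime import datetime

STAMP = "%d_%m_%Y__%H_%M_%S"  # e.g. 29_05_2011__14_24_26


def context_snapshots(names, dataset, context):
    """Return this dataset's snapshots in the given context, oldest first."""
    pattern = re.compile(
        r"^%s@zbk-%s-(\d{2}_\d{2}_\d{4}__\d{2}_\d{2}_\d{2})$"
        % (re.escape(dataset), re.escape(context))
    )
    dated = []
    for name in names:
        match = pattern.match(name)
        if match:
            # Sort on the parsed date: the text form is day-first and
            # would sort 05_06_2011 before 29_05_2011.
            dated.append((datetime.strptime(match.group(1), STAMP), name))
    return [name for _, name in sorted(dated)]


def send_command(names, dataset, context):
    """Build a full send for a lone snapshot, an incremental one otherwise."""
    snaps = context_snapshots(names, dataset, context)
    if not snaps:
        raise ValueError("no snapshot of %s in context %s" % (dataset, context))
    if len(snaps) == 1:
        return ["zfs", "send", "-R", snaps[-1]]
    return ["zfs", "send", "-R", "-i", snaps[-2], snaps[-1]]


# Only the first name comes from the log above; the others are invented.
names = [
    "zroot/usr/jails/mx.company.com@zbk-mxbackup-29_05_2011__14_24_26",
    "zroot/usr/jails/mx.company.com@zbk-mxbackup-05_06_2011__09_00_00",
    "zroot/usr/jails/mx.company.com@zbk-other-01_06_2011__00_00_00",
]
cmd = send_command(names, "zroot/usr/jails/mx.company.com", "mxbackup")
print(" ".join(cmd))
```

On the receiving side, an incremental stream can only be applied to a dataset that already holds the snapshot it starts from, which is why the dumps have to be restored in order.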

Documentation

Have a look at the --help output; usage is simple enough:

    # /opt/zfsbackup/zfsbackup -h
    Usage: zfsbackup [options]

    Options:
      -h, --help            show this help message and exit
      -c CONTEXT, --context=CONTEXT
                            your label/group/type for this snapshot
      -l, --list            list snapshots in the given context
      -n, --no-snapshot     do not take snapshot, operate (and prune) with
                            context's most recent one
      -b NUM, --backlog=NUM
                            avoid self-management: keep this many most-recent
                            snapshots of this tag (0 = infinite)
      -a MINUTES, --maxage=MINUTES
                            remove snapshots of this tag older than this many
                            minutes
      -x DS_MNTPOINT, --exclude=DS_MNTPOINT
                            exclude dataset from snapshot (omit zfspool name)
                            [repeatable]
      -d DS_MNTPOINT, --dataset=DS_MNTPOINT
                            snapshot & send this daset (recursively), not all
                            (overrides -i) [repeatable]
      -s, --send            dump/send snapshot after taking it (full or
                            incremental as appropriate)
      -0, --fulldump        perform a full dump regardless of availability of
                            former snaps
      -i DS_MNTPOINT, --dump-individually=DS_MNTPOINT
                            no root-recursion; dump this dataset individually
                            [repeatable]
      -k, --compress        compress (gzip -4) dumped files
      -t, --alternate       alternate dumps (0, 0-1, 0-2, 1-3, 2-4, 3-5, ..)
      -o DIR, --output=DIR  dump backups into such directory rather than here
      --prune-exceeding=MINUTES
                            destroy snapshots older than X minutes
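
Two of these options are easier to grasp with a small example. The Python sketch below shows one way to read -b/-a (keep the N most recent snapshots, drop those older than M minutes) and the -t sequence (dump n starts from snapshot n-2, except for the first ones). This is my reading of the help text only, not code from zfsbackup. In particular, how the tool combines -b and -a is an assumption (here a snapshot is pruned if either rule says so), and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def select_for_pruning(snapshots, backlog=0, maxage=None, now=None):
    """Pick the snapshots to destroy, newest first.

    snapshots: list of (name, taken) pairs, where taken is a datetime.
    backlog:   keep this many most recent snapshots (0 = keep all), like -b.
    maxage:    drop snapshots older than this many minutes, like -a.
    """
    now = now or datetime.now()
    ordered = sorted(snapshots, key=lambda pair: pair[1], reverse=True)
    doomed = set()
    if backlog > 0:
        doomed.update(name for name, _ in ordered[backlog:])
    if maxage is not None:
        limit = now - timedelta(minutes=maxage)
        doomed.update(name for name, taken in ordered if taken < limit)
    return [name for name, _ in ordered if name in doomed]


def alternate_base(n):
    """Base snapshot of dump n in the sequence 0, 0-1, 0-2, 1-3, 2-4, ...

    None means a full dump.
    """
    return None if n == 0 else max(0, n - 2)


now = datetime(2011, 5, 29, 15, 0, 0)
snaps = [
    ("a", now - timedelta(minutes=10)),
    ("b", now - timedelta(minutes=70)),
    ("c", now - timedelta(minutes=200)),
]
print(select_for_pruning(snaps, backlog=2, now=now))   # ['c']
print(select_for_pruning(snaps, maxage=60, now=now))   # ['b', 'c']
print([alternate_base(n) for n in range(6)])           # [None, 0, 0, 1, 2, 3]
```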

Warnings

zfsbackup is released under the new BSD license.

I make no guarantee whatsoever about the reliability of the software. The code contains some assert statements to protect the most sensitive operations (e.g. destroying snapshots, as zfs had the brilliant idea of using the same command to destroy snapshots and actual datasets); however, the CLI is not very forgiving, so expect a Python exception or two if you combine conflicting options (and not only then).
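
To illustrate the kind of protection meant here, this is a hypothetical guard in Python, not the actual code from zfsbackup. It refuses to build a `zfs destroy` command for anything that is not clearly a snapshot carrying the `zbk-` prefix seen in the logs above. The function name and the `prefix` parameter are made up for the example.

```python
def destroy_snapshot_command(name, prefix="zbk-"):
    """Build 'zfs destroy' for a snapshot, refusing anything that could be a dataset."""
    dataset, sep, snapshot = name.partition("@")
    # Given a name without '@', 'zfs destroy' would target the dataset itself.
    assert sep == "@" and dataset, "not a snapshot name: %r" % name
    assert snapshot.startswith(prefix), "not a zfsbackup snapshot: %r" % name
    return ["zfs", "destroy", name]


print(destroy_snapshot_command("zroot/usr@zbk-mxbackup-29_05_2011__14_24_26"))

try:
    destroy_snapshot_command("zroot/usr")  # a dataset, not a snapshot
except AssertionError as error:
    print("refused:", error)
```

Keep in mind that assert statements are skipped when Python runs with -O, so a guard like this only helps in normal mode.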

There is an underlying assumption that, once a context is initialized (first snapshot taken), successive calls on that context will have the same parameters. For example, they are run on the same datasets, and with the same backlog specifications.

I use it myself on servers in production.

If you have a bug report, request, or patch, write to the address given in the source code.