Duplicity is a piece of software that can perform encrypted backups to remote storage over the network. It uses the rsync algorithm to implement incremental backups, thus minimising the amount of data that needs to be transferred over the network and stored remotely. The GNU Privacy Guard is used to provide strong encryption, making it safe to keep your backups in one of the many public cloud storage solutions.

In this post, I will demonstrate basic usage of Duplicity by showing how I use it to back up the home directory on my workstation to Situla, Redpill Linpro’s S3-compatible object storage solution.

Prerequisites

First and foremost, you’ll need to install Duplicity on the host you want to back up. Duplicity is fortunately packaged in most Linux distributions, so this should just be a matter of running apt-get install duplicity or dnf install duplicity.

You’ll also need an account on Situla (or another S3-compatible object storage service), in particular the access key and the secret key. That said, Duplicity supports a large number of alternative storage backends (a full list is shown on its homepage), so adapting the examples below to your favourite storage service is probably as easy as pie.

Finally, you’ll need to generate an encryption passphrase. This should be something long and random, e.g., the output from pwgen 128 1. You must keep a copy of the encryption passphrase in a safe location well away from the system you’re backing up; otherwise, your backups will be worthless the day you actually need them for disaster recovery.
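As a rough sketch of the passphrase step, the function below prints a 128-character passphrase using pwgen when it is installed. The function name generate_passphrase and the /dev/urandom fallback are assumptions of mine for hosts without pwgen; Duplicity itself does not care how the passphrase was produced.

```shell
# Sketch: print one 128-character random passphrase on stdout.
# generate_passphrase is a hypothetical helper name.
generate_passphrase() {
    if command -v pwgen >/dev/null 2>&1; then
        # Same command as suggested in the text above.
        pwgen 128 1
    else
        # Assumed fallback: read 4096 random bytes, keep only letters and
        # digits, and cut the result down to 128 characters.
        head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c 128
        echo
    fi
}
```

Running umask 077 followed by generate_passphrase > passphrase.txt would write it to a file only you can read. A copy still has to live somewhere other than the machine being backed up.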

Running nightly backups

The easiest way is to create a short shell script that invokes Duplicity from a nightly cron job or systemd timer. In my case, the script looks something like this:

#! /bin/sh -e

export AWS_ACCESS_KEY_ID="The Situla/S3 access key goes here"
export AWS_SECRET_ACCESS_KEY="The Situla/S3 secret key goes here"
export PASSPHRASE="The encryption passphrase goes here"

duplicity --full-if-older-than 1M \
          --s3-unencrypted-connection \
          /home/tore \
          s3://situla.bitbit.net/tore-duplicity/workstation

duplicity remove-older-than 6M \
          --s3-unencrypted-connection \
          --force \
          s3://situla.bitbit.net/tore-duplicity/workstation

The first invocation of duplicity in the script will perform a backup of my home directory /home/tore and store it in the workstation subdirectory of the tore-duplicity storage bucket in Situla. It will by default perform an incremental backup (i.e., only backing up files that are new or changed since the last backup run), but it will automatically switch to a full backup if the last full backup is older than one month.

The second invocation will remove backups that are over six months old. This prevents my storage usage on Situla from growing without bounds.
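Before letting the clean-up run unattended, it can be reassuring to see what it would delete. As far as I know, remove-older-than only lists the matching backup sets unless --force is given, so a preview is simply the same command without that flag. The function name preview_cleanup and the BACKUP_URL variable are placeholders of mine.

```shell
# Placeholder: the backup location used throughout this post.
BACKUP_URL="${BACKUP_URL:-s3://situla.bitbit.net/tore-duplicity/workstation}"

# Sketch: list the backup sets older than six months without deleting them.
# Leaving out --force is what turns the deletion into a listing.
preview_cleanup() {
    duplicity remove-older-than 6M \
              --s3-unencrypted-connection \
              "$BACKUP_URL"
}
```

Once the listing looks sensible, adding --force back (as in the script above) performs the actual deletion.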

Note that since the backup files have already been encrypted by the GNU Privacy Guard, there is not much point in spending CPU cycles on encrypting them a second time as they are transferred to Situla. Therefore, I’m requesting an unencrypted connection in order to gain a small performance increase.

When the backup is complete, you’ll get an informative status summary with some statistics, as shown below. Redirecting this to a log with logger or to e-mail with sendmail is probably a useful thing to do.

--------------[ Backup Statistics ]--------------
StartTime 1480500123.35 (Wed Nov 30 11:02:03 2016)
EndTime 1480500362.49 (Wed Nov 30 11:06:02 2016)
ElapsedTime 239.13 (3 minutes 59.13 seconds)
SourceFiles 340965
SourceFileSize 37567051681 (35.0 GB)
NewFiles 26
NewFileSize 1717225 (1.64 MB)
DeletedFiles 7
ChangedFiles 24
ChangedFileSize 150673527 (144 MB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 57
RawDeltaSize 64155497 (61.2 MB)
TotalDestinationSizeChange 29162549 (27.8 MB)
Errors 0
-------------------------------------------------
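
One way to act on the logger suggestion is a small wrapper that forwards everything the backup prints to syslog. This is only a sketch: the function name run_logged, the duplicity-backup tag, and the LOG_CMD override are all made up for illustration.

```shell
# Sketch: run a command and forward its stdout and stderr to syslog.
# LOG_CMD is a hypothetical override, e.g. LOG_CMD=cat prints to the
# terminal instead of logging.
run_logged() {
    "$@" 2>&1 | ${LOG_CMD:-logger -t duplicity-backup}
}
```

Calling run_logged with the path to the backup script as its argument would log the summary above line by line. Note that the exit status you get back is that of the logging command, not of the backup, so a failed backup has to be spotted in the log itself.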

If you’d like to take a full backup of the entire system instead of just a single directory, that is easily accomplished by replacing /home/tore with / in the first duplicity invocation. However, it is likely that weird-behaving files in the special file systems /dev, /proc and /sys will cause problems. To avoid that, you can add --exclude /dev --exclude /proc --exclude /sys to the duplicity command line.
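
Put together, a whole-system variant of the backup command might look like the sketch below. The function name backup_whole_system and the workstation-root destination are hypothetical; I would use a destination separate from the home directory backup so the two backup chains do not get mixed up.

```shell
# Hypothetical destination for the whole-system backup.
ROOT_BACKUP_URL="${ROOT_BACKUP_URL:-s3://situla.bitbit.net/tore-duplicity/workstation-root}"

# Sketch: back up everything under /, skipping the special file systems
# that tend to cause problems.
backup_whole_system() {
    duplicity --full-if-older-than 1M \
              --s3-unencrypted-connection \
              --exclude /dev --exclude /proc --exclude /sys \
              / \
              "$ROOT_BACKUP_URL"
}
```

This will most likely need to run as root in order to read every file on the system.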

Checking the status and contents of the backup

Duplicity offers some handy commands that can be used to verify that everything is all right with the backup. Note that you’ll need to export the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and PASSPHRASE environment variables first, exactly like in the backup script itself.
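
To avoid retyping the exports in every interactive session, one option is to keep them in a small file and source it. The sketch below assumes a hypothetical file ~/.duplicity-env containing the same three export lines as the backup script; the function name load_backup_env is also my own invention.

```shell
# Sketch: source the credentials file and check that all three variables
# Duplicity needs ended up non-empty.
# Usage: load_backup_env [file]   (defaults to the hypothetical ~/.duplicity-env)
load_backup_env() {
    env_file="${1:-$HOME/.duplicity-env}"
    [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }
    . "$env_file"
    for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY PASSPHRASE; do
        # Indirect lookup of the variable named in $var.
        eval "value=\${$var:-}"
        [ -n "$value" ] || { echo "$var is not set" >&2; return 1; }
    done
}
```

Since the file holds both the storage credentials and the encryption passphrase, it should be readable only by you (chmod 600).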

The collection-status command shows the overall status of the backups that have been made, i.e., when the last full backup was taken and how many incremental backups have been made since. Example below:

$ duplicity --s3-unencrypted-connection collection-status \
            s3://situla.bitbit.net/tore-duplicity/workstation
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Tue Nov 29 16:15:57 2016
Collection Status
-----------------
Connecting with backend: BackendWrapper
Archive dir: /home/tore/.cache/duplicity/9a6c69571a8f0d8c72abcc7b6d4c7d7c

Found 0 secondary backup chains.

Found primary backup chain with matching signature chain:
-------------------------
Chain start time: Tue Nov 29 16:15:57 2016
Chain end time: Wed Nov 30 11:02:01 2016
Number of contained backup sets: 5
Total number of contained volumes: 154
 Type of backup set:                            Time:      Num volumes:
                Full         Tue Nov 29 16:15:57 2016               148
         Incremental         Wed Nov 30 01:00:01 2016                 3
         Incremental         Wed Nov 30 10:14:33 2016                 1
         Incremental         Wed Nov 30 10:53:10 2016                 1
         Incremental         Wed Nov 30 11:02:01 2016                 1
-------------------------
No orphaned or incomplete backup sets found.
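
Because the collection status is plain text, it is easy to pull simple numbers out of it for monitoring. As a sketch, the helper below counts the incremental sets in the output format shown above. The name count_incrementals is made up, and the pattern assumes the output format stays the same across Duplicity versions.

```shell
# Sketch: read collection-status output on stdin and print the number of
# incremental backup sets, i.e. lines such as
#   "         Incremental         Wed Nov 30 01:00:01 2016                 3"
count_incrementals() {
    grep -c '^ *Incremental '
}
```

Piping the collection-status command above into count_incrementals would print 4 for the example output shown here.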

The list-current-files command will by default show a list of all the files that exist in the last backup made, along with their timestamps:

$ duplicity --s3-unencrypted-connection list-current-files \
            s3://situla.bitbit.net/tore-duplicity/workstation \
            | grep bash_history
Wed Nov 30 11:02:04 2016 .bash_history

The list-current-files command accepts an optional --time parameter which can be used to specify an older backup than the most recent one. This parameter can be specified as an absolute timestamp (e.g., 2016-11-29) or an offset (e.g., 1D) that specifies how old the requested backup should be. At the time of writing, it is the 30th of November, so the below two commands are equivalent:

$ duplicity --s3-unencrypted-connection list-current-files \
            --time 2016-11-29 \
            s3://situla.bitbit.net/tore-duplicity/workstation \
            | grep bash_history
Tue Nov 29 16:15:59 2016 .bash_history
$ duplicity --s3-unencrypted-connection list-current-files \
            --time 1D s3://situla.bitbit.net/tore-duplicity/workstation \
            | grep bash_history
Tue Nov 29 16:15:59 2016 .bash_history

Restoring from backup

Being able to restore files is clearly the single most important thing about keeping backups in the first place. Duplicity makes this very easy: it is just a matter of giving the remote backup storage location as the first command line argument and a local file system path (where the restored files will go) as the second. Some examples:

  • Restore all files contained in the most recent backup to /home/tore-restored:
    $ duplicity --s3-unencrypted-connection \
              s3://situla.bitbit.net/tore-duplicity/workstation \
              /home/tore-restored
    
  • As above, but restore a one day old backup instead of the most recent one:
    $ duplicity --s3-unencrypted-connection --time 1D \
              s3://situla.bitbit.net/tore-duplicity/workstation \
              /home/tore-restored
    
  • Only restore the single file .bash_history from the most recent backup:
    $ duplicity --s3-unencrypted-connection --file-to-restore .bash_history \
              s3://situla.bitbit.net/tore-duplicity/workstation \
              .bash_history-restored
    

The --file-to-restore parameter also accepts directories, and can of course be combined with the --time parameter in order to restore from older backups.
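
As a sketch of that combination, the function below restores a single file or directory as it looked a given time ago, using the same flags as the examples above. The function name restore_path and the BACKUP_URL variable are placeholders of mine.

```shell
# Placeholder: the backup location used throughout this post.
BACKUP_URL="${BACKUP_URL:-s3://situla.bitbit.net/tore-duplicity/workstation}"

# Sketch: restore one file or directory from an older backup.
#   $1 = path relative to the backed-up directory
#   $2 = age of the backup to restore from (e.g. 3D)
#   $3 = local destination for the restored copy
restore_path() {
    duplicity --s3-unencrypted-connection \
              --time "$2" \
              --file-to-restore "$1" \
              "$BACKUP_URL" \
              "$3"
}
```

For instance, restore_path .bash_history 1D .bash_history-restored would fetch yesterday's version of the file from the earlier example.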

Duplicity does of course have many more features than discussed in this post, and all of those are documented in its manual. That said, this post should contain everything you need to get started. Good luck, and remember: keep your encryption passphrase in a safe place away from the system you’re backing up – one day, you’ll need it!