Backing up 4: The Raspberry Pi

More than a year ago I described how I used Syncthing to back up folders from my NAS to an external hard drive attached to my parents' PC. This was supposed to be my offline backup. Unfortunately, it didn't prove to be a very reliable solution. The PC ran Windows, I had trouble getting SSH to work reliably, and I often had to fix stuff through TeamViewer. The PC would often stay turned off for days, so I couldn't even run the backups without asking my parents to turn it on. And Syncthing turned out to be finicky and sometimes didn't sync.

Then it finally dawned on me: I have two Raspberry Pi 3s at home that are just collecting dust. How about I put one of them to good use?

So I took one of the Pis, set it up at my parents' place, and after some fiddling, it works. Here's what I did:

I used the latest Raspbian image. The Pi sits at my parents' home, which has a dynamic IP address. The address usually changes only when the router is restarted, but that can still cause issues. At first I thought I would set up a reverse SSH tunnel from the Raspberry Pi to my NAS, but I couldn't get autossh to work with systemd.

Then I tried another option: I set up a Dynamic DNS entry on a subdomain, with ddclient on the Raspberry Pi updating the IP address regularly. I had to open a port on my parents' router for this. I set up public key authentication for SSH and restricted password-based authentication to clients on the local network, in /etc/ssh/sshd_config:

PasswordAuthentication no
ChallengeResponseAuthentication no

Match Address 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
    PasswordAuthentication yes
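
To make it concrete which clients the Match block still lets in with a password, here is a small shell sketch that checks whether an IPv4 address falls into one of the three private ranges listed there. The function name is_private_ipv4 is made up for this illustration; sshd does this matching on its own and needs nothing like it.

```shell
# Return 0 if the IPv4 address is in one of the three private ranges
# from the Match Address line, 1 otherwise. Illustration only.
is_private_ipv4() {
    ip=$1
    o1=${ip%%.*}        # first octet
    rest=${ip#*.}
    o2=${rest%%.*}      # second octet
    case $o1 in
        10)  return 0 ;;                                         # 10.0.0.0/8
        172) [ "$o2" -ge 16 ] && [ "$o2" -le 31 ] && return 0 ;; # 172.16.0.0/12
        192) [ "$o2" -eq 168 ] && return 0 ;;                    # 192.168.0.0/16
    esac
    return 1
}
```

So a laptop at 192.168.1.20 on my parents' LAN can still log in with a password, while anything arriving through the forwarded port from a public address needs a key.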

This setup has worked for two weeks, so that's pretty good.

Now that I had a stable connection to the Pi, it was time to set up the actual backups. I looked around and there are several options. I ended up choosing BorgBackup. It has built-in encryption for the archive, so I don't need to muck around with full disk encryption. It also does deduplication, compression and deltas: after an initial full backup it only backs up changes, which makes it quite efficient.

BorgBackup is quite simple to use. First you have to initialize a repository, which will contain your backups:

> borg init ssh://user@pi_hostname:port/path/to/backups -e authenticated

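This will prompt you for a passphrase. Note that the authenticated mode only protects the repository against tampering; if you want the data encrypted as well, pick repokey or keyfile instead. Borg also generates a key, which you should export and keep safe on other machines: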

> borg key export ssh://user@pi_hostname:port/path/to/backups

Then, to start the actual backup process:

> borg create --stats --progress --exclude "pattern_to_exclude*" ssh://user@pi_hostname:port/path/to/backups::archive_name ./folder1 ./folder2 ./folder3

The archive_name identifies one snapshot, that is, one run in which you backed up everything. If you rerun the command the next day with archive_name2, Borg compares all the chunks against what is already in the repository and transmits only the ones that are new or have changed. You can then restore either archive, with BorgBackup doing the right thing in the background to show you only the items that were backed up in that archive.
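
The idea behind chunk-level deduplication can be sketched in a few lines of shell. This is a simplification and not how Borg is implemented: it cuts files into fixed 4 KiB pieces, whereas Borg uses content-defined chunking, and the chunk_stats name is made up for the example. It assumes GNU coreutils (sha256sum, split).

```shell
# Split every file given as an argument into 4 KiB chunks, hash each chunk,
# and report how many chunks exist in total and how many are distinct.
chunk_stats() {
    tmp=$(mktemp -d) || return 1
    n=0
    for f in "$@"; do
        n=$((n + 1))
        # Chunks of file number N land in $tmp as N.aaaa, N.aaab, ...
        split -a 4 -b 4096 "$f" "$tmp/$n."
    done
    total=$(ls "$tmp" | wc -l | tr -d ' ')
    # Identical chunks produce identical hashes, so sort -u collapses them.
    unique=$(for c in "$tmp"/*; do sha256sum < "$c"; done | sort -u | wc -l | tr -d ' ')
    rm -rf "$tmp"
    echo "total=$total unique=$unique"
}
```

If you feed it yesterday's and today's version of a file, where today's version only has one extra chunk appended, the unique count grows by just one. Only chunks like that would have to travel over the network.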

The cool thing about Borg is that if a backup stops while in progress, it can easily resume at any time.

I added the command to a cron job (actually, the Synology Task Scheduler) so it runs every day, and now I have daily, efficient backups.

#!/bin/sh
# Archive name schema: one archive per day, named after the ISO date
DATE=$(date --iso-8601)
echo "Starting backups for $DATE"
# Borg runs this command without a shell, so use $HOME instead of ~
export BORG_PASSCOMMAND="cat $HOME/.borg-passphrase"
/usr/local/bin/borg create --stats --exclude "pattern_to_exclude*" "ssh://user@pi_hostname:port/path/to/backups::$DATE" ./folder1 ./folder2 ./folder3

The .borg-passphrase file contains my passphrase and has its permissions set to 400 (readable only by my user). Borg runs the command stored in the BORG_PASSCOMMAND environment variable to obtain the passphrase, so no user input is necessary.
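
One possible way to create such a file without it ever being readable by others is sketched below. The write_passphrase_file helper is hypothetical; it reads the passphrase from standard input so that it doesn't end up in the shell history.

```shell
# Write whatever arrives on stdin to the given path, readable by the owner only.
# Usage: write_passphrase_file <path>
write_passphrase_file() {
    (
        umask 077        # the file is created without group/other permissions
        cat > "$1"
    ) || return 1
    chmod 400 "$1"       # owner read-only, matching the 400 mentioned above
}
```

Run it as write_passphrase_file "$HOME/.borg-passphrase", type the passphrase, press Enter and then Ctrl-D. Afterwards ls -l should show -r-------- for the file.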

Now I get the following report by email every morning:

Duration: 4 minutes 22.54 seconds
Number of files: 281990

                       Original size      Compressed size    Deduplicated size
This archive:              656.97 GB            646.90 GB             12.51 MB

Not bad. Borg sweeps through 657 GB of data in less than 4.5 minutes, determines that there are only about 13 MB of new data and sends only that over the network.
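
To put the numbers from the report in perspective, here is a quick back-of-the-envelope calculation. The report_ratios function is just a throwaway helper for this post, and it assumes the sizes are decimal units (1 GB = 1000 MB).

```shell
# Usage: report_ratios <original size in GB> <deduplicated size in MB> <duration in seconds>
report_ratios() {
    awk -v orig_gb="$1" -v dedup_mb="$2" -v secs="$3" 'BEGIN {
        # Share of the archive that actually had to be stored and sent
        printf "new data: %.4f%% of the original size\n", dedup_mb / (orig_gb * 1000) * 100
        # Apparent speed at which the data set was covered
        printf "scan rate: %.2f GB/s\n", orig_gb / secs
    }'
}

# Values from the report above: 4 minutes 22.54 seconds = 262.54 seconds
report_ratios 656.97 12.51 262.54
```

This prints 0.0019% for the new data and 2.50 GB/s for the scan rate. That rate is far more than my NAS could read from disk, which suggests that Borg doesn't reread every file but skips the ones whose metadata hasn't changed since the last run.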

I feel much more confident about this solution than about the previous one! Here's to not changing it too often!