Backup Server

I'm a bit obsessed with backing up my computer files. Over the years, I have had a couple of hard drives fail. Fortunately, I have only lost a small amount of irreplaceable data, but I have come close to losing vast amounts of work forever.

Recognizing how lucky I have been so far, I set up my own backup server. My backup mechanism has evolved incrementally to become a more complete and reliable solution. In the process I have learned a bit about system administration.

Setting up this backup system led me to realize that, with only a couple of users, the backup box is idle most of the time. This led me to consider ways to make a super low power version of a subversion server (pdf).

General Requirements

At the start I specified several requirements which my backup solution should meet. As my solution has evolved I have been able to meet more of these requirements. My current solution meets all but one.

  1. Allow recovery from hardware failures (both of the client computers and of the server itself).
  2. Allow recovery to some number of previously backed up states to recover from user errors.
  3. An easy method of verifying the integrity of the backup.
  4. The backup process is automated, to prevent user forgetfulness from allowing it to get out of date.
  5. The backups are encrypted to prevent a compromise of all files in the event that the computers are stolen.

Current Server Version

Currently the server is an old 300 MHz Dell running FreeBSD. Files are backed up via two mechanisms.

  1. Most of my files are maintained under version control by subversion.
  2. Large files which do not change often (music and photos) are not under version control, but rather are just copied to the server. The server maintains mirrored copies of my webpages and the client computers' home directories in this manner.

If you are not familiar with version control in general, or subversion in particular, chapters 1 and 2 of the Subversion Book are excellent reading.

The mirrors of webpages and client computers are kept updated with rsync (run via shell scripts and cron jobs).
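To illustrate the kind of mirroring described above, here is a minimal sketch of an rsync wrapper. It is written in Python rather than shell purely for illustration, and it is not the actual script used on this server. The function names, paths and host name are hypothetical. The rsync options shown (`--archive`, `--compress`, `--delete`, `--dry-run`) are standard ones, but whether the real scripts use them is an assumption.

```python
import subprocess


def build_rsync_command(source, destination, delete=True, dry_run=False):
    """Return the rsync argument list for mirroring source to destination."""
    # --archive preserves permissions, times and symlinks; --compress helps on slow links.
    command = ["rsync", "--archive", "--compress"]
    if delete:
        # Remove files from the mirror that no longer exist in the source.
        command.append("--delete")
    if dry_run:
        # Only report what would be transferred, without changing anything.
        command.append("--dry-run")
    command += [source, destination]
    return command


def mirror(source, destination, **options):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_rsync_command(source, destination, **options), check=True)


# Hypothetical usage (the trailing slash on the source copies its contents):
# mirror("/home/user/", "backupserver:/backup/laptop/home/user/", dry_run=True)
```

Note that `--delete` makes the mirror an exact copy, so a file deleted by mistake on the client also disappears from the mirror on the next run. That is one reason the mirror alone does not satisfy GR 2.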

Current Version Requirements Specifics

GR 1: Hardware Failures

Hardware failures on my laptop and other computers are handled by having the backup on a dedicated server. If the client computer fails then I just recover the needed files from the backup server.

The backup data on the server is stored on a pair of drives in a RAID 1 (mirroring) configuration, so a failure of a single data drive does not cause a loss of the backup. Whenever discussing backup and RAID, it is worth remembering that... RAID ALONE IS NOT A BACKUP! RAID does not address data loss due to user errors or data corruption.

GR 2: Recovering to Some Number of Previously Backed Up States

By being able to recover to some number of previously backed up states (not just the most recent), data can be recovered if a user changes or deletes a file by mistake. This is the point of placing files under version control. Version control software tracks the changes made to files and allows, among many other things, earlier versions of a file to be retrieved.

Again, chapters 1 and 2 of the Subversion Book are an excellent reference on the topic.

I selected subversion as my version control software. It is free, well documented, robust, and will work across almost all operating systems. Additionally, TortoiseSVN is an excellent subversion client application for Windows.
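As a concrete example of recovering a previously backed up state, the sketch below pulls an old revision of a single file out of the repository with `svn cat`. It is a minimal illustration, assuming it is run inside a working copy. The function names, file name and revision number are hypothetical. `svn log` on the file would show which revision numbers are available.

```python
import subprocess


def svn_cat_command(path, revision):
    """Build the 'svn cat' call that prints a file as it was at one revision."""
    return ["svn", "cat", "--revision", str(revision), path]


def recover_file(path, revision, output_path, run=subprocess.run):
    """Write the contents of path at the given revision to output_path."""
    # check=True makes a failed svn call raise instead of writing an empty file.
    result = run(svn_cat_command(path, revision), capture_output=True, check=True)
    with open(output_path, "wb") as recovered:
        recovered.write(result.stdout)
    return output_path


# Hypothetical usage: save revision 42 of a report next to the current copy.
# recover_file("docs/report.txt", 42, "docs/report.r42.txt")
```

Writing the old version to a separate file leaves the current working copy untouched, so the two can be compared before anything is overwritten.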

GR 3: Verifying Backup Integrity

Because I regularly check files in and out of the subversion repository, I have some assurance that the repository has not been corrupted.

Additionally, there is a weekly cron job that runs the 'svnadmin verify' command to check the repository for corrupted data.
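A weekly check of this kind could look roughly like the sketch below. It is only an illustration of wrapping `svnadmin verify`, not the actual cron job on this server. The repository path, script location and schedule are hypothetical.

```python
import subprocess


def verify_repository(repo_path, run=subprocess.run):
    """Run 'svnadmin verify' and return (ok, error_text)."""
    # --quiet suppresses the per-revision progress lines; errors still go to stderr.
    result = run(
        ["svnadmin", "verify", "--quiet", repo_path],
        capture_output=True,
        text=True,
    )
    # svnadmin exits with a non-zero status when verification fails.
    return result.returncode == 0, result.stderr.strip()


# Hypothetical crontab entry to run a script built on this at 03:00 every Sunday:
# 0 3 * * 0 /usr/local/bin/verify_repo.py
```

Cron normally mails any output of a job to its owner, so a script that prints the error text only when verification fails gives a quiet week when all is well and a message when it is not.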

GR 4: Automating the Backup

The mirroring of my webpages is handled by a daily cron job using rsync. Most of the webpages' source files are also in the subversion repository. My client computers run an rsync script on startup to synchronize their home directories with the copies on the server.

The checking in and out of files to the subversion repository is NOT automated. However, now that I am in the habit of checking all my changes back into the repository, I start to feel uncomfortable when lots of changes have accumulated without being checked in (kind of like the feeling I get driving without my seatbelt on).
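Although check-ins are manual here, a reminder of how much work is waiting to be checked in is easy to sketch. The example below is not part of the setup described on this page. It is a hypothetical helper that counts the entries reported by `svn status`, assuming the usual output layout where the first character of each line is the status code and the path follows the fixed-width status columns.

```python
import subprocess

# First-column codes from 'svn status' for local work that is not yet in the repository:
# added, deleted, modified, replaced, unversioned, missing.
PENDING_CODES = {"A", "D", "M", "R", "?", "!"}


def pending_changes(status_output):
    """Return the paths that 'svn status' reports as changed or unversioned."""
    pending = []
    for line in status_output.splitlines():
        if line and line[0] in PENDING_CODES:
            # Skip the status columns and keep the path that follows them.
            pending.append(line[7:].strip())
    return pending


def run_svn_status(working_copy):
    """Return the raw text printed by 'svn status' for a working copy."""
    result = subprocess.run(
        ["svn", "status", working_copy], capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical usage:
# print(len(pending_changes(run_svn_status("/home/user/projects"))), "changes not checked in")
```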

GR 5: Encrypting the Backup

Encryption of the backup drives has not been implemented yet. I'm currently testing various schemes for reliability and to understand the issues and technology before I implement it.

Earlier Backup Solutions

My earliest backup solution was just an external FireWire drive connected to my laptop, kept up to date with rsync. This method only really met GR 1.

The next versions used SUSE Linux. Again, file backup was accomplished via rsync.

This setup had a couple of problems that I didn't like.

The BSDs really impressed me with their documentation and their integrated, consistent approach to the whole system. The FreeBSD Handbook is very well written and complete. FreeBSD is currently the only BSD which supports FireWire drives, and that was the feature that caused me to choose it, although I really like the paranoid approach of OpenBSD.

Copyright Roger Cortesi 2006