I'm a bit obsessed with backing up my computer files. Over the years, I have had a couple of hard drives fail. Fortunately, I have only lost a small amount of irreplaceable data, but I have come close to losing vast amounts of work forever.
Recognizing how lucky I have been so far, I set up my own backup server. My backup mechanism has evolved incrementally to become a more complete and reliable solution. In the process I have learned a bit about system administration.
Setting up this backup system led me to realize that, with only a couple of users, the backup box will be idle most of the time. This led me to consider ways to make a super low power version of a subversion server (pdf).
At the start I specified several requirements which my backup solution should meet. As my solution has evolved I have been able to meet more of these requirements. My current solution accomplishes all but one.
Currently the server is an old 300 MHz Dell running FreeBSD. Files are backed up via two mechanisms.
If you are not familiar with version control in general, or subversion in particular, chapters 1 and 2 of the Subversion Book are excellent reading.
The mirrors of webpages and client computers are kept updated with rsync (run via shell scripts and cron jobs).
Hardware failures on my laptop and other computers are handled by having the backup on a dedicated server. If the client computer fails then I just recover the needed files from the backup server.
The backup data on the server is stored on a pair of drives which are in a RAID 1 (mirroring) configuration, so a failure of a single data drive does not cause a loss of the backup. Whenever discussing backup and RAID, it is worth remembering that... RAID ALONE IS NOT A BACKUP! RAID does not address data loss due to user errors or data corruption.
By being able to recover to some number of previously backed up states (not just the most recent), data can be recovered if a user changes or deletes a file by mistake. This is the point of placing files under version control. Version control software tracks the changes made to files and allows many things, including:
Again, chapters 1 and 2 of the Subversion Book are an excellent reference on the topic.
I selected subversion as my version control software. It is free, well documented, robust, and will work across almost all operating systems. Additionally, TortoiseSVN is an excellent subversion client application for Windows. Because I regularly check files in and out of the subversion repository, I have some assurance that the repository has not been corrupted.
Additionally, there is a weekly cron job that runs the 'svnadmin verify' command to check the repository for corrupted data.
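As a rough illustration of what such a weekly check could look like, here is a minimal Python sketch that wraps 'svnadmin verify'. It is not the script actually used on the server. The function names and the `runner` parameter are hypothetical, and the sketch assumes that `svnadmin` is on the PATH and signals a problem with a non-zero exit status.

```python
import subprocess


def build_verify_command(repo_path):
    """Return the argument list for checking one repository."""
    # --quiet suppresses the per-revision progress lines; errors are still reported.
    return ["svnadmin", "verify", "--quiet", repo_path]


def verify_repository(repo_path, runner=subprocess.run):
    """Run 'svnadmin verify' and return (ok, error_text).

    'runner' is a hypothetical hook so the check can be exercised
    without a real repository; by default it is subprocess.run.
    """
    result = runner(
        build_verify_command(repo_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()
```

A cron-driven wrapper could call `verify_repository()` with the repository path and mail or log the error text whenever the first returned value is False.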
The mirroring of my webpages is handled by a daily cron job using rsync. Most of the webpages' source files are also in the subversion repository. My client computers run an rsync script on startup to synchronize their home directories with the versions on the server.
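The rsync step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual startup script: the function names, the `runner` parameter, and the choice of the `--archive` and `--compress` options are mine. Whether to pass `--delete` is a judgment call, since it removes files from the mirror once they are gone from the client.

```python
import subprocess


def build_rsync_command(source_dir, destination, delete=False):
    """Build an rsync argument list that mirrors source_dir into destination."""
    command = ["rsync", "--archive", "--compress"]
    if delete:
        # Remove files from the mirror that no longer exist in the source.
        command.append("--delete")
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(destination)
    return command


def sync(source_dir, destination, delete=False, runner=subprocess.run):
    """Run rsync and return True when it exits with status 0.

    'runner' is a hypothetical hook for testing; it defaults to subprocess.run.
    """
    result = runner(build_rsync_command(source_dir, destination, delete))
    return result.returncode == 0
```

For example, `sync("/home/me", "backupbox:/mirror/me/")` would mirror a home directory to a server, where both the path and the host name `backupbox` are placeholders.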
Checking files in and out of the subversion repository is NOT automated. However, now that I have gotten in the habit of checking all my changes back into the repository, I start to feel uncomfortable when lots of changes have accumulated without being checked in (kind of like the feeling I get driving without my seatbelt on).
Encryption of the backup drives has not been implemented yet. I'm currently testing various schemes for reliability, and to understand the issues and technology before I implement it.
My earliest backup solution was just rsync and an external FireWire drive connected to my laptop. But this method only really met GR1.
The next versions used SUSE Linux. Again, file backup was accomplished via rsync.
This setup had a couple of problems that I didn't like.
The BSDs really impressed me with their documentation and their integrated, consistent approach to the whole system. The FreeBSD Handbook is very well written and complete. FreeBSD currently is the only BSD which supports FireWire drives, and that was the feature that caused me to choose it, although I really like the paranoid approach of OpenBSD.
Copyright Roger Cortesi 2006