Backing up

From Ganfyd


There are two kinds of people: those who have lost data and those who are going to.

Anybody who has lost hours of work in a computer crash will know that the more important the data, the more important it is to have it saved and backed up.

This is easy enough to do for a few data files, but a more organised approach is needed to back up an entire harddisk including system files.

Principles

Backups should be:

  • Done on a regular basis (ideally scheduled).
  • Redundant:
    • Performed in duplicate or triplicate, if data is important.
    • Stored in a geographically different place (in case of fire, theft, etc.).
  • Progressive, to avoid obsolescence (e.g. 5.25" floppy disks are now unreadable in most modern computers) and also to avoid eventual device failure (e.g. an old harddisk will eventually fail).

It can be acceptable to back up only data files if you have the original install disks for your operating system and do not have much extra software installed. Otherwise, some method of backing up your system files and software is important. Some of the software described below can back up an entire 'image' of your harddisk, including the operating system and installed software. As some operating systems deteriorate with continued use, it can be useful to take a backup at a point when the system is stable; as performance deteriorates, you can then restore to that previously stable backup point.




Methods

Online Storage

Several sites offer online storage. Some are free but, as in the rest of life, you get what you pay for. The more expensive sites provide more space on a reliable server with good bandwidth. However, you are at the mercy of the provider, and retrieving data may be difficult if your whole system has crashed and you are therefore unable to access the internet. If you are mainly interested in backing up photographic images, there are many online albums which allow you to upload and print photographs.

Other sites provide a backup service for a fee (either by bandwidth or according to total storage). Many require download and installation of a small piece of software which continuously uploads data in the background to a remote web-site. The principles are similar to rsync below as only the differences between your source directory and the target directory are sent over the internet.

CD or DVD

Data can be backed up to CD or DVD (of the R or RW variety). A DVD-R can store as much as 4.7 GB, but if using CDs, the number of discs involved will be a big hassle. There is also the difficulty of keeping track of the backups, as there may be several different versions across several discs if you back up regularly. This approach can work if you are backing up data files only. Most of the writable and rewritable technology is based on light- or heat-sensitive dyes, and there are concerns about their longevity.

USB Sticks & Keydrives

Most USB sticks and keydrives are of limited capacity (at most 1-2 GB, and normally smaller). They are best suited to backing up data files only, as they are more expensive per megabyte than other mass storage such as external harddisks.

As flash memory has become more affordable, fully solid-state media devices are now available. Flash memory is often faster and physically more robust than conventional harddisks. However, each read cycle reduces the storage charge and each write cycle shortens the life of the storage cell (well under 100,000 writes to a memory cell with NAND technology). Ideally such disks therefore need a transaction-based file system, with additional overheads, that provides read-degradation monitoring and cell refresh, dynamic and static wear-leveling (e.g. moving files that are only ever read to areas of the disk that have already been written many times, to even out the time to failure across the disk) and techniques to avoid file fragmentation. Using USB sticks as additional memory with a high-churn file system, as used by default in most PC operating systems, or with an asynchronous file system can be a recipe for a very short USB stick lifespan. Indeed, compared with a filesystem designed for hard disks, a filesystem designed for flash memory can extend the lifetime of the device from days to decades.

External Harddisks


Being relatively cheap, this is the ideal storage option. Rather than installing the harddisk directly into the computer each time you want to back up, the easiest alternative is to use an external caddy to house the harddisk. Caddies consist of a casing to house the harddisk together with electronics that allow the harddisk to connect to the computer through USB or IEEE 1394 (FireWire). When connected, the computer treats the external drive as an additional harddisk. More recent devices include network support, which allows the harddisk to be attached to a network via a router and therefore to be accessible by more than one person.

There are 2 main (physical) sizes of consumer harddisks:

  • 3.5 inch : Normally used as internal harddisks in desktops, they can be used externally by using a caddy. The caddies are cheap, but require an external power supply. 3.5" harddisks are bulkier and heavier, but available in very large capacities and are overall cheaper per megabyte.
  • 2.5 inch : Used in laptops. 2.5" harddisks and their caddies are slimmer and lighter, but more expensive. The advantage of the 2.5" caddies is that most are able to run using power direct from the computer (via USB or IEEE 1394).

For very important data, consider using a RAID device.

Software

The choice will depend on personal preference, operating system and cost.

Windows

Acronis True Image 
http://www.acronis.com/homecomputing/products/trueimage/
Allways Sync 
http://www.allwaysync.com/
SyncBack 
http://www.2brightsparks.com/syncback/syncback-hub.html
Genie Backup Manager 
http://www.genie-soft.com/products/gbm/default.html
Norton Ghost 
http://www.symantec.com/sabu/ghost/ghost_personal/
ImageMaker 
http://www.majorgeeks.com/ImageMaker_d3914.html
Powerquest Drive Image 
XX Copy 
http://www.xxcopy.com/index.htm
software bundled with external HDD 
e.g. Maxtor Onetouch II

Mac

The latest version of Mac OS X, Leopard, released in October 2007, includes a piece of software called Time Machine. This performs continuous, incremental backups in the background to a local hard disk (usually an external one which is dedicated to the purpose).

Other options include:

Silverkeeper 
http://www.lacie.com/silverkeeper/
iMsafe 
http://homepage.mac.com/sweetcocoa/
Carbon Copy Cloner (CCC) 
http://www.bombich.com/software/ccc.html
SuperDuper! 
http://www.shirt-pocket.com/SuperDuper

Linux

Linux distributions include tools for classic Unix backups, disk imaging, and remote encrypted transfers.

clonezilla 
http://clonezilla.sourceforge.net

This performs functions similar to Ghost and True Image above. It is open source and available as a live CD.

tar 
man tar

tar (Tape ARchive) is a straightforward but versatile program for stringing files together into a single archive file, which can be written to a tape or saved on any other medium. Compression can be added. tar also restores the files from the archive.
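
As a minimal sketch of the archive-and-restore cycle, the commands below create a gzip-compressed archive of a directory, list its contents and extract it somewhere else. The directory and file names are hypothetical and are created in a scratch directory purely for illustration; the -z option assumes a tar with built-in gzip support, such as GNU tar or BSD tar.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so nothing real is touched (hypothetical paths).
WORK=$(mktemp -d)
mkdir -p "$WORK/documents" "$WORK/restore"
echo "example data" > "$WORK/documents/notes.txt"

# c = create, z = compress with gzip, f = name of the archive file.
# -C changes into $WORK first, so the archive stores relative paths.
tar -czf "$WORK/documents.tar.gz" -C "$WORK" documents

# t = list the contents of the archive without extracting anything.
tar -tzf "$WORK/documents.tar.gz"

# x = extract; -C chooses the directory to restore into.
tar -xzf "$WORK/documents.tar.gz" -C "$WORK/restore"
```

Listing the archive after creating it is a cheap way to check that the backup contains what you expect before you rely on it.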

dd 
http://www.codepoets.co.uk/docs/system_imaging
http://www.mckeay.net/secure/2004/10/using_dd_to_clone_a_hd.html
rsync 
http://samba.org/rsync/
See also Wikipedia:rsync. Incremental backups, i.e. storage of more than one version of a file, are also possible via rdiff (based on the same rsync algorithm).
Unison 
http://www.cis.upenn.edu/~bcpierce/unison/
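
The dd entry in the list above copies a disk block by block. The sketch below shows the idea under a loud assumption: a small regular file stands in for the source disk, because the real device name (something like /dev/sdX) varies by system and a mistyped of= target can overwrite the wrong disk. The file names are hypothetical.

```shell
#!/bin/sh
set -eu

WORK=$(mktemp -d)

# Stand-in for a source disk: a 1 MiB file of random data (assumption;
# a real source would be a block device, whose name varies by system).
dd if=/dev/urandom of="$WORK/source.img" bs=1024 count=1024 2>/dev/null

# if = input file, of = output file, bs = block size in bytes.
# dd copies every block, used or not, so the image is as large as the source.
dd if="$WORK/source.img" of="$WORK/backup.img" bs=4096 2>/dev/null

# cmp exits with status 0 only if the two files are byte-for-byte identical.
cmp "$WORK/source.img" "$WORK/backup.img" && echo "image verified"
```

Restoring is the same command with if= and of= swapped, which is why it pays to double-check both arguments before pressing Enter.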

Using rsync to Perform Regular Backups

rsync is a very powerful tool, and would be the preferred option for day-to-day backups.

Regular rsync backups can be scheduled using the cron daemon. The best way to do this is to create a simple script:

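#!/bin/sh
# Mirror SOURCE into DESTINATION with rsync.

# Trailing slash: copy the contents of the directory, not the directory itself.
SOURCE=/path/to/sourcedir/
DESTINATION=/path/to/backupdir

echo "Running rsync to back up $SOURCE to $DESTINATION"

date
# -a preserves permissions, times and symlinks; --delete removes files from
# the destination that no longer exist in the source.
rsync -a --delete "$SOURCE" "$DESTINATION"
# Pass on rsync's exit status so a failed backup is not reported as a success.
exit $?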

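Note that the trailing slash after the source directory matters: with it, rsync copies the contents of the directory; without it, rsync creates the directory itself inside the destination. A trailing slash on the destination makes no difference. The script should be made executable by its owner (chmod u+x script.sh) and moved to a location where only root can access it.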

Next, set up a cron job to run the script:

sudo crontab -e -u root

Edit the file:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
30 4 * * * /path/to/script.sh

This will run the backup script at 04:30 every morning. cron emails the script's output to the user who picks up the system mail.

- For best results, make sure the destination directory is on a different hard disk from the source.
- You can edit the SOURCE and DESTINATION variables to do a backup over the internet to a remote machine using ssh. This is beyond scope here, but read "man rsync" for details.

Speed of Backup

This depends upon the bandwidth of the connections and any overhead of the controllers and protocols used. Backing up through a single controller can be slower than backing up between two controllers separated by a high-bandwidth bus. The time to mirror an average-sized hard disk to another has hardly changed over the years! A backup to a USB 2 hard drive will run at about 10 MB/s.
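
To turn a transfer rate into a rough duration, divide the amount of data by the rate. The sketch below does this in shell arithmetic; the 100 GB figure is an assumed example size rather than a measurement, and the rate is the approximate USB 2 figure mentioned in this section.

```shell
#!/bin/sh
# Estimate how long a full copy takes at a sustained transfer rate.
DISK_GB=100      # assumed amount of data to copy (hypothetical figure)
RATE_MB_S=10     # approximate USB 2 rate in MB/s

# 1 GB is taken as 1000 MB; integer arithmetic is good enough for an estimate.
SECONDS_NEEDED=$(( DISK_GB * 1000 / RATE_MB_S ))
HOURS=$(( SECONDS_NEEDED / 3600 ))
MINUTES=$(( SECONDS_NEEDED % 3600 / 60 ))

echo "Copying ${DISK_GB} GB at ${RATE_MB_S} MB/s takes about ${HOURS} h ${MINUTES} min"
```

With these assumed values the estimate comes to a little under three hours, which is one reason incremental tools such as rsync are attractive for routine backups.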

External Links