Backing up a living system

The tools and strategies you use to back up files that are not being accessed won’t work when you copy data that is currently in use by a busy application. This article explains the danger of employing common Linux utilities to back up living data and examines some alternatives.

Tools to make backups of your files are plentiful, and the Internet is full of tutorials explaining how to use them. Unfortunately, most entry-level blogs and articles assume the user just wants to back up a small set of data that remains static throughout the backup process.

Such an assumption is an acceptable approximation in most cases. After all, a user who is copying a folder full of horse pictures to backup storage is unlikely to open an image editor and start modifying the files at random while they are being transferred. On the other hand, real-life scenarios often require backups to occur while the data is being modified. For instance, consider the case of a web application that is under heavy load and must continue operating during the backup.

Pitfalls of Common Tools

Traditional Unix utilities such as dd, cpio, tar, or dump are poor candidates for taking a snapshot of a folder full of living data, because they read the data piece by piece while the application keeps changing it. Some of these tools are worse than others.
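To make the danger concrete, here is a minimal Python sketch of what can go wrong when a file is copied in pieces while an application is still writing to it. Everything in it is hypothetical: the two-record file layout, the RECORD_SIZE constant, and the function names are invented for illustration and do not come from any real application or backup tool. The application always keeps both records equal, yet the copy ends up holding two different values, a state that never existed on disk.

```python
import os
import tempfile

RECORD_SIZE = 8  # hypothetical fixed-width record, in bytes


def write_records(path, value):
    """Application update: set both records to the same value."""
    record = f"{value:0{RECORD_SIZE}d}".encode()
    with open(path, "r+b") as f:
        f.write(record)  # record 1
        f.write(record)  # record 2


def torn_copy(path):
    """Piecewise copy with an application update landing between two reads."""
    # buffering=0 makes each read() hit the file directly, so the second
    # read sees whatever the application has written in the meantime.
    with open(path, "rb", buffering=0) as src:
        first = src.read(RECORD_SIZE)   # backup reads record 1 (old value)
        write_records(path, 2)          # application updates both records
        second = src.read(RECORD_SIZE)  # backup reads record 2 (new value)
    return first, second


def demo():
    """Create a scratch file, run the torn copy, and clean up."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0" * (2 * RECORD_SIZE))
        os.close(fd)
        write_records(path, 1)  # consistent state: both records hold 1
        return torn_copy(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    first, second = demo()
    print("record 1 in backup:", first)
    print("record 2 in backup:", second)
```

In this sketch the update is forced to land between the two reads so that the result is reproducible. On a real system the timing is up to chance, which is what makes this kind of corruption hard to notice.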


