Back up your web server with rsync, mysqldump and tar
In this article I will demonstrate one way to back up a Debian-based web server, together with its MySQL databases.
The concepts shown here should easily adapt to work on most Linux distributions.
The tools we will use include rsync, mysqldump and tar.
Lastly we will set up a cron job to schedule the task to run at the same time every day.
Back up a database using mysqldump
mysqldump is a utility that makes light work of backing up MySQL databases; the resulting dump file can be restored through the mysql client.
Usage (to backup):
mysqldump -u [db_username] -p[pass] [db_database] > [db_database.sql]
The above will create a backup of db_database into the file db_database.sql, using the credentials db_username/pass. Note there is no space between -p and the password. If no password is provided (-p on its own), the user will be prompted.
You may have difficulty if you try to include a path with the .sql file when using shell redirection. In that case, mysqldump's -r (--result-file) option lets you specify the output path/filename directly:
mysqldump -u db_username -ppass db_database -r /home/fred/backups/db_forum.sql
Usage (restore):
mysql -u [db_username] -p[pass] [db_to_restore] < [backupfile.sql]
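As a sketch of the full round trip, the script below assembles both the backup and restore commands with a dated dump filename. The user, password, database name and path are hypothetical placeholders; the commands are only printed, so you can review them before pointing the script at a real server.

```shell
#!/bin/sh
# Hypothetical placeholders -- substitute your own values.
DB_USER="backup_user"
DB_PASS="secret"
DB_NAME="db_forum"
OUTFILE="$HOME/backups/${DB_NAME}-$(date +%Y-%m-%d).sql"

# Assemble (but do not run) the backup and restore commands,
# so the sketch can be inspected safely anywhere.
DUMP_CMD="mysqldump -u $DB_USER -p$DB_PASS $DB_NAME -r $OUTFILE"
RESTORE_CMD="mysql -u $DB_USER -p$DB_PASS $DB_NAME < $OUTFILE"

echo "$DUMP_CMD"
echo "$RESTORE_CMD"
```

Dating the dump filename in this way pairs nicely with the dated tar archives we create later.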
Copy data using rsync
rsync is a utility (with its own transfer protocol) designed for backing up and synchronising data. It can back up files to local directories or even remote servers.
To backup contents of /var/www/ to a directory named www-backup in the user's home directory:
rsync -av --delete --delete-excluded --exclude "tmp" /var/www/ ~/www-backup/
options explained:
-a (archive), a shortcut for -rlptgoD: recurse into directories, copy symlinks as symlinks, and preserve file attributes (permissions, timestamps, group, owner and device/special files)
-v (verbose), produce verbose output
--delete, delete files on the target that no longer exist in the source
--delete-excluded, go one step further and also delete files on the target that match the --exclude patterns
--exclude, specify files or directories to exclude from the backup
The first run of rsync takes the longest, as every file must be copied to the target. Subsequent runs are much quicker, because rsync copies only new and changed files.
To backup to a remote server (using ssh):
rsync -av --delete --delete-excluded -e ssh /var/www/ user@hostname:/home/user/www-backup/
Here, user and hostname specify the remote SSH connection, and the part after the colon (:) is the target directory on the remote server.
To restore from a backup, simply reverse the target and source parameters:
rsync -av --delete --delete-excluded -e ssh user@hostname:/home/user/www-backup/ /var/www/
Create a bz2 archive
To facilitate a daily backup, we can archive the target directory into a single file, and name it according to the system date.
tar -jcvf ~/www-backup-$(date +%Y-%m-%d).tar.bz2 ~/www-backup/
the command above creates a tar.bz2 file named with the system date, e.g. www-backup-2016-06-20.tar.bz2, containing the contents of ~/www-backup/
To list the contents of the tar.bz2 file:
tar -jtvf archivename.tar.bz2
To extract the contents of the tar.bz2 file (to the current directory):
tar -jxvf archivename.tar.bz2
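The three tar commands above can be exercised end to end. The sketch below uses a temporary directory (a made-up stand-in for your home directory) and tar's -C option to extract into a chosen restore directory rather than the current one.

```shell
#!/bin/sh
# Throwaway working directory standing in for ~.
WORK=$(mktemp -d)
mkdir -p "$WORK/www-backup"
echo "index" > "$WORK/www-backup/index.html"

# Create the dated archive, as in the backup step above.
# -C avoids embedding absolute paths in the archive.
tar -jcf "$WORK/www-backup-$(date +%Y-%m-%d).tar.bz2" -C "$WORK" www-backup

# List the archive contents.
tar -jtf "$WORK"/www-backup-*.tar.bz2

# Extract into a separate restore directory via -C.
mkdir "$WORK/restore"
tar -jxf "$WORK"/www-backup-*.tar.bz2 -C "$WORK/restore"
```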
Set up a shell script
We can write a shell script to automate the process.
#!/bin/sh
PATH=/usr/bin:/bin
LOG="www-backup-$(date +%Y-%m-%d).log"

# backup mysql database
echo "Backing up db_name to database.sql" >> ~/$LOG
mysqldump -u sql_user -psql_pass db_name -r /var/www/database.sql >> ~/$LOG 2>&1

# rsync to backup location
echo "Running rsync..." >> ~/$LOG
rsync -av --delete --delete-excluded /var/www/ ~/www-backup/ > ~/rsync.log 2> ~/rsync.err

# copy log so far into backup root
# (because we cannot include the output of tar in the .tar.bz2)
cp ~/$LOG ~/www-backup/$LOG
cp ~/rsync.log ~/www-backup/rsync.log
cp ~/rsync.err ~/www-backup/rsync.err

# archive backup to bz2 format (with timestamp)
echo "Running tar..." >> ~/$LOG
tar -jcvf ~/www-backup-$(date +%Y-%m-%d).tar.bz2 ~/www-backup/ > ~/tar.log 2> ~/tar.err
A few things to note:
- The script should be run as root or another user with sufficient privilege to access the required files.
- We use 2>&1 to redirect stdout and stderr to one file.
- When run as a cron job, the user's login scripts are not processed, so we need to set the PATH environment variable manually.
- The final output will be a tar.bz2 and a log file, both containing the system date in their filename.
Set up a cron job
Once you have tested the script and you know it's working, it's easy to set up a cron job to run it periodically.
Make sure you're logged in as the user that the script will be running under.
crontab -e
The user's cron configuration file will open. The configuration file consists of 6 columns separated by spaces. The columns are as follows:
(m): minutes past the hour
(h): hour of the day (use 24 hour time)
(dom): day of the month (1 .. 31)
(mon): month (1 .. 12)
(dow): day of the week (0 .. 6) starting Sunday
(command): command to run
Use an asterisk (*) to signify "any" for values we don't care about.
For example, to run our script at 6:00am every Monday:
# m h dom mon dow command
0 6 * * 1 /root/web-backup.sh
That's it, you now have an automated, regular backup.
Further reading:
How to Use rsync to Backup Your Data
- http://www.howtogeek.com/135533/how-to-use-rsync-to-backup-your-data-on-linux/
- http://www.tux.org/~tbr/rsync/
- http://lifehacker.com/5885392/automatically-back-up-your-web-site-every-night
Superuser article about backup with timestamp
How to set up cron
- http://www.debianhelp.co.uk/schedulejobs.htm
- http://askubuntu.com/questions/2368/how-do-i-set-up-a-cron-job
Running cron scripts without environment variables