Using LVM to make a live copy of a BackupPC pool

“How do I back up my BackupPC pool?” is perhaps the most common topic of discussion on the backuppc-users mailing list. BackupPC stores all files in a common compressed pool (cpool, although I’ll use simply “pool” for this discussion), and maintains trees of hardlinks into the pool for each backup host. Therefore BackupPC requires a Linux/Unix filesystem. If you want to back up the BackupPC server itself, you must duplicate the pool and the hardlinks to it.

The problem with this design is that conventional duplication/archival tools like rsync and tar can’t be used to duplicate the cpool in a reasonable period of time. Any file with more than one hardlink must be tracked and restored correctly on the destination host or filesystem. Copying my 700 GB pool between two identical disks with rsync would take a minimum of 3 days. Copying the same amount of raw data would take only a few hours.
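To make the hardlink problem concrete, here is a minimal sketch in a scratch directory. The file and directory names are made up for illustration and are not BackupPC’s real layout; it assumes GNU stat and find, as found on Linux.

```shell
# Hypothetical demo in a scratch directory; nothing here touches a real pool.
demo=$(mktemp -d)
echo "file contents" > "$demo/pool_file"

# Three "backups" that all reference the same pool file via hardlinks.
for n in 1 2 3; do
    mkdir "$demo/backup$n"
    ln "$demo/pool_file" "$demo/backup$n/file"
done

# Link count: one name in the pool plus three names in the backup trees.
links=$(stat -c %h "$demo/pool_file")
echo "link count: $links"

# All four names share one inode, so a file-level copier has to remember
# every inode it has seen in order to recreate the links on the other side.
inodes=$(find "$demo" -type f -printf '%i\n' | sort -u | wc -l)
echo "distinct inodes: $inodes"

rm -rf "$demo"
```

This prints a link count of 4 and a single distinct inode. A real pool holds millions of such files, and that bookkeeping is what makes file-level copies so slow.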

So we can’t use filesystem-aware tools to back up a BackupPC server’s pool. Instead, we must make a bit-for-bit copy of the disk image.

One suggestion I’ve seen is to make the cpool part of a RAID1 array of two disks. When you want to make a backup, pull one disk and replace it. Or make it a 3-disk mirror with a removable drive, and periodically swap it out and carry it elsewhere. This is a sound solution, but it is still sneakernet.

What I want is a solution that:

  1. Duplicates the entire BackupPC storage to a second server,
  2. in a matter of hours not days,
  3. over the network,
  4. such that the secondary server can quickly replace the primary server if it fails,
  5. without interrupting normal operations on the disk, including BackupPC itself.

To accomplish this, we build the BackupPC storage on LVM, and use LVM snapshots and netcat to copy the disk image.

My BackupPC server uses a 3ware hardware RAID card with two arrays:

  1. /dev/sda is a RAID1 of two disks, and provides the / and /var filesystems.
  2. /dev/sdb is a RAID5 array of 5 disks, and provides the physical volume for the backuppc logical volume and filesystem.

The server’s filesystem looks like this:

root@backuppc:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.2G  1.2G  8.0G  13% /
/dev/sda3              63G   21G   41G  34% /var
/dev/mapper/volgroup-backuppc
                      1.4T  711G  644G  53% /var/lib/backuppc

If you’re planning to install BackupPC on a new server, this is easy to prepare. First, install the latest Ubuntu LTS server (I’m still using 8.04 “Hardy Heron”) on /dev/sda, using / and /var partitions as above. It is possible to configure LVM through the installer UI, but I’m not going to cover that here.

Once the base OS is installed, log in via SSH and prepare the LVM setup. LVM is discussed in excellent detail elsewhere. I strongly urge you to read this How-To and play with LVM a little before building the final system.

All commands are run as root. Use sudo if you prefer.

  1. Create one partition of type 8e (Linux LVM) on the RAID disk.
    fdisk /dev/sdb

    Command (m for help): o
    Building a new DOS disklabel with disk identifier 0xf2704d76.
    Changes will remain in memory only, until you decide to write them.
    After that, of course, the previous content won't be recoverable.
    
    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 1
    First cylinder (1-243147, default 1):
    Using default value 1
    Last cylinder, +cylinders or +size{K,M,G} (1-243147, default 243147):
    Using default value 243147
    
    Command (m for help): t
    Selected partition 1
    Hex code (type L to list codes): 8e
    Changed system type of partition 1 to 8e (Linux LVM)
    
    Command (m for help): p
    
    Disk /dev/sdb: 1999.9 GB, 1999957393408 bytes
    255 heads, 63 sectors/track, 243147 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x92cf8945
    
    Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1          1      243147  1953078277   8e  Linux LVM
    
    Command (m for help): w
    The partition table has been altered!
    
    Syncing disks.
  2. Prepare the physical volume and volume group.
    pvcreate /dev/sdb1
    vgcreate volgroup /dev/sdb1
  3. Create a logical volume which is 75% of the available space. We’ll reserve the remaining space for the snapshot.
    lvcreate -l '75%VG' -n backuppc volgroup
  4. Create a filesystem on the logical volume. Some prefer XFS, but I’m happy with ext3 (with an eye to upgrading to ext4 later).
    mkfs.ext3 /dev/volgroup/backuppc

  5. Add the new logical volume to /etc/fstab.
    vi /etc/fstab
    Add a new line. We mount with noatime, since BackupPC doesn’t need access times and skipping those updates saves disk writes.

    /dev/volgroup/backuppc /var/lib/backuppc ext3 noatime 0 2
  6. Now mount it for the first time.
    mkdir /var/lib/backuppc
    mount /var/lib/backuppc
  7. Finally, install BackupPC. This will also install Apache2, Perl, and a mail server.
    apt-get install backuppc

You now have a working BackupPC system using LVM for storage. Now we create an LVM snapshot.

Why use LVM snapshots? Because we must take a copy of the filesystem as it exists at one moment in time. If we install directly on /dev/sdb1 and then try to copy the filesystem while it is mounted, the copy will be inconsistent. Some data blocks will represent the time we started copying, others the time we ended, or anything in between. There is a disk performance penalty to using snapshots, as this creates a copy-on-write fork of the filesystem from the moment the snapshot is taken. The best practice is to create the snapshot when the server is idle, copy it to the secondary server, and then remove it immediately after.

Create the snapshot using all remaining free space.
lvcreate -l '100%free' -s -n backuppcsnapshot /dev/volgroup/backuppc

If you are curious, you can now mount the logical volume /dev/volgroup/backuppcsnapshot and compare the difference between it and the original filesystem. But we don’t need to do that. We just need to copy it to our secondary server.

The secondary server should be configured exactly like the first. At minimum, it must have a partition or logical volume that is at least as large as /dev/volgroup/backuppc on the primary server. My secondary server has a logical volume that is identical to the primary, with no snapshot (but it has space for one, so I could copy back if I had to). To copy it, we’ll use that venerable hacker’s tool, netcat.
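It is worth checking the size requirement before overwriting anything. The sketch below is only an illustration: fits is a hypothetical helper, and the byte counts are made-up stand-ins for what blockdev --getsize64 would report on each server.

```shell
# fits SRC_BYTES DST_BYTES: succeed if the destination can hold the source image.
fits() {
    [ "$2" -ge "$1" ]
}

# On real systems the sizes would come from (as root, on each server):
#   blockdev --getsize64 /dev/volgroup/backuppc
# The values below are made-up byte counts, purely for illustration.
src=1500000000000
dst=1500000000000

if fits "$src" "$dst"; then
    echo "destination is large enough"
else
    echo "destination is too small"
fi
```

If the destination is smaller, dd on the secondary will run out of space partway through and leave a truncated, unusable filesystem image.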

Using netcat to clone disks is documented elsewhere. However, you should use netcat-openbsd (the version currently in hardy is 1.89), not netcat-traditional. In my tests, only netcat-openbsd reliably closed the network connection when it reached the end of the filesystem.

apt-get install netcat-openbsd

Now we copy the data. We have to prepare to receive it first. On the secondary server, run:
netcat -l 2222 | dd of=/dev/volgroup/backuppc

Then we send it. On the primary:
dd if=/dev/volgroup/backuppcsnapshot | netcat 192.168.0.2 2222

This reads the logical volume block for block and sends it to the secondary server on 192.168.0.2, port 2222/tcp. The secondary server receives and writes it to the logical volume.
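Once the copy finishes, you may want to confirm the two images match. The sketch below is not part of my own procedure. It shows the idea on small scratch files standing in for the two logical volumes, and assumes sha256sum and GNU stat are available.

```shell
# Stand-ins for the two logical volumes: small scratch files, not real devices.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/source.img" bs=1M count=4 2>/dev/null
# The destination is larger than the source, as a bigger secondary volume would be.
dd if=/dev/zero of="$work/dest.img" bs=1M count=6 2>/dev/null

# The copy itself; conv=notrunc keeps the destination's size, like a block device.
dd if="$work/source.img" of="$work/dest.img" bs=1M conv=notrunc 2>/dev/null

# Compare only the first N bytes of the destination, where N is the source size.
size=$(stat -c %s "$work/source.img")
src_sum=$(sha256sum < "$work/source.img" | cut -d' ' -f1)
dst_sum=$(head -c "$size" "$work/dest.img" | sha256sum | cut -d' ' -f1)

if [ "$src_sum" = "$dst_sum" ]; then
    echo "copy verified"
else
    echo "MISMATCH"
fi
rm -rf "$work"
```

On the real servers the same comparison would read the snapshot on the primary and the first N bytes of /dev/volgroup/backuppc on the secondary, with N taken from blockdev --getsize64 on the snapshot. That means a second full read of the volume on both sides, so it is a trade-off between certainty and time.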

Why use dd instead of redirecting files? Because you can easily check progress:

root@backuppc:~# ps -ef | grep dd
root      4958  4495  0 18:39 pts/0    00:00:03 dd if=/dev/volgroup/backuppcsnapshot
root@backuppc:~# kill -USR1 4958

On the terminal with netcat/dd running, you’ll see this.

126689+0 records in
126689+0 records out
64864768 bytes (65 MB) copied, 9.97271 s, 6.5 MB/s
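The record counts in that output also tell you the block size dd is using. A quick calculation with the numbers shown above:

```shell
# Figures taken from the dd progress output above.
bytes=64864768
records=126689
block_size=$((bytes / records))
echo "block size: $block_size bytes"
```

This prints 512, which is dd’s default block size. Passing a larger one, for example bs=1M on both the sending and receiving dd, is a common tweak that may improve throughput. I have not measured it on this setup, so treat it as something to test rather than a guaranteed gain.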

This netcat step is not secure: anyone who connects to port 2222 on the secondary server before the primary does can send data, and it will be written to the disk image. Firewall this port so that the secondary will only accept a connection from the primary, and vice versa. Alternatively, you could use SSH, which authenticates and encrypts the transfer and needs only one command, but the encryption overhead may slow the process considerably. Note that the -C flag in the command below also enables SSH compression; consider dropping it on a fast LAN. On the primary:

dd if=/dev/volgroup/backuppcsnapshot | ssh -C root@192.168.0.2 dd of=/dev/volgroup/backuppc
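If you stay with plain netcat, the firewalling mentioned above could look something like the following on the secondary. This is a sketch only: print_fw_rules is a hypothetical helper, and 192.168.0.1 is an assumed address for the primary. The helper only prints the iptables commands so you can review them first.

```shell
# print_fw_rules PRIMARY_IP PORT: emit iptables commands for the secondary.
# Nothing is applied here; the rules are only printed for review.
print_fw_rules() {
    # Accept connections to the port from the primary only...
    echo "iptables -A INPUT -p tcp -s $1 --dport $2 -j ACCEPT"
    # ...and drop everyone else trying to reach the same port.
    echo "iptables -A INPUT -p tcp --dport $2 -j DROP"
}

# 192.168.0.1 is an assumption; substitute your primary server's address.
print_fw_rules 192.168.0.1 2222
```

To apply them, run the printed commands as root on the secondary. Because -A appends to the chain, an earlier rule that already accepts the traffic would take precedence, so check the existing rules with iptables -L first.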

You may wish to use bzip2 to compress the data before it is sent and decompress it on arrival. Using compression will save bandwidth at the cost of CPU usage. In my tests, a raw copy gave sustained speeds of 200 Mbit/s over gigabit Ethernet. Using bzip2, throughput was reduced to 20 Mbit/s. Therefore I do not recommend using compression if copying over your own gigabit LAN.
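To see what those rates mean in practice, here is a rough estimate of the transfer time. The 1500 GB figure is an assumption (about 75% of the 2 TB array, in decimal gigabytes), and copy_hours is a hypothetical helper. Remember that dd copies the whole logical volume, not just the space in use.

```shell
# copy_hours SIZE_GB RATE_MBIT: hours needed to move SIZE_GB (decimal GB)
# at a sustained RATE_MBIT megabits per second.
copy_hours() {
    LC_ALL=C awk -v gb="$1" -v mbit="$2" \
        'BEGIN { printf "%.1f\n", gb * 8000 / mbit / 3600 }'
}

echo "raw:   $(copy_hours 1500 200) hours"
echo "bzip2: $(copy_hours 1500 20) hours"
```

With these assumptions the raw copy comes to about 16.7 hours and the bzip2 copy to about 166.7 hours, which is roughly a week.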

As a last step, archive the BackupPC configuration whenever you clone the storage pool. Since you don’t want the secondary to start running backups, store the copy in a separate directory rather than over the secondary’s live configuration. On the secondary:

mkdir /etc/backuppc/backup/

On the primary, before copying the snapshot:

rsync -av --delete /etc/backuppc/ root@192.168.0.2:/etc/backuppc/backup/
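Putting the pieces together, one clone cycle on the primary might be wrapped up as below. This is a sketch under assumptions, not a script I run in production: the run and clone_cycle helpers and the DRY_RUN switch are made up for illustration, and lvremove -f is the standard LVM command for the “remove it immediately after” step. The secondary must already be listening with netcat before the copy starts.

```shell
#!/bin/sh
# Sketch of one clone cycle. DRY_RUN=1 (the default here) only prints the
# commands; set DRY_RUN=0 and run as root to actually execute them.
VG=volgroup
LV=backuppc
SNAP=backuppcsnapshot
DEST=192.168.0.2
PORT=2222

# run COMMAND: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        sh -c "$*"
    fi
}

clone_cycle() {
    # Archive the configuration first, as described above.
    run "rsync -av --delete /etc/backuppc/ root@$DEST:/etc/backuppc/backup/"
    # Freeze a point-in-time view of the pool; stop if that fails.
    run "lvcreate -l 100%FREE -s -n $SNAP /dev/$VG/$LV" || return 1
    # Stream the snapshot to the listening secondary.
    run "dd if=/dev/$VG/$SNAP | netcat $DEST $PORT"
    status=$?
    # Remove the snapshot even if the copy failed, to end the
    # copy-on-write penalty on the live volume.
    run "lvremove -f /dev/$VG/$SNAP"
    return $status
}

clone_cycle
```

Run it in dry-run mode first and read the printed commands. Only set DRY_RUN=0 once the volume names and addresses match your own systems.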

We now have a complete BackupPC server with LVM storage and a means of completely backing up to a secondary server, over the network, without interrupting the operation of the primary. But what if you have an existing BackupPC pool directly on a disk partition? You’ll need to migrate it to LVM without losing the data. I’ll discuss how to do that in my next post.
