Moving your Linux root partition to software RAID

One of the reasons I started this blog is to write about problems I’ve solved that I can’t find answers to already on the web. Today, let’s talk about moving your linux install to linux software raid (md raid / mdadm). This post assumes you are running Ubuntu Linux 8.04, but any Debian-based distro from the past two years, or most other distros, will have similar commands.

We start with an install on a single 80 GB SATA drive, partitioned as follows:

/dev/sda1 as /, 10GB, ext3, bootable
/dev/sda2 as swap, 1GB
/dev/sda3 as /home, 69GB, ext3

We want to add a second 80GB SATA drive and move the entire install to use RAID1 between the two drives. So the final configuration will appear:

/dev/md0 as /, 10GB, ext3
/dev/md1 as swap, 1GB
/dev/md2 as /home, 69GB, ext3

Where the raid arrays are:

md0 : active raid1 sda1[0] sdb1[1]
md1 : active raid1 sda2[0] sdb2[1]
md2 : active raid1 sda3[0] sdb3[1]

Here there be dragons. As always, back up your data first. If you don’t know how to use rsync, now is an excellent time to learn.

The general plan is:

  1. Partition the new drive
  2. Create RAID arrays and filesystems on the new drive
  3. Copy the data from drive 1 to the new RAID arrays on drive 2
  4. Install grub on drive 2
  5. Configure fstab and mdadm.conf, and rebuild initramfs images
  6. Reboot on the RAID arrays on drive 2
  7. Repartition drive 1 and add it to RAID

All commands are run as root. Use sudo if you prefer.

Step 1: Partition the new drive

Assuming you want to partition the second drive the same way as the first, this is easy. Just clone the partitions from /dev/sda to /dev/sdb:

sfdisk -d /dev/sda | sfdisk /dev/sdb

You may want to format the new disk first to clear it of old data, especially if it previously had software RAID partitions on it. You may get unusual results if it has a similar partition structure or RAID setup as the original disk.

Now use parted to mark the partitions for software RAID, with the first partition to boot:

parted /dev/sdb
(parted) toggle 1 raid
(parted) toggle 2 raid
(parted) toggle 3 raid
(parted) toggle 1 boot
(parted) print

Disk /dev/sdb: 80.0GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system  Flags
 1      0.51kB  10.0GB  10.0GB  primary  ext3         boot, raid
 2      10.0GB  11.0GB  2000MB  primary  linux-swap   raid
 3      11.0GB  80.0GB  69.0GB  primary  ext3         raid
(parted) quit

Parted will show you a file system in each partition, but the reality is that they will be plain linux software raid partitions (type 0xfd):

fdisk -l /dev/sdb

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xa3181e57

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        1216     9765625   fd  Linux RAID autodetect
/dev/sdb2            1216        1338      976562+  fd  Linux RAID autodetect
/dev/sdb3            1338        9730    67408556   fd  Linux RAID autodetect

Step 2: Create RAID arrays and filesystems on the new drive

Now we create RAID 1 arrays on each partition. These arrays will have just one member each when we create them, which isn’t normal for RAID 1. So we’ll have to force mdadm to let us:

apt-get install mdadm
mdadm --create /dev/md0 --level=1 --force --raid-devices=1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --force --raid-devices=1 /dev/sdb2
mdadm --create /dev/md2 --level=1 --force --raid-devices=1 /dev/sdb3

Next, you’ll need to create filesystems on the new RAID devices. Assuming the same formats as your first partition:

mkfs.ext3 /dev/md0
mkswap /dev/md1
mkfs.ext3 /dev/md2

Step 3: Copy the data from drive 1 to the new raid arrays on drive 2

This is a job for rsync. First, there are some directories on a running Linux system that we do not want to copy, like /dev and /proc. We also want to ignore tempfs directories, like /var/run. The best way to avoid these is to make an excludes file. Create a file, /root/rsync-filter, with the following content:

- /dev/
- /lib/modules/*/volatile/
- /mnt/
- /proc/
- /sys/
- /var/lock/
- /var/run/

These lines define directories we will not copy over. You may wish to add /tmp, apt’s cache, etc, but if you do you must manually create the directories themselves on the new filesystem.

Mount the new RAID array:

mount /dev/md0 /mnt
mkdir /mnt/home
mount /dev/md2 /mnt/home

If you are using a different mount structure, just be sure to recreate it and mount it in the right places in the new filesystem under /mnt/.

And start the rsync copy:

rsync -avx --delete -n --exclude-from /root/rsync-filter / /mnt/
rsync -avx --delete -n /home/ /mnt/home/

You will see a list of files that will be changed, but nothing actually happens. This is the job of the -n argument, which performs a dry-run. Always do this before actually starting a copy. You WILL make a painful mistake with rsync some day, so learn to be cautious. Repeat the above commands without the -n when you are sure all is well.

The -x argument ensures that you will not cross filesystem boundaries, which is why you must copy /home separately, and any other mounted filesystems. If you omit this you only need one command. But make sure you have a good rsync filter file, and that you have nothing mounted like /media/cdrom that you don’t want an archive of.

Finally, create the directories that you skipped with your filter:

cd /mnt/
mkdir -p dev/ mnt/ proc/ sys/ var/lock/ var/run/
for i in /lib/modules/*/volatile ; do mkdir -p /mnt/$i ; done

Step 4: Install grub on drive 2

OK, we almost have a working RAID install on the second drive. But it won’t boot yet. Let’s use chroot to switch into it.

mount --rbind /dev /mnt/dev
mount --rbind /proc /mnt/proc
chroot /mnt

Now we have a working /dev and /proc inside the new RAID array, and by using chroot we are effectively in the root of the new array. Be absolutely sure you are in the chroot, and not the real root of drive 1. Here’s an easy trick: make sure nothing is in /sys:

ls /sys

If not, you’re in the new chroot.

Now we just need to edit grub’s device.map and menu.list files. Edit /boot/grub/device.map and make sure both drives are listed:

(hd0)   /dev/sda
(hd1)   /dev/sdb

Now get the UUID of /dev/md0′s filesystem:

tune2fs -l /dev/md0 | grep UUID

Filesystem UUID:          8f4fe480-c7ab-404e-ade8-2012333855a6

Edit /boot/grub/menu.list and find the line like this:

# kopt=root=UUID=9e299378-de65-459e-b8b5-036637b7ba93 ro

Replace the UUID with the one you just found. Leave the line commented, and save the file. Now rebuild menu.lst to use the new UUID:

update-grub

Double-check that each boot option at the bottom of menu.lst is using the right UUID. If not, edit them too. Finally, install grub on the second drive.

grub-install /dev/sdb

Step 5: Configure fstab and mdadm.conf, and rebuild initramfs images

We’re almost ready to reboot. But first, we need to build an initramfs that is capable of booting from a RAID array. Otherwise your boot process will hang at mounting the root partition. Still in the chroot, edit /etc/fstab and change the partition entries to the new filesystems or devices.

If your /etc/fstab has “UUID=” entries like the following, change them to the new entries:

proc            /proc           proc    defaults        0       0
UUID=8f4fe480-c7ab-404e-ade8-2012333855a6 /               ext3    relatime,errors=remount-ro 0       1
UUID=c1255394-4c42-430a-b9cc-aaddfb024445 none            swap    sw              0       0
UUID=94f0d5db-c525-4f52-bdce-1f93652bc0b1 /var/           ext3    relatime 0       1
/dev/scd0       /media/cdrom0   udf,iso9660 user,noauto,exec,utf8 0       0

In the above example, the first UUID cooresponds to /dev/md0, the second to /dev/md1, and so on. Find the UUIDs with:

/lib/udev/vol_id /dev/md0

If your /etc/fstab has “/dev/sda1″ entries like it’s a bit easier. Just change them to /dev/md0 and so on:

proc            /proc           proc    defaults        0       0
/dev/md0        /               ext3    relatime,errors=remount-ro 0     1
/dev/md1        none            swap    sw              0       0
/dev/md2        /var/           ext3    relatime        0       1
/dev/scd0       /media/cdrom0   udf,iso9660 user,noauto,exec,utf8 0       0

Now, while still in the chroot, edit /etc/mdadm/mdadm.conf to list each RAID array:

DEVICE partitions
ARRAY /dev/md0 UUID=126a552f:57b18c5b:65241b86:4f9faf62
ARRAY /dev/md1 UUID=ca493308:7b075f97:08347266:5b696c99
ARRAY /dev/md2 UUID=a983a59d:181b32c1:bcbb2b25:39e64cfd
MAILADDR root

Find the UUID of each RAID array, which is not the same as the UUID of the filesystem on it (!), using mdadm:

mdadm --detail /dev/md0 | grep UUID

Now, rebuild your current kernel’s initramfs image:

update-initramfs -u

Or all of them:

update-initramfs -u -k all

Step 6: Reboot on the RAID arrays on drive 2

Now we’re ready to reboot. First, exit the chroot and power off the machine cleanly. You have three options:

  1. If your BIOS allows you to select which drive to boot from, elect to boot from drive 2.
  2. Swap drives 1 and 2 so drive 2 becomes /dev/sda, and restart
  3. Use a USB recovery stick to boot from drive 2

When the system restarts, you should reboot on the new RAID drive. Make sure:

df -h

Filesystem            Size  Used Avail Use% Mounted on
/dev/md0              9.2G  1.9G  6.9G  21% /
varrun                498M   56K  498M   1% /var/run
varlock               498M     0  498M   0% /var/lock
udev                  498M   68K  498M   1% /dev
devshm                498M     0  498M   0% /dev/shm
lrm                   498M   39M  459M   8% /lib/modules/2.6.24-19-generic/volatile
/dev/md2               63G  130M   60G   1% /home

Step 7: Repartition drive 1 and add it to RAID

Finally, we add the old drive into the array. Assuming you chose option 1 and didn’t swap the drive’s cables:

sfdisk -d /dev/sdb | sfdisk --force /dev/sda

If you receive a warning message here, reboot now.

Re-reading the partition table ...
BLKRRPART: Device or resource busy
The command to re-read the partition table failed
Reboot your system now, before using mkfs

If not, continue on:

mdadm /dev/md0 --grow -n 2
mdadm /dev/md1 --grow -n 2
mdadm /dev/md2 --grow -n 2
mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md1 --add /dev/sda2
mdadm /dev/md2 --add /dev/sda3

If you chose option 2 and your drives now have a different ordering, swap “/dev/sda” and “/dev/sdb” everywhere above.

The RAID array will now rebuild. To check its progress:

cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[1] sda1[0]
      9767424 blocks [2/2] [UU]

md1 : active raid1 sdb1[1] sda1[0]
      1951808 blocks [2/2] [UU]

md2 : active raid1 sdb2[2] sda2[0]
      68693824 blocks [2/1] [U_]
      [>....................]  recovery =  0.4% (16320/68693824) finish=22.4min speed=8160K/sec

unused devices:

Once the array is finished rebuilding, reinstall grub on the new drive. We’ll do both, for good measure.

grub-install /dev/sda
grub-install /dev/sdb

You should now be able to reboot without either drive, and your system will come up cleanly. If you ever need to replace a failed drive, remove it, use step 7 above to clone the partition scheme to the new drive and add it to the array.

Gotchas:

If you are having issues with your RAID setup, especially if you have an older RAID setup or older release already, you might need these packages:

apt-get install lvm2 evms dmsetup

Tags: ,

  1. Aha – that fixed it! Thanks for your post! Saved me many headaches…

    Reply

  2. sfdisk -d /dev/sda | /dev/sdb
    doesn’t work for me on CentOS 5.2
    sfdisk -d /dev/sda | sfdisk /dev/sdb
    works as expected

    Reply

  3. Before mounting md0 and md2 and copying files to them
    they may need to be formatted (right after step 2)
    mkfs.ext3 /dev/md0; mkfs.ext3 /dev/md2

    For impatient copying in step 3 can be shortened to:
    cp -ax / /mnt
    cp -ax /home /mnt/home
    This will preserve directories used for mounting like mnt, sys, proc, cdrom, …

    Reply

  4. Fixed the sfdisk line, Wes. Thanks.

    Regarding the cp, that assumes that the directories used for mounting like /dev, /mnt, /var/lock, etc are tmpfs. That is true for Ubuntu but not all distributions. Just running the cp -ax like that may copy the contents of those directories. Hence the cautious rsync approach.

    Reply

  5. Same setup (Ubuntu 8.04), but my problem for the past few days was 2-fold:

    1st at the end of step 4, grub-install didn’t work (hung), so I manually ran grub and entered:
    device (hd1) /dev/sdb
    root (hd1,0)
    setup (hd1)
    quit

    worked, but then on reboot at step 7 and selecting the second drive as the boot drive through the BIOS, the boot process would hang, dropping me to initramfs. At this point, I could mount /dev/md0 by

    mdadm -A /dev/md0 /dev/sdb1

    for each RAID partition, then exit from initramfs. Everything would boot then. After too many attempts at resolving this, I found that the problem was resolved by allowing the boot sequence to drop to initramfs, and enter

    mdadm -As –homehost=” –auto=yes –auto-update-homehost

    exit initramfs and allow the boot to complete. Reboot once (or twice), and everything works. From what I understand (noob here), when the RAID1 was created, the homehost= in mdadm.conf used by mdadm during the boot process didn’t match what was in the properties written to the RAID partition when it was created resulting in arrays not being mounted automatically. Several days of debugging were summarized in that simple line…

    That, and to read about the problem with degraded RAID arrays here:
    https://help.ubuntu.com/community/DegradedRAID

    Reply

  6. Marvelous article. Compact and with rich information.
    Unfortunately I have the most weird problem (in 8.04 server):
    After creating the raid arrays, rebooting successfully growing the raid (1 for me) arrays to include the second (sdb1, sdb2 etc…) devices, the system cannot boot (md devices are stopped during boot). It might have something to do with the order the modules are loaded (the sda and sdb sit in different controllers with different drivers), but it might have something to do with a mix in md arrays and sd devices. For example I have sda9-sdb9 that belong to md6 but the mdadm reports for the sda member:
    mdadm -E /dev/sda9
    /dev/sda9:
    Magic : a92b4efc
    Version : 00.90.00
    UUID : 61b6692d:3900807a:5e76b0e2:85a6fbc1 (local to host TransNetStreamer)
    Creation Time : Sat Jan 10 17:13:51 2009
    Raid Level : raid1
    Used Dev Size : 9735232 (9.28 GiB 9.97 GB)
    Array Size : 9735232 (9.28 GiB 9.97 GB)
    Raid Devices : 2
    Total Devices : 1
    Preferred Minor : 6

    Update Time : Sun Jan 11 20:15:26 2009
    State : clean
    Active Devices : 1
    Working Devices : 1
    Failed Devices : 1
    Spare Devices : 0
    Checksum : bb676a86 – correct
    Events : 0.760

    Number Major Minor RaidDevice State
    this 0 8 9 0 active sync /dev/sda9

    0 0 8 9 0 active sync /dev/sda9
    1 1 0 0 1 faulty removed
    while for the sdb member:
    mdadm -E /dev/sdb9
    /dev/sdb9:
    Magic : a92b4efc
    Version : 00.90.00
    UUID : 61b6692d:3900807a:5e76b0e2:85a6fbc1 (local to host TransNetStreamer)
    Creation Time : Sat Jan 10 17:13:51 2009
    Raid Level : raid1
    Used Dev Size : 9735232 (9.28 GiB 9.97 GB)
    Array Size : 9735232 (9.28 GiB 9.97 GB)
    Raid Devices : 2
    Total Devices : 2
    Preferred Minor : 6

    Update Time : Sun Jan 11 15:22:22 2009
    State : clean
    Active Devices : 2
    Working Devices : 2
    Failed Devices : 0
    Spare Devices : 0
    Checksum : bb67209c – correct
    Events : 0.66

    Number Major Minor RaidDevice State
    this 1 8 25 1 active sync /dev/sdb9

    0 0 8 9 0 active sync /dev/sda9
    1 1 8 25 1 active sync /dev/sdb9

    different things. HOW IS THAT possible ?

    Again thanks for your time

    Reply

  7. Yikes! I’ve never seen two parts of a RAID array show different results! If you solve this, please post it here.

    Reply

  8. Correction: the list of devices in the superblock can not match whenever the partitions are not actually in the array. I had a machine showing this problem until recently, but I can’t say how it came about. I just re-added the affected partition to the array and it resolved itself.

    Compare:

    mdadm –detail /dev/md6
    mdadm -E /dev/sda9
    mdadm -E /dev/sdb9

    If sdb9 still shows bad superblock data, I’d suggest removing it from the array and re-adding it.

    Reply

  9. Hi,

    Thanks for this document. I agree with Wes that it is best to format your disk right after step 2 if it already has data on it. I neglected to do that and on reboot it gave me this error:

    The filesystem size (according to the superblock) is nnn blocks
    The physical size of the device is yyy blocks
    Either the superblock or the partition table is likely to be corrupt!
    Abort?

    Of course this makes perfect sense as now both the original partition and the raid disk had their own superblocks on there. Formatting the md(x) partitions and rerunning the rsync solved that.

    Reply

  10. Thanks to several of you for pointing out the need to create filesystems on the new RAID partitions. I’ve edited the article to reflect that in step 2.

    Reply

  11. What a great post! I’m actually familiar with all the pieces, but it’s terrific to have them put together in a road map.
    Surprised the original Centos poster did not call out that the update-initramfs equivalent on Redhat/Centos/Fedora would be mkinitrd.

    Reply

  12. Hi Mudgen,

    What’s the exact command with mkinitrd? Any switches? It would be good to have it here.

    Reply

  13. Thanks for the very informative guide!
    I recently had to move my root file system from one pair of Raid1 disks to another pair of Raid1 disks, and was totally non-plussed to find that although grub appeared to be working, I would get dumped into a BusyBox prompt by initramfs.
    Although the BusyBox prompt didn’t say, I found I could run mdadm from there, and found that the UUID of the Raid1 filesystem was different from the new one I had prepared.
    After a bit of googing, I found this page which pointed me in the direction of a fix.
    Using your chroot method and update-initramfs worked like a charm.

    gracia amigo.

    Reply