For When You Can't Have The Real Thing

Software Raid Quick Howto

Created by dave. Last edited by dave, 15 years and 295 days ago. Viewed 10,981 times. #18

CentOS 5 and other ravings

These instructions do not work for CentOS 5 at the moment. My theory is that the SELinux ACLs are not getting ported properly when the filesystems are moved from the original disk to the one-armed mirror; this can be corrected/avoided by using star instead of tar when populating the one-armed mirror for the first time. However, I don't have time to figure out exactly what is going wrong right now, so just take the warning.

LVM

These instructions have not been tested with LVM so may not work there either.

Your backups are good, right?

OK the other pages I have here are either incoherent or wrong

...and I'm pissed. The instructions had me boot before grub was properly set up so I had to truck in here to use the emergency boot media. So while I wait for a resync to happen so I can finish the work I actually have to do tonight, here are the working highlights for how to do it properly.

What you need:

  • An installed Linux on one hard disk.
  • A second hard disk which is identical to the one you have linux already running on, already inserted into, and visible to, the system. For IDE disks it is always best to use separate IDE chains; ie don't raid hda to hdb, or hdc to hdd. For SCSI disks it is ideal to raid across controllers, but the vast majority of the time you don't have that luxury. SATA disks seem to treat each drive as a separate controller interface which is good enough for our purposes here. Now I know lots of instructions say you can have mismatched sizes, but if you want to believe that go find those instructions and good luck to you.
  • All the processes that might have live, mutable data, DISABLED.
  • No users logged onto the system. If you can do this in single-user mode with no network, that is safest.
  • Although you can do this all remotely (assuming the disk is already inserted and ready to go), it is a good idea to have physical access to the machine and emergency boot media for when you fuck it up, either because you made a mistake, or you carefully followed a mistake I told you to make.
So:
  • call up fdisk on the working disk.
  • make careful note of the partition order and sizes.
  • change all the partition types to fd, AKA Linux raid autodetect. You should end up with something like:
Disk /dev/sda: 250.0 GB, 250056000000 bytes
255 heads, 63 sectors/track, 30400 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          33      265041   fd  Linux raid autodetect
/dev/sda2              34         294     2096482+  fd  Linux raid autodetect
/dev/sda3             295       30400   241826445   fd  Linux raid autodetect

  • Note that in this case we have a separate /boot partition for the usual no good reason.
  • Write/Quit out of fdisk to flush your changes to disk. The kernel will use its cached copy until the next boot.
  • Fire up fdisk on the new drive. Duplicate the partition order, sizes, and types. When you finish, you should end up with something like:
Disk /dev/sdb: 250.0 GB, 250056000000 bytes
255 heads, 63 sectors/track, 30400 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1          33      265041   fd  Linux raid autodetect
/dev/sdb2              34         294     2096482+  fd  Linux raid autodetect
/dev/sdb3             295       30400   241826445   fd  Linux raid autodetect

  • If this isn't EXACTLY LIKE THE ONE FOR THE FIRST DISK (minus the obvious changes to the device names (ie sdb instead of sda) and a * bootable partition marker) THEN IT IS WRONG, STOP, AND MAKE IT RIGHT.
  • Write/Quit out of fdisk to flush your changes to disk.
  • Use sfdisk to dump the partition information. This will be very useful if/when a drive fails.
# sfdisk -d /dev/sda > /etc/partitions.sda
# sfdisk -d /dev/sdb > /etc/partitions.sdb
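Since the two layouts have to match exactly, it can help to let the machine do the comparing. What follows is only a minimal sketch: it compares two sfdisk dumps after hiding the differences that are allowed (the device name and the bootable flag). The names normalize_dump and same_layout are made up for this example, and the sed patterns assume dump lines of the form "/dev/sda1 : start= ..., size= ..., Id=fd, bootable"; check them against your own dump before trusting the answer.

```shell
# Hypothetical helpers, not part of sfdisk or mdadm.

# normalize_dump FILE DISK
# Print the dump with the disk name replaced by a placeholder and the
# bootable flag removed, so dumps of two disks can be compared directly.
# label-id lines (written by newer sfdisk versions) differ per disk and
# are dropped as well.
normalize_dump() {
    sed -e "s#/dev/$2#/dev/DISK#g" \
        -e 's/, *bootable//' \
        -e '/^label-id:/d' "$1"
}

# same_layout FILE_A DISK_A FILE_B DISK_B
# Succeeds (exit status 0) only if both dumps describe the same layout.
same_layout() {
    [ "$(normalize_dump "$1" "$2")" = "$(normalize_dump "$3" "$4")" ]
}

# Example use with the dumps written above:
#   same_layout /etc/partitions.sda sda /etc/partitions.sdb sdb \
#       && echo "layouts match" || echo "LAYOUTS DIFFER, STOP"
```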
  • Create the arms of your mirror that will live on the second (ie the currently unused) disk. (Bolded to remind myself where I fucked up the first time.) The mdadm options used are -C (create mode), -v (verbose), -l<n> (RAID level, 1 in this case), and -n<n> (the number of disks in the array, 2 here). The array name is specified as /dev/md<n>, and it is built from /dev/sdb<n> and “missing”:
# mdadm -Cv -l1 -n2 /dev/md1 /dev/sdb1 missing
# mdadm -Cv -l1 -n2 /dev/md2 /dev/sdb2 missing
# mdadm -Cv -l1 -n2 /dev/md3 /dev/sdb3 missing
  • Note for CentOS5 or RHEL5 users: you want to add a -a yes to this command, for example
    # mdadm -Cv -l1 -n2 -a yes /dev/md1 /dev/sdb1 missing
    This is because it relies on udev and hotplug to create the device nodes, and only /dev/md0 is created by default.
  • Note for users with more modern distributions: It looks like the latest version of mdadm uses version 1.0 superblocks by default. These will not be autodetected on startup, as the maintainer has altered them specifically not to?! I got around this by adding the '-e 0.90' switch to these lines, thus creating a superblock that will be detected on boot (the official line is that you should create an initrd). This bit me on both OpenSuSE 10.2 and a Gentoo thing.
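The create commands and the two notes above differ only in the partition number and a couple of optional switches, so a small dry-run helper can print the exact lines for review before anything is run. This is a sketch, not a tool: print_create_cmds is a made-up name, and it assumes (as in this example) that partition n on the new disk becomes /dev/mdn. It only prints; it never calls mdadm itself.

```shell
# Hypothetical dry-run helper: prints mdadm create commands, runs nothing.
#
# print_create_cmds DISK EXTRA_OPTS PARTITION...
#   DISK        name of the new, unused disk (e.g. sdb)
#   EXTRA_OPTS  extra switches from the notes above, or "" for none
#               (e.g. "-a yes" or "-e 0.90")
#   PARTITION   partition numbers to turn into one-armed mirrors
print_create_cmds() {
    disk=$1
    extra=$2
    shift 2
    for n in "$@"; do
        echo "mdadm -Cv -l1 -n2 ${extra:+$extra }/dev/md$n /dev/$disk$n missing"
    done
}

# Example: review the lines for the example system, with 0.90 superblocks
#   print_create_cmds sdb "-e 0.90" 1 2 3
```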
  • OK at this point you are nearly committed, because if the system boots in this state it will be some pissed if it comes up and the partitions have the wrong type. So you might as well change your entries in /etc/fstab to point to the mirror devices instead of the disk devices. Our example system has these appropriate fstab entries (plus all the other housekeeping entries modern linux systems like):
/dev/md3   /         ext3    defaults        1 1
/dev/md1   /boot     ext2    defaults        1 2
/dev/md2   swap      swap    defaults        0 0
  • Similarly, update your /boot/grub/grub.conf file so that the system is told to use the raid-ed root device. For example:
title CentOS (2.6.9-42.0.3.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-42.0.3.ELsmp ro root=/dev/md3 quiet
        initrd /initrd-2.6.9-42.0.3.ELsmp.img
  • Now you had better update your initrd file so that the boot environment can find and then make the special raid devices. Note the names below come straight from the grub.conf example above. For many (most?) modern distributions this isn't necessary as the distributed initrd already has them, but it can't hurt. Just do it; I've had several failures without it:
# cd /boot
# mv initrd-`uname -r`.img initrd-`uname -r`.img.1
# mkinitrd initrd-`uname -r`.img `uname -r`
  • If you have other selectable kernels, modify their root= parameter plus their initrd files too.
  • Commentary: the instructions I was using told you to change these on the original drives after doing the data copy (coming up in a step or two), so you had to remember to copy the changed fstab and grub.conf file someplace where you could put them into place on the mirror devices later. It makes more sense to me to do this before the data copy stage.
  • Create the filesystems (or swap space) on your one-armed mirrors. Recall that this example has /boot on sda1, swap on sda2, and root on sda3. We don't know any better so we like ext3:
# mkfs.ext3 /dev/md1
# mkfs.ext3 /dev/md3
# mkswap /dev/md2
  • Create mountpoints for your new filesystems, then mount the one-armed mirrors there. Example for our example:
# mkdir /mnt/md1 /mnt/md3
# mount /dev/md1 /mnt/md1
# mount /dev/md3 /mnt/md3
  • Copy the contents of each file system to the corresponding mirror. Example for our example, RedHat speak:
# cd /boot
# tar cfpl - . | (cd /mnt/md1 ; tar xfp -)
# cd /
# tar cfpl - . | (cd /mnt/md3 ; tar xfp -)
  • Note: If you have a slightly more modern tar, it will either bitch about -l being deprecated, or do something else altogether; in that case, you want to do the below instead. (Also note that you can't use the below on RedHat systems, because RedHat's tar doesn't understand --one-file-system, so will instead create a tar file called --one-file-system and try to tar up stdin as well. Ain't standardization grand?!)
# cd /boot
# tar cfp --one-file-system - . | (cd /mnt/md1 ; tar xfp -)
# cd /
# tar cfp --one-file-system - . | (cd /mnt/md3 ; tar xfp -)
  • Next run grub to make sure your system is actually bootable in this configuration. (Bolded to remind me where I fucked up the second time.) NOTE: Do not be confused! The “hd0” notation within grub is not the same as normal device notation when referring to devices in LINUX. The “device” command specifies which disk Grub will operate on (ie the disk where /boot's contents are). Grub addresses the entire disk at the hardware level.
# grub
> device (hd0) /dev/sda
> root (hd0,0)
> setup (hd0)
(blah blah blah)
Running "install /boot/grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/boot/grub/stage2
/boot/grub/grub.conf"… succeeded
Done.
> quit
  • Deep breath; this is the dangerous bit. Ensure you know where your emergency boot media is.
# sync
# reboot
  • Things to look for if it doesn't boot:
    - no root=/dev/md$device entry in /boot/grub/grub.conf
    - no raid support in initrd file
    - you didn't run grub as above before rebooting
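The first item on that list is easy to check mechanically, including for the other selectable kernels. Here is a rough sketch that lists every kernel line in a GRUB legacy config whose root= does not point at an md device. The function name kernels_without_md_root is invented for this example, and it assumes the grub.conf layout shown earlier (one "kernel" line per title).

```shell
# Hypothetical check, assuming a GRUB legacy grub.conf as shown earlier.
#
# kernels_without_md_root FILE
# Prints each "kernel" line that lacks a root=/dev/md<n> parameter.
# No output means every boot entry points at a mirror device.
kernels_without_md_root() {
    awk '$1 == "kernel" && $0 !~ /root=\/dev\/md[0-9]+/ { print }' "$1"
}

# Example:
#   kernels_without_md_root /boot/grub/grub.conf
```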
  • Now that it's rebooted, you can add the partitions on the first disk to the mirrors. Free advice: add them in the order of smallest to largest. This is because the resync happens one partition at a time, in (approximately) the order you add them. Also, all things being equal, it is nice to have the partition with the /boot information done first so you can do the step after this.
# mdadm /dev/md1 -a /dev/sda1
# mdadm /dev/md2 -a /dev/sda2
# mdadm /dev/md3 -a /dev/sda3
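If you don't feel like eyeballing partition sizes, the sfdisk dump saved in /etc/partitions.sda already has them. The sketch below prints the add commands sorted smallest first, without running anything. print_add_cmds is a made-up name; the sketch assumes dump lines of the form "/dev/sda1 : start= ..., size= ..., Id=fd" and, as in this example, that partition n belongs to /dev/mdn.

```shell
# Hypothetical dry-run helper: prints mdadm add commands, runs nothing.
#
# print_add_cmds DUMPFILE
#   DUMPFILE  sfdisk -d dump of the ORIGINAL disk
# Output: one "mdadm /dev/md<n> -a /dev/<disk><n>" line per non-empty
# partition, ordered from smallest to largest.
print_add_cmds() {
    # Pull "size disk number" out of each partition line, then sort by size.
    sed -n 's#^/dev/\([a-z]*\)\([0-9]*\) *:.*size= *\([0-9]*\).*#\3 \1 \2#p' "$1" |
        sort -n |
        while read -r size disk n; do
            # Skip unused slots, which some dumps list with size 0.
            if [ "$size" -gt 0 ]; then
                echo "mdadm /dev/md$n -a /dev/$disk$n"
            fi
        done
}

# Example:
#   print_add_cmds /etc/partitions.sda
```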
  • Wait until the partition with /boot has been synced. For single-partition systems with a large partition, this can take a long time (five hours for one 250G partition, for example; not that this was gained in any practical experience at all). You can monitor progress via /proc/mdstat:
# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda2[1] sdb2[0]
      2096384 blocks [2/2] [UU]

md3 : active raid1 sda3[2] sdb3[0]
      241826368 blocks [2/1] [U_]
      [===>.................]  recovery = 18.2% (44107968/241826368) finish=375.9min speed=8761K/sec

md1 : active raid1 sda1[1] sdb1[0]
      264960 blocks [2/2] [UU]

unused devices: <none>
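Rather than staring at that output, you can test for the "[UU]" marker on the array you care about. A minimal sketch follows; md_synced is a name invented here, and it assumes the /proc/mdstat layout shown above, where the line after "mdN : ..." ends in the member map ([UU] when both arms are in sync, [U_] when one is missing or still recovering).

```shell
# Hypothetical helper, assuming the /proc/mdstat layout shown above.
#
# md_synced ARRAY [FILE]
#   ARRAY  array name as printed in mdstat, e.g. md1
#   FILE   file to read (defaults to /proc/mdstat)
# Succeeds only if the array is listed and every member shows as "U".
md_synced() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { in_array = 1; next }
        in_array && /blocks/ {
            # The last field is the member map, e.g. [UU] or [U_].
            found = 1
            ok = ($NF ~ /^\[U+\]$/)
            exit
        }
        END { if (found && ok) exit 0; exit 1 }
    ' "${2:-/proc/mdstat}"
}

# Example: poll once a minute until the /boot mirror is done
#   until md_synced md1; do sleep 60; done
```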

  • Make both disks bootable. (Bolded to remind me where I fucked up the third time.) This is because your existing configuration has pointers into the disk partition with /boot, but the pieces have probably been moved around (ie copied to the other disk with tar, then mirrored back). Also, you want the second disk to be set up correctly for when the first disk dies and you have to reboot from the survivor in order to get your system back. Remember the note from the last time we ran grub. Also note this time that the two invocations of grub are identical except for the device line.
# grub
> device (hd0) /dev/sda
> root (hd0,0)
> setup (hd0)
(blah blah blah)
Running "install /boot/grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/boot/grub/stage2
/boot/grub/grub.conf"… succeeded
Done.
> quit
# grub
> device (hd0) /dev/sdb
> root (hd0,0)
> setup (hd0)
(blah blah blah)
Running "install /boot/grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/boot/grub/stage2
/boot/grub/grub.conf"… succeeded
Done.
> quit
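Because the two grub sessions differ only in the device line, they can be generated instead of typed. The sketch below prints the command sequence for one disk; grub_setup_script is a made-up name. Feeding the output to "grub --batch" (GRUB legacy's non-interactive mode) is how I would expect to use it, but treat that as an assumption and try it by hand first if you have not used batch mode on your distribution.

```shell
# Hypothetical helper: prints the GRUB legacy commands used above.
#
# grub_setup_script DISK
#   DISK  the disk to make bootable, e.g. /dev/sda or /dev/sdb
grub_setup_script() {
    printf '%s\n' "device (hd0) $1" "root (hd0,0)" "setup (hd0)" "quit"
}

# Example: review the commands, then pipe them into grub for each disk
#   grub_setup_script /dev/sdb
#   for d in /dev/sda /dev/sdb; do grub_setup_script "$d" | grub --batch; done
```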
  • Finally, wait for all partitions to finish resyncing, then reboot. If your system comes up properly, you are done.

mdmonitor

Redhat/CentOS comes with a nifty startup script called mdmonitor. This script does exactly nothing if you don't have an /etc/mdadm.conf file.

You want your /etc/mdadm.conf file to look something like this:

DEVICE /dev/sda1 /dev/sda2 /dev/sdb1 /dev/sdb2
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=7f651399:33695c4d:ff3b3409:7166caac
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=96eed041:387c4e0e:197823fc:868814d2
MAILADDR root

...adjust for your local values.

The MAILADDR is not optional if you want mdadm to actually monitor your devices. If you do this, mdadm will send an email when a mirror arm fails.

You can figure out what the array UUIDs are by running mdadm --examine on one of the raid arms:

# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 7f651399:33695c4d:ff3b3409:7166caac
  Creation Time : Thu Apr 24 14:33:52 2008
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1

    Update Time : Tue Jun 10 13:37:32 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 5f0d63b2 - correct
         Events : 0.642371

      Number   Major   Minor   RaidDevice State
this     1       8        1        1      active sync   /dev/sda1
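To save some copy and paste, the ARRAY line can be built from that same --examine output. This is a sketch under assumptions: array_line is an invented name, and the field names it looks for (UUID, Raid Level, Raid Devices) are the ones printed for 0.90 superblocks as shown above; other superblock versions label things differently, so check the result by eye.

```shell
# Hypothetical helper, assuming 0.90-style `mdadm --examine` output.
#
# array_line ARRAY_DEVICE
#   Reads `mdadm --examine` output on stdin and prints a matching
#   ARRAY line for /etc/mdadm.conf. Prints nothing if no UUID is found.
array_line() {
    awk -v dev="$1" '
        $1 == "UUID"                    { uuid = $3 }
        $1 == "Raid" && $2 == "Level"   { level = $4 }
        $1 == "Raid" && $2 == "Devices" { n = $4 }
        END {
            if (uuid != "")
                printf "ARRAY %s level=%s num-devices=%s UUID=%s\n", dev, level, n, uuid
        }
    '
}

# Example:
#   mdadm --examine /dev/sda1 | array_line /dev/md1
```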

References

Sources: the two documents linked to on the Linux/Software Raid and Grub HOW-TO, three practical system implementations, and one VERY LATE NIGHT at a customer site due to the ordering problems in the originals.

See Also: Software Raid Failed Disk Howto

This is a collection of technical information, much of it learned the hard way. Consider it a lab book or a /info directory. I doubt much of it will be of use to anyone else.
