Saturday, May 2, 2009

Mounting Unknown RAID+LVM Volumes Step-by-Step

Every time your system boots, startup scripts handle all disk mounting tasks. This is convenient until you run into problems, because then you need to know how to bring things up without the help of the scripts. On my systems, I run a combination of LVM volumes on top of multiple RAID arrays. This provides data integrity and the flexibility to add more space as needed, but it also makes recovery very complicated.

There are two cases where you will eventually need to know how this all works. The most likely case is recovering from a boot failure on a system whose configuration you are relatively familiar with. A more difficult case is mounting a set of unknown drives with a RAID/LVM configuration. Keep reading and I'll explain how you can handle even the hardest case like a pro.

If there is a chance you will be mounting two or more LVM Volume Groups with identical names, read the warning in step 7 before continuing.

  1. After installing all the drives into the system, verify they are detected in /dev/sd*:
    $ ls /dev/sd*
    /dev/sda   /dev/sdb   /dev/sdc   /dev/sdd   /dev/sde   /dev/sdf   /dev/sdf3
    /dev/sda1  /dev/sdb1  /dev/sdc1  /dev/sdd1  /dev/sde1  /dev/sdf1  /dev/sdf4
    /dev/sda2  /dev/sdb2  /dev/sdc2  /dev/sdd2  /dev/sde2  /dev/sdf2
    
  2. Find out which of those partitions are already in use by mounted filesystems:
    $ df -lhx tmpfs 
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/lvm_vg0-root
                           10G  3.7G  6.4G  37% /
    /dev/mapper/lvm_vg0-home
                          160G   83G   78G  52% /home
    /dev/md0               99M   16M   79M  17% /boot
    
    This output doesn't directly indicate which partitions are being used. Instead it shows two LVM volumes, /dev/mapper/lvm_vg0-root and /dev/mapper/lvm_vg0-home, and one RAID array, /dev/md0. The next steps trace these back to the physical partitions.
    1. Look up the two LVM volumes lvm_vg0-root and lvm_vg0-home to determine the LVM volume group (VG):
      $ sudo lvdisplay | grep 'VG Name'
        VG Name                lvm_vg0
        VG Name                lvm_vg0
      
    2. The LVM VG, lvm_vg0, maps to one or more physical volumes:
      $ sudo pvdisplay
        --- Physical volume ---
        PV Name               /dev/md1
        VG Name               lvm_vg0
        PV Size               297.99 GB / not usable 3.25 MB
        Allocatable           yes 
        PE Size (KByte)       4096
        Total PE              76285
        Free PE               32765
        Allocated PE          43520
        PV UUID               mD6jat-SvPv-dOop-P3qZ-T7zC-r8ke-xSYRSj
      
      This output shows us that the LVM physical volume (PV) is using /dev/md1.
    3. Now, look up the md0 device (from the df output in step 2) and the md1 device:
      $ cat /proc/mdstat
      Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
      md1 : active raid10 sde2[0] sdd2[1]
            312466688 blocks 64K chunks 2 far-copies [2/2] [UU]
      
      md0 : active raid1 sde1[0] sdd1[1]
            104320 blocks [2/2] [UU]
      
      OK, so the final answer is that array md1 is using partitions sde2 and sdd2, and array md0 is using partitions sde1 and sdd1.
  3. Now, create a list of unmounted partitions by filtering out the partitions that are in use:
    $ ls --color=none /dev/sd* | grep '[0-9]' | grep -Ev 'sde2|sdd2|sde1|sdd1'
    /dev/sda1
    /dev/sda2
    /dev/sdb1
    /dev/sdb2
    /dev/sdc1
    /dev/sdc2
    /dev/sdf1
    /dev/sdf2
    /dev/sdf3
    /dev/sdf4
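    The exclusion pattern in this step was typed by hand from the mdstat output. As a rough sketch, the same member list can be pulled out of mdstat-format text with awk. The function name md_members and the temporary file path are my own choices for illustration, and the sketch only covers md members, so any partition mounted directly would still have to be excluded by hand.

    ```shell
    # Print the member partitions of every md array found in
    # mdstat-format text on stdin, one /dev path per line.
    # md_members is a hypothetical helper name, not part of mdadm.
    md_members() {
        awk '/^md[0-9]+ :/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[a-z0-9]+\[[0-9]+\]/) {
                    sub(/\[.*/, "", $i)      # strip the "[0]" role suffix
                    print "/dev/" $i
                }
        }'
    }

    # On a real system (assumes /proc/mdstat exists; /tmp/in_use.txt is arbitrary):
    # md_members < /proc/mdstat > /tmp/in_use.txt
    # ls /dev/sd*[0-9] | grep -vxF -f /tmp/in_use.txt
    ```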
    
  4. To find RAID partitions, examine each device for the UUID of the array it belongs to:
    $ sudo mdadm --examine /dev/sda1 | grep UUID
               UUID : ae9a1594:27fc3cec:ceff8456:d50a26a6
    
    With a little more work, all the unmounted devices can be scanned in a single pass:
    $ ls --color=none /dev/sd* | grep '[0-9]' | grep -Ev 'sde2|sdd2|sde1|sdd1' | xargs -I{} sh -c 'sudo mdadm --examine {} | grep -H --label={} UUID'
    /dev/sda1:           UUID : ae9a1594:27fc3cec:ceff8456:d50a26a6
    /dev/sda2:           UUID : cd72d060:e1139a47:8d379926:979de13e
    /dev/sdb1:           UUID : ae9a1594:27fc3cec:ceff8456:d50a26a6
    /dev/sdb2:           UUID : 2eb0a293:e7705a2f:8d379926:979de13e
    mdadm: No md superblock detected on /dev/sdc1.
    /dev/sdc2:           UUID : 29d74ba3:2dff5d1d:e1635e97:4ded1971
    mdadm: No md superblock detected on /dev/sdf1.
    /dev/sdf2:           UUID : 29d74ba3:2dff5d1d:e1635e97:4ded1971
    /dev/sdf3:           UUID : 2eb0a293:e7705a2f:8d379926:979de13e
    /dev/sdf4:           UUID : cd72d060:e1139a47:8d379926:979de13e
    
    It looks like every device is part of a RAID array except /dev/sdc1 and /dev/sdf1.
  5. Manually sort the list from the last step by UUID. Partitions that share a UUID belong to the same RAID array. Here is what your list should look like:
    /dev/sda1:           UUID : ae9a1594:27fc3cec:ceff8456:d50a26a6
    /dev/sdb1:           UUID : ae9a1594:27fc3cec:ceff8456:d50a26a6
    -----
    /dev/sda2:           UUID : cd72d060:e1139a47:8d379926:979de13e
    /dev/sdf4:           UUID : cd72d060:e1139a47:8d379926:979de13e
    -----
    /dev/sdb2:           UUID : 2eb0a293:e7705a2f:8d379926:979de13e
    /dev/sdf3:           UUID : 2eb0a293:e7705a2f:8d379926:979de13e
    -----
    /dev/sdc2:           UUID : 29d74ba3:2dff5d1d:e1635e97:4ded1971
    /dev/sdf2:           UUID : 29d74ba3:2dff5d1d:e1635e97:4ded1971
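    If there are many partitions, sorting by hand gets tedious. Here is a minimal sketch that groups the "device: UUID : value" lines from step 4 by UUID. The function name group_by_uuid is made up for this example, and it assumes the input has exactly the format shown above.

    ```shell
    # Read "DEVICE:   UUID : VALUE" lines on stdin and print one line per
    # UUID, followed by every device that carries it (first-seen order).
    # group_by_uuid is a hypothetical helper name.
    group_by_uuid() {
        awk '$2 == "UUID" && $3 == ":" {
            dev = $1
            sub(/:$/, "", dev)               # drop the trailing colon
            if (!($4 in members)) order[++n] = $4
            members[$4] = members[$4] " " dev
        }
        END { for (i = 1; i <= n; i++) print order[i] members[order[i]] }'
    }

    # Usage: pipe the scan output from step 4 into it, e.g.
    # ... | xargs ... 2>/dev/null | group_by_uuid
    ```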
    
    Using these values, assemble each RAID array. The /dev/md* name is not important as long as you choose one that does not currently exist on the system.
    $ sudo mdadm --assemble /dev/md2 /dev/sda1 /dev/sdb1 
    mdadm: /dev/md2 has been started with 2 drives.
    
    $ sudo mdadm --assemble /dev/md3 /dev/sda2 /dev/sdf4 
    mdadm: /dev/md3 has been started with 2 drives.
    
    $ sudo mdadm --assemble /dev/md4 /dev/sdb2 /dev/sdf3 
    mdadm: /dev/md4 has been started with 2 drives.
    
    $ sudo mdadm --assemble /dev/md5 /dev/sdc2 /dev/sdf2 
    mdadm: /dev/md5 has been started with 2 drives.
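    The four commands above follow one pattern: one array per UUID group, with md numbers counting up from the first free one. As a sketch, they could be generated from "UUID dev1 dev2 ..." lines instead of typed. The function name print_assemble_cmds and the default starting number 2 are assumptions for this example. It only prints the commands, so you can review them before running anything.

    ```shell
    # Read "UUID dev1 dev2 ..." lines on stdin and print one
    # "mdadm --assemble" command per line, numbering /dev/mdN from $1.
    # print_assemble_cmds is a hypothetical helper; nothing is executed.
    print_assemble_cmds() {
        awk -v n="${1:-2}" 'NF >= 2 {
            cmd = "mdadm --assemble /dev/md" n
            n++
            for (i = 2; i <= NF; i++) cmd = cmd " " $i
            print cmd
        }'
    }

    # Example with two of the groups from this step:
    printf '%s\n' \
        'ae9a1594:27fc3cec:ceff8456:d50a26a6 /dev/sda1 /dev/sdb1' \
        'cd72d060:e1139a47:8d379926:979de13e /dev/sda2 /dev/sdf4' |
        print_assemble_cmds 2
    ```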
    
    Now check the result of the previous steps by printing the mdstat file:
    $ cat /proc/mdstat 
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md5 : active raid10 sdc2[0] sdf2[1]
          488281984 blocks 64K chunks 2 far-copies [2/2] [UU]
    
    md4 : active raid10 sdb2[0] sdf3[1]
          244094080 blocks 64K chunks 2 far-copies [2/2] [UU]
    
    md3 : active raid10 sda2[0] sdf4[1]
          244094080 blocks 64K chunks 2 far-copies [2/2] [UU]
    
    md2 : active raid1 sdb1[0] sda1[1]
          104320 blocks [2/2] [UU]
    
    md1 : active raid10 sde2[0] sdd2[1]
          312466688 blocks 64K chunks 2 far-copies [2/2] [UU]
    
    md0 : active raid1 sde1[0] sdd1[1]
          104320 blocks [2/2] [UU]
    
    unused devices: <none>
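    In this output, [UU] means both members of an array are up. A missing member shows as an underscore, for example [U_]. With many arrays it is easy to overlook one, so here is a small sketch that lists any degraded arrays. The function name degraded_arrays is my own, and it assumes the mdstat layout shown above.

    ```shell
    # Read mdstat-format text on stdin and print the name of every array
    # whose member status (e.g. "[U_]") contains an underscore.
    # degraded_arrays is a hypothetical helper name.
    degraded_arrays() {
        awk '/^md[0-9]+ :/ { name = $1 }
             match($0, /\[[U_]+\]/) && index(substr($0, RSTART, RLENGTH), "_") {
                 print name
             }'
    }

    # On a real system (assumes /proc/mdstat exists):
    # degraded_arrays < /proc/mdstat
    ```

    No output means every array listed has all of its members.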
    
  6. When you are happy with the device mappings, update the /etc/mdadm/mdadm.conf file:
    $ sudo sh -c 'mdadm --detail --scan >> /etc/mdadm/mdadm.conf'
    
    After this step, edit the file to remove any old or duplicate ARRAY entries. Because of a bug in mdadm, you also need to change the "metadata" version from "00.90" to "0.90".
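    The metadata fix can be applied with sed instead of by hand. This is a sketch that assumes the version appears as a metadata=00.90 field on the ARRAY lines. Check what your mdadm actually printed first. The function name fix_md_metadata is made up for this example.

    ```shell
    # Rewrite "metadata=00.90" to "metadata=0.90" in text read from stdin.
    # fix_md_metadata is a hypothetical helper name.
    fix_md_metadata() {
        sed 's/metadata=00\.90/metadata=0.90/'
    }

    # One way to use it (assumption: filtering before appending to the file):
    # sudo sh -c 'mdadm --detail --scan' | fix_md_metadata | sudo tee -a /etc/mdadm/mdadm.conf
    ```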
  7. Now, with the RAID arrays active, scan for LVM volumes. First the physical volumes (PV):
    $ sudo pvscan
      PV /dev/md4   VG lvm_vg0   lvm2 [232.79 GB / 0    free]
      PV /dev/md3   VG lvm_vg0   lvm2 [232.79 GB / 956.00 MB free]
      PV /dev/md5   VG lvm_vg0   lvm2 [465.66 GB / 274.60 GB free]
      PV /dev/md1   VG lvm_vg0   lvm2 [297.99 GB / 127.99 GB free]
      Total: 4 [1.20 TB] / in use: 4 [1.20 TB] / in no VG: 0 [0   ]
    
    Next the volume groups (VG):
    $ sudo vgdisplay 
      --- Volume group ---
      VG Name               lvm_vg0
      System ID             
      Format                lvm2
      Metadata Areas        3
      Metadata Sequence No  58
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                3
      Open LV               1
      Max PV                0
      Cur PV                3
      Act PV                3
      VG Size               931.23 GB
      PE Size               4.00 MB
      Total PE              238395
      Alloc PE / Size       167859 / 655.70 GB
      Free  PE / Size       70536 / 275.53 GB
      VG UUID               yLjVr3-BKkp-BW33-QIsh-rpZX-byL7-bzF2Xg
    
      --- Volume group ---
      VG Name               lvm_vg0
      System ID             
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  3
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                2
      Open LV               2
      Max PV                0
      Cur PV                1
      Act PV                1
      VG Size               297.99 GB
      PE Size               4.00 MB
      Total PE              76285
      Alloc PE / Size       43520 / 170.00 GB
      Free  PE / Size       32765 / 127.99 GB
      VG UUID               138ikB-v1Z0-381g-Qvi8-dWN2-pgJ7-namuLh
    

    Be very careful if your system reports identically named LVM Volume Groups, as shown above. LVM will NOT let you mount the volumes from at least one of the groups, and possibly from all of them. You must resolve the name conflict before continuing.

    Additionally, you will not be able to rename an LVM VG if either the old or the new VG has mounted logical volumes! If one of the VG's LVs is the root volume, you will have to resort to a live-boot CD.

    A VG can be renamed with the following command:
    $ sudo vgrename vg0-old vg0-new
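    When two VGs share a name, the old name alone is ambiguous. vgrename also accepts the VG UUID in place of the old name, which lets you pick the right one. The sketch below finds duplicated names in "name uuid" pairs such as those printed by vgs --noheadings -o vg_name,vg_uuid. The function name duplicate_vgs and the new name lvm_vg1 are assumptions for illustration.

    ```shell
    # Read "VG_NAME VG_UUID" pairs on stdin and print each VG name that
    # occurs more than once, followed by all of its UUIDs.
    # duplicate_vgs is a hypothetical helper name.
    duplicate_vgs() {
        awk 'NF == 2 { count[$1]++; uuids[$1] = uuids[$1] " " $2 }
             END { for (name in count)
                       if (count[name] > 1) print name uuids[name] }'
    }

    # On a real system:
    # sudo vgs --noheadings -o vg_name,vg_uuid | duplicate_vgs
    # Then rename the imported VG by its UUID (lvm_vg1 is a made-up new name):
    # sudo vgrename yLjVr3-BKkp-BW33-QIsh-rpZX-byL7-bzF2Xg lvm_vg1
    ```

    In the vgdisplay output above, the VG with UUID yLjVr3-... is the imported one (three PVs), so that is the one to rename.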
    
  8. Finally, mount the new volumes and update your /etc/fstab as usual.
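    For completeness, here is roughly what that last step looks like. Every name below (lvm_vg1, home, /mnt/old_home, ext3) is a placeholder I made up, so substitute your own VG, LV, mount point and filesystem type. The helper fstab_line is hypothetical and only prints a line, so nothing is changed until you copy it into /etc/fstab yourself.

    ```shell
    # Print an fstab entry for logical volume $2 in volume group $1,
    # mounted at $3 with filesystem type $4.
    # fstab_line is a hypothetical helper name.
    fstab_line() {
        printf '/dev/%s/%s %s %s defaults 0 2\n' "$1" "$2" "$3" "$4"
    }

    # Typical sequence on a real system (not run here):
    # sudo vgchange -ay lvm_vg1        # activate the volume group
    # sudo lvscan                      # list its logical volumes
    # sudo mkdir -p /mnt/old_home
    # sudo mount /dev/lvm_vg1/home /mnt/old_home

    fstab_line lvm_vg1 home /mnt/old_home ext3
    ```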
