[CentOS] Growing HW RAID arrays, Online

Sat Feb 22 00:50:35 UTC 2014
Phoenix, Merka <merka.phoenix at hp.com>

Hi Billy,

>> add disks to an LSI raid array periodically to increase the amount of available space for business needs
>> sdc1 is a PV in a VG that holds production data and must not become unavailable at any time
>> How do we grow sdc1, online?

If you are using the Logical Volume Manager (LVM) on Linux, you should not have to grow the PV each time.
Instead, carve Logical Units (LUNs) out of the RAID array and present them to the operating system as disk devices that can be initialized as physical volumes (PVs).

LVM can then be used to add (or remove) PVs to a Volume Group (VG) without having to reboot. Logical volumes (LVs) are carved out of the VG and presented to the operating system (Linux) as block devices on which filesystems (one filesystem per block device) can be created.
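
As a rough sketch of the workflow above, the helper below only prints the LVM commands that would add a newly presented LUN to an existing VG; it does not run them. The function name plan_add_pv and the device and VG names in the usage line are placeholders, not values from this thread.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the commands for adding a new LUN to a VG.
# plan_add_pv is a hypothetical helper; both arguments are placeholders.
plan_add_pv() {
    new_dev=$1   # block device of the newly presented LUN, e.g. /dev/sdd
    vg_name=$2   # existing volume group to extend, e.g. vg_data

    echo "pvcreate $new_dev"            # initialize the LUN as a physical volume
    echo "vgextend $vg_name $new_dev"   # add the new PV to the volume group
    echo "vgs $vg_name"                 # show the VG's new size and free space
}
```

Calling `plan_add_pv /dev/sdd vg_data` prints the three commands for review before anyone runs them as root.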

While you can resize the H/W RAID "online", and add/remove PVs to a VG "online", you still need to unmount a filesystem before resizing both the LV and the corresponding filesystem that was created on the LV. Attempting to resize a filesystem that is mounted and actively being used is just asking for data corruption.

Both the LV and the filesystem can be resized "on the fly" without rebooting, but you still have to unmount the filesystem first before resizing either. Resizing the 'root' filesystem (or any filesystem on which the core operating system has files open) requires shutting down and booting into an alternate boot env (for example, the "rescue" boot cd) -- presenting another good argument for separating operating system files and user/application files. This is why /var/log is often mounted on a separate filesystem.
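
The unmount-first sequence described above might look like the following sketch, which assumes an ext3/ext4 filesystem (other filesystems use different tools). It only prints the commands. The function name plan_grow_lv and all paths and sizes in the usage line are hypothetical.

```shell
#!/bin/sh
# Sketch only: print (do not execute) an unmount-first grow of an LV and the
# ext3/ext4 filesystem on it. plan_grow_lv is a hypothetical helper.
plan_grow_lv() {
    lv_path=$1       # logical volume device, e.g. /dev/vg_data/lv_app
    mount_point=$2   # where the filesystem is normally mounted, e.g. /data
    extra=$3         # space to add, in lvextend syntax, e.g. 50G

    echo "umount $mount_point"             # take the filesystem offline
    echo "lvextend -L +$extra $lv_path"    # grow the logical volume
    echo "e2fsck -f $lv_path"              # forced check before resizing
    echo "resize2fs $lv_path"              # grow the filesystem to fill the LV
    echo "mount $lv_path $mount_point"     # bring the filesystem back online
}
```

For example, `plan_grow_lv /dev/vg_data/lv_app /data 50G` lists the five steps in order. Only the filesystem being resized goes offline; the rest of the directory tree stays mounted.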

At the filesystem level, remember that Linux allows you to mount filesystems at various mount points within the directory tree. Most systems do not have the entire directory tree contained on a single filesystem. The 'root' filesystem is typically just large enough to hold the basic operating system, and then the rest of the files (applications, user data, and application data) are stored on separate filesystems that are mounted on the directory tree at various points (for example /home, /data, /opt, /opt/dedicated/app1, etc.)

From the user's (and the applications') view, the files still appear to be stored on one gigantic single filesystem, even though it is actually mapped out to two or more filesystems.
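
To see which filesystem actually backs a given path, a small helper like the one below can be used. This is a minimal sketch: the name mount_point_of is made up for illustration, and it assumes mount points without spaces in their names.

```shell
#!/bin/sh
# Sketch only: report the mount point of the filesystem that holds a path.
# mount_point_of is a hypothetical helper; it assumes the mount point
# contains no whitespace, since it takes the last field of df's output.
mount_point_of() {
    # df -P prints one header line, then one line for the filesystem
    # containing the path; the last column is its mount point.
    df -P "$1" | awk 'NR == 2 { print $NF }'
}
```

Running `mount_point_of /var/log` on a system where /var/log is its own filesystem prints /var/log; where it is not, the output is whichever parent mount point holds it.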

By structuring/segmenting the system's directory tree this way, you gain the ability to unmount and resize portions of the tree without having to shut down and reboot. The 'fsck' pass also runs much more quickly when your filesystems are not in the terabyte size. Unless you are creating individual files that are gigabytes/terabytes in size, there is little benefit in having a massive filesystem (250 GB or larger). Remember, the larger the filesystem, the flatter your data becomes when (not if) the filesystem fails.


Cheers!

Simba
Engineering

-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On Behalf Of Billy Crook
Sent: Thursday, 20 February, 2014 13:50
To: CentOS mailing list
Subject: [CentOS] Growing HW RAID arrays, Online

We add disks to an LSI raid array periodically to increase the amount of available space for business needs.

It is understood that this process starts with metal, and has many layers that must each adjust to make use of the additional space.
Each of these layers also says that it can do that 'online' without interruption or rebooting.  But making it happen is not that easy.

When the HW raid controller's grow completes, we echo 1 to /sys/bus/scsi/devices/#:#:#:#/rescan and the kernel notices and updates the size of the block device. (sdc in this case) sdc1 is the only partition on the device, and should consume the entire device.
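
The rescan step described above can be wrapped in a small function. This is only a sketch: the name rescan_and_report is made up, and the two sysfs directories are passed in as arguments because their exact layout varies between kernel versions.

```shell
#!/bin/sh
# Sketch only: ask the kernel to re-read a SCSI device's capacity, then print
# the size the block layer reports. rescan_and_report is a hypothetical helper.
rescan_and_report() {
    scsi_dir=$1    # SCSI device directory, e.g. /sys/bus/scsi/devices/0:2:2:0
    block_dir=$2   # block device directory, e.g. /sys/block/sdc

    # Writing 1 to the rescan attribute triggers the capacity re-read.
    echo 1 > "$scsi_dir/rescan" || return 1

    # sysfs reports the size in 512-byte sectors.
    cat "$block_dir/size"
}
```

Run as root, for example `rescan_and_report /sys/bus/scsi/devices/0:2:2:0 /sys/block/sdc`. This refreshes the size of the whole disk only; it does not by itself change the size the kernel holds for any partition on it.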

sdc1 is a PV in a VG that holds production data and must not become unavailable at any time.  After growing sdc as mentioned earlier, parted notices that the end-of-drive partition table is missing, fixes it, and grows its disk size to match the new size of sdc.

It all makes sense up to this point, but what happens next is what I need some advice on. How do we grow sdc1, online? parted says it doesn't support 'resize' on the filesystem (LVM PV).

The usual answer to parted's lack of support for filesystems, and insistence to only resize partitions when it can stick its nose in the filesystem and do that too is: parted sdc rm 1, parted mkpart primary 0% 100%  (thus making a new partition "Around" the old one)

That should work, but I can't get the kernel to 'notice' that sdc1 is now larger. hdparm -z barfs up an error that sdc is in use. I know that rebooting likely will fix it. But we cannot reboot. We also cannot keep making more partitions every time we add a disk. So that's not a solution either. We need to GROW the gpt partitions online, or use another partitioning type that supports >6TB. I've googled it for hours and found no good solutions.

This same situation would affect VMs with virtual disks that grow over time to satisfy business needs, as well as servers mounting iSCSI/FC storage that grows over time. How would you grow this online?

Going without partitions at all and putting the PV directly on sdc is no good either.  So we need partitions, and msdos tables don't support >2TB, and the only other type in practical use that I know of is gpt, and these apparently can't expand online!

Here's where I'm at now, in case you're curious:

[root at host lib]# parted /dev/sdc unit s print free

Model: SMC SMC2108 (scsi)
Disk /dev/sdc: 8189439999s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start  End          Size         File system  Name     Flags
 1      34s    8189439966s  8189439933s               primary

Information: Don't forget to update /etc/fstab, if necessary.

[root at host lib]# cat  /sys/bus/scsi/devices/0\:2\:2\:0/block\:sdc/size
8189440000
[root at host lib]# cat  /sys/bus/scsi/devices/0\:2\:2\:0/block\:sdc/sdc1/size
7019519933

# I bet the above 7billion will be around 8billion at the next reboot.
 (Each physical disk has about a billion sectors.)
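
The gap between the partition table and the kernel's view can be worked out from the numbers above. This sketch uses the partition size printed by parted, the sdc1 size read from sysfs, and the 512-byte logical sector size parted reports. The variable names are illustrative.

```shell
#!/bin/sh
# Sketch only: how much of sdc1 the kernel has not yet picked up.
table_sectors=8189439933    # sdc1 size according to the GPT (parted output)
kernel_sectors=7019519933   # sdc1 size according to the kernel (sysfs)
sector_bytes=512            # logical sector size reported by parted

missing_sectors=$((table_sectors - kernel_sectors))
missing_bytes=$((missing_sectors * sector_bytes))

echo "$missing_sectors sectors not yet visible"   # 1169920000
echo "$missing_bytes bytes"                       # 598999040000, about 599 GB
```

That is roughly one disk's worth of space, which matches the size of the most recent addition to the array.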

--
Billy Crook * Network and Security Administrator * RiskAnalytics, LLC
_______________________________________________
CentOS mailing list
CentOS at centos.org
http://lists.centos.org/mailman/listinfo/centos