thus Pasi Kärkkäinen spake:
> On Tue, Mar 02, 2010 at 09:30:50AM +0100, Timo Schoeler wrote:
>> Hi list,
>>
>> please forgive the cross-posting, but I cannot pin the problem down
>> well enough to say which list it fits best, so I'll ask on both.
>>
>> I have some machines with the following specs (see the end of this
>> email).
>>
>> They run CentOS 5.4 x86_64 with the latest patches applied, are
>> Xen-enabled and should host one or more domUs. I put the domUs'
>> storage on LVM, as I learnt ages ago, which never caused any problems
>> and is way faster than using file-based 'images'.
>>
>> However, there's something special about these machines: they have
>> the new WD EARS series drives, which use 4K sector sizes. So I booted
>> a rescue system and used fdisk to start the first partition at sector
>> 64 instead of 63 (long story made short: starting at sector 63 leaves
>> the partitions misaligned on the 4K sectors, the drive has to do lots
>> of extra, inefficient read-modify-write cycles and the performance
>> collapses; with the 'normal' geometry (sector 63) the drive achieves
>> about 25 MiByte/sec writes, with the partition starting at sector 64
>> it achieves almost 100 MiByte/sec writes):
>>
>> [root@server2 ~]# fdisk -ul /dev/sda
>>
>> Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
>> 255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
>> Units = sectors of 1 * 512 = 512 bytes
>>
>>    Device Boot      Start         End      Blocks   Id  System
>> /dev/sda1   *          64     2097223     1048580   fd  Linux raid autodetect
>> Partition 1 does not end on cylinder boundary.
>> /dev/sda2         2097224    18876487     8389632   82  Linux swap / Solaris
>> /dev/sda3        18876488  1953525167   967324340   fd  Linux raid autodetect
>>
>> On top of those WD EARS HDs (two per machine), ``md'' provides two
>> RAID1 arrays, /boot and LVM, as well as one swap partition per HD
>> (i.e. non-RAIDed). LVM provides the / partition as well as the LVs
>> for the Xen domUs.
>>
>> I have about 60 machines running that style and never had any
>> problems; they run like a charm. On these machines, however, the
>> domUs are *very* slow, have a steady (!) load of about two -- 50% of
>> it in 'wait' -- and all operations take ages, e.g. a ``yum update''
>> with the recently released updates.
>>
>> Now, can that be due to 4K issues I didn't see, nested now in LVM?
>>
>> Help is very much appreciated.
>>
>
> Maybe the default LVM alignment is wrong for these drives..
> did you check/verify that?
>
> See:
> http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size/
>
> Especially the "--metadatasize" option.

Hi Pasi, hey lists,

thanks for the hint. Following is the 'most important' part of the text:

``So I created a 1 gigabyte /boot partition as /dev/sdb1, and allocated
the rest of the SSD for use by LVM as /dev/sdb2. And that's where I ran
into my next problem. LVM likes to allocate 192k for its header
information, and 192k is not a multiple of 128k. So if you are creating
file systems as logical volumes, and you want those volumes to be
properly aligned, you have to tell LVM that it should reserve slightly
more space for its meta-data, so that the physical extents that it
allocates for its logical volumes are properly aligned. Unfortunately,
the way this is done is slightly baroque:

# pvcreate --metadatasize 250k /dev/sdb2
  Physical volume "/dev/sdb2" successfully created

Why 250k and not 256k? I can't tell you -- sometimes the LVM tools
aren't terribly intuitive. However, you can test to make sure that
physical extents start at the proper offset by using:

# pvs /dev/sdb2 -o+pe_start
  PV         VG   Fmt  Attr PSize  PFree  1st PE
  /dev/sdb2       lvm2 --   73.52G 73.52G 256.00K

If you use a metadata size of 256k, the first PE will be at 320k
instead of 256k. There really ought to be a --pe-align option to
pvcreate, which would be far more user-friendly, but we have to work
with the tools that we have. Maybe in the next version of the LVM
support tools....''
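I verified both levels with a quick-and-dirty check along these lines
(just a sketch; the device names /dev/sda3 and /dev/md1 are what my
boxes use, and it assumes the 0.90 md superblock CentOS 5 creates by
default, which sits at the end of the partition and so doesn't shift
the data offset):

#!/bin/sh
# Rough alignment sanity check (sketch only; adjust device names).
# Start sector of the LVM partition -- /dev/sda3 has no boot flag,
# so field 2 of the fdisk line is the Start column.
PART_START=$(fdisk -ul /dev/sda | awk '$1 == "/dev/sda3" {print $2}')
# Offset of the first physical extent inside the PV, in bytes.
PE_START=$(pvs --noheadings -o pe_start --units b /dev/md1 | tr -dc '0-9')
echo "partition byte offset: $(( PART_START * 512 )) (mod 4096: $(( PART_START * 512 % 4096 )))"
echo "first PE byte offset:  $PE_START (mod 4096: $(( PE_START % 4096 )))"

If both 'mod 4096' values come out as 0, the PV extents fall on 4 KiB
boundaries.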
So, after taking care of starting at sector 64 *and* making sure that
``pvcreate'' gets its 'multiple of 128k', I still have the same problem.

Most interestingly, Debian 'lenny' does *not* have this problem; there
the LVM PV does *not* have to be created as described above. So,
unfortunately, it seems like I'm forced to use Debian in this project,
at least on a few machines. *shiver*

> -- Pasi

Timo
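P.S. In case anyone wants to compare notes: the device-mapper tables
show where each LV actually starts on the underlying device. For plain
linear segments the last field is the start offset in 512-byte sectors,
so it has to be divisible by 8 to land on a 4 KiB boundary. This is
just a rough one-liner and only meaningful for linear LVs:

[root@server2 ~]# dmsetup table | awk '/ linear / { o = $NF; print $1, (o % 8 ? "NOT 4k-aligned," : "4k-aligned,"), "start sector", o }'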