[CentOS-virt] Xen vs. iSCSI
Bill McGonigle
bill at bfccomputing.com
Tue Jun 16 04:11:45 UTC 2009
[previously sent to rhelv5 list, apologies to those on both]
I've got a problem I can reproduce easily enough, but really I fail to
understand what's going wrong.
I've got a 5.3 Dom0, which is running three guests. One is Fedora 10,
that runs with local flat files, and works fine. One is Nexenta 2
(opensolaris-based), and that runs off of physical partitions, and seems
to work great. The third runs Fedora 11 and has, for its disks,
iSCSI devices that are exported from Nexenta (ZFS-backed).
I have the Dom0 mapping the two iSCSI devices, one for /boot and one for
/. They're showing up initially as /dev/sdc and /dev/sdd.
If I go after the iSCSI devices in Dom0, with dd, for instance, they
work fine all day. I can read and write the entire devices to and from
local files without error. iSCSI seems to work properly in that regard.
I'm getting about 38MB/s. I've scrubbed the disk pool and no errors
were found, and a long SMART self-test passed on each of the disks. I've
also been able to mount both iSCSI devices and run bonnie++ on them
successfully from Dom0.
So, I specify those devices in the Xen config for the domU (tried both
real device name and /dev/disk/by-path/ names) and the DomU boots and
operates as I'd expect. Installation worked fine and typical operations
(low volume) work. However, when I then try something that I'm
assuming is more disk-intensive, like running a yum update, iSCSI
seems to fall over.
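For reference, a minimal sketch of what the disk section of such a domU config might look like (Xen config files use Python syntax). The by-path names, the target address, the IQNs, and the xvda/xvdb assignment are all hypothetical placeholders, not the values from this setup; the only detail taken from the logs is that the guest sees an xvda device.

```python
# Hypothetical by-path names for the two iSCSI LUNs mapped in Dom0.
# The address (192.0.2.10) and the IQNs are placeholders for illustration.
BOOT_DEV = "/dev/disk/by-path/ip-192.0.2.10:3260-iscsi-iqn.2009-06.com.example:f11-boot-lun-0"
ROOT_DEV = "/dev/disk/by-path/ip-192.0.2.10:3260-iscsi-iqn.2009-06.com.example:f11-root-lun-0"

# 'phy:' hands the Dom0 block device to the guest through blkback.
# The xvda/xvdb split between /boot and / is an assumption.
disk = [
    "phy:%s,xvda,w" % BOOT_DEV,
    "phy:%s,xvdb,w" % ROOT_DEV,
]
```

Using the by-path names rather than /dev/sdc and /dev/sdd keeps the config stable if the devices are renumbered after an iSCSI restart.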
In the DomU, I'll see a lock-up, and then filesystem errors. e.g.:
Installing : kernel [############################################] 1/33
EXT3-fs error (device xvda1) in ext3_ordered_writepage: IO failure
In the Dom0, I'll see:
sd 6:0:0:0: timing out command, waited 360s
sd 6:0:0:0: SCSI error: return code = 0x06050000
end_request: I/O error, dev sdc, sector 37319
sd 7:0:0:0: timing out command, waited 360s
sd 7:0:0:0: SCSI error: return code = 0x06000000
end_request: I/O error, dev sdd, sector 29792137
sd 7:0:0:0: timing out command, waited 360s
sd 7:0:0:0: SCSI error: return code = 0x06000000
end_request: I/O error, dev sdd, sector 29792313
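The return codes in these messages can be split into their component bytes, which may help narrow down which layer gave up. The sketch below assumes the usual Linux SCSI mid-layer layout for kernels of this era (driver byte, host byte, message byte, status byte, from most to least significant) and includes only a subset of the names from the kernel's scsi.h; the function name is made up for illustration.

```python
# Assumed layout of the 32-bit SCSI result word:
#   driver_byte << 24 | host_byte << 16 | msg_byte << 8 | status_byte
# Only a subset of the kernel's names is listed here.
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}
DRIVER_BYTES = {
    0x00: "DRIVER_OK",
    0x06: "DRIVER_TIMEOUT",
    0x08: "DRIVER_SENSE",
}


def decode_scsi_result(code):
    """Split a SCSI result word into its four bytes (hypothetical helper)."""
    driver = (code >> 24) & 0xFF
    host = (code >> 16) & 0xFF
    return {
        "driver": DRIVER_BYTES.get(driver, hex(driver)),
        "host": HOST_BYTES.get(host, hex(host)),
        "msg": (code >> 8) & 0xFF,
        "status": code & 0xFF,
    }


# 0x06050000 and 0x06000000 are the codes logged above; 0x00010000
# appears in a later excerpt in this message.
for code in (0x06050000, 0x06000000, 0x00010000):
    print(hex(code), decode_scsi_result(code))
```

Under that layout, both timeout messages carry DRIVER_TIMEOUT in the driver byte, with the sdc one additionally marked DID_ABORT by the host adapter layer, while 0x00010000 decodes to DID_NO_CONNECT.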
Both (all) iSCSI devices have failed. Under iostat I see activity to the
iSCSI block devices, and the whole machine acts mostly I/O blocked (even
the Fedora 10 DomU running on flat files will start throwing nagios into
a tizzy). If I do 'service iscsi stop' everything picks right back up
(though the DomU using them as its disks is obviously unhappy).
When I start iscsi again I can pick right back up (after repairing
filesystems in the DomU), and I can repeat the process at will.
Sometimes the disks will come back as, e.g., sdd and sde, leading me to
think something still has a handle on sdc. But lsof shows nothing in Dom0.
One thing that stood out was that some of the block and sector numbers in
the errors are right on power-of-two boundaries:
scsi 7:0:0:0: SCSI error: return code = 0x00010000
end_request: I/O error, dev sdd, sector 32768
Buffer I/O error on device sdd, logical block 4096
Buffer I/O error on device sdd, logical block 4097
Buffer I/O error on device sdd, logical block 4098
Buffer I/O error on device sdd, logical block 4099
Buffer I/O error on device sdd, logical block 4100
Buffer I/O error on device sdd, logical block 4101
Buffer I/O error on device sdd, logical block 4102
Buffer I/O error on device sdd, logical block 4103
Buffer I/O error on device sdd, logical block 4104
Buffer I/O error on device sdd, logical block 4105
scsi 7:0:0:0: rejecting I/O to dead device
but as I opened with, I'm sort of at a loss as to what is actually
causing the problem. Any suggestions for further troubleshooting and/or
ideas about what's happening appreciated.
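The power-of-two observation can be checked with a little arithmetic. The sketch below assumes 512-byte sectors and a 4096-byte buffer block size; neither is stated in the logs, though 32768 / 4096 = 8 sectors per block is consistent with both. The helper name is made up for illustration.

```python
SECTOR_SIZE = 512    # assumption: bytes per sector in end_request messages
BLOCK_SIZE = 4096    # assumption: bytes per "logical block" in buffer errors


def is_power_of_two(n):
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


failed_sector = 32768   # from "end_request: I/O error, dev sdd, sector 32768"
failed_block = 4096     # from "Buffer I/O error ... logical block 4096"

sector_offset = failed_sector * SECTOR_SIZE
block_offset = failed_block * BLOCK_SIZE

print(is_power_of_two(failed_sector), is_power_of_two(failed_block))
print(sector_offset, block_offset, sector_offset // (1024 * 1024), "MiB")
```

Under those assumptions, sector 32768 and logical block 4096 are the same byte offset, exactly 16 MiB into the device, so the two messages describe one failure point rather than two. The sectors in the earlier timeout messages (37319, 29792137, 29792313) are not powers of two.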
Thanks,
-Bill
--
Bill McGonigle, Owner Work: 603.448.4440
BFC Computing, LLC Home: 603.448.1668
http://www.bfccomputing.com/ Cell: 603.252.2606
Twitter, etc.: bill_mcgonigle Page: 603.442.1833
Email, IM, VOIP: bill at bfccomputing.com
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf