[previously sent to rhelv5 list, apologies to those on both]

I've got a problem I can reproduce easily enough, but I really don't understand what's going wrong.

I've got a 5.3 Dom0, which is running three guests. One is Fedora 10, which runs from local flat files and works fine. One is Nexenta 2 (opensolaris-based), which runs off of physical partitions and seems to work great. The third runs Fedora 11 and has, for its disks, iSCSI devices that are exported from Nexenta (ZFS-backed).

I have the Dom0 mapping the two iSCSI devices, one for /boot and one for /. They're showing up initially as /dev/sdc and /dev/sdd.

If I go after the iSCSI devices in Dom0, with dd, for instance, they work fine all day. I can read and write the entire devices to and from local files without error. iSCSI seems to work properly in that regard. I'm getting about 38MB/s. I've scrubbed the disk pool and no errors were found, and a long SMART self-test passed on each of the disks. I've also been able to mount both iSCSI devices and run bonnie++ on them successfully from Dom0.

So, I specify those devices in the Xen config for the DomU (tried both the real device names and the /dev/disk/by-path/ names) and the DomU boots and operates as I'd expect. Installation worked fine and typical (low volume) operations work.

However, when I then try to do something that I'm assuming is more disk intensive, like running a yum update, iSCSI seems to fall over. In the DomU, I'll see a lock-up, and then filesystem errors.
e.g.:

  Installing : kernel [############################################ ] 1/33EXT3-fs error (device xvda1) in ext3_ordered_writepage: IO failure

In the Dom0, I'll see:

  sd 6:0:0:0: timing out command, waited 360s
  sd 6:0:0:0: SCSI error: return code = 0x06050000
  end_request: I/O error, dev sdc, sector 37319
  sd 7:0:0:0: timing out command, waited 360s
  sd 7:0:0:0: SCSI error: return code = 0x06000000
  end_request: I/O error, dev sdd, sector 29792137
  sd 7:0:0:0: timing out command, waited 360s
  sd 7:0:0:0: SCSI error: return code = 0x06000000
  end_request: I/O error, dev sdd, sector 29792313

Both (all) iSCSI devices end up failed. Under iostat I see activity to the iSCSI block devices, and the whole machine acts mostly I/O blocked (even the Fedora 10 DomU running on flat files will start throwing nagios into a tizzy).

If I do 'service iscsi stop', everything picks right back up (though the DomU using them as its disks is obviously unhappy). When I start iscsi again I can pick right back up (after repairing filesystems in the DomU), and I can repeat the process at will.

Sometimes the disks will come back as, e.g., sdd and sde, leading me to think something still has a handle on sdc. But lsof shows nothing in Dom0.
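For reading the "return code" values in the Dom0 log above, here is a rough Python sketch that splits the 32-bit SCSI result word into its status, message, host and driver bytes. The lookup tables are transcribed from memory of the 2.6-era kernel header include/scsi/scsi.h, so treat the names as an assumption and check them against your own kernel source. decode_scsi_result is just an illustrative helper name, not anything from the kernel or open-iscsi.

```python
# Split a Linux SCSI midlayer result word into its four bytes.
# Layout: driver byte (bits 24-31), host byte (16-23),
# message byte (8-15), status byte (0-7).
# ASSUMPTION: the name tables below follow 2.6-era include/scsi/scsi.h
# as remembered; verify against the kernel you are actually running.

HOST_BYTE = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}

DRIVER_BYTE = {
    0x00: "DRIVER_OK",
    0x01: "DRIVER_BUSY",
    0x02: "DRIVER_SOFT",
    0x03: "DRIVER_MEDIA",
    0x04: "DRIVER_ERROR",
    0x05: "DRIVER_INVALID",
    0x06: "DRIVER_TIMEOUT",
    0x07: "DRIVER_HARD",
    0x08: "DRIVER_SENSE",
}


def decode_scsi_result(code):
    """Return the four fields of a SCSI result word as a dict."""
    host = (code >> 16) & 0xFF
    # In older kernels the high nibble of the driver byte carries
    # "suggest" bits, so only the low nibble is looked up here.
    driver = (code >> 24) & 0x0F
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": HOST_BYTE.get(host, "unknown (0x%02x)" % host),
        "driver": DRIVER_BYTE.get(driver, "unknown (0x%02x)" % driver),
    }


if __name__ == "__main__":
    # The two codes from the Dom0 log above.
    for code in (0x06050000, 0x06000000):
        print("0x%08x" % code, decode_scsi_result(code))
```

If those tables are right, both codes carry a driver-level timeout with a zero SCSI status byte, which would agree with the "timing out command, waited 360s" lines: the command never completed, as opposed to the target answering with an error status.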

One thing that stood out was that some of the block and sector numbers in the errors fall right on power-of-two boundaries:

  scsi 7:0:0:0: SCSI error: return code = 0x00010000
  end_request: I/O error, dev sdd, sector 32768
  Buffer I/O error on device sdd, logical block 4096
  Buffer I/O error on device sdd, logical block 4097
  Buffer I/O error on device sdd, logical block 4098
  Buffer I/O error on device sdd, logical block 4099
  Buffer I/O error on device sdd, logical block 4100
  Buffer I/O error on device sdd, logical block 4101
  Buffer I/O error on device sdd, logical block 4102
  Buffer I/O error on device sdd, logical block 4103
  Buffer I/O error on device sdd, logical block 4104
  Buffer I/O error on device sdd, logical block 4105
  scsi 7:0:0:0: rejecting I/O to dead device

But as I said at the start, I'm at a loss as to what is actually causing the problem.

Any suggestions for further troubleshooting and/or ideas about what's happening would be appreciated.

Thanks,
-Bill

-- 
Bill McGonigle, Owner           Work: 603.448.4440
BFC Computing, LLC              Home: 603.448.1668
http://www.bfccomputing.com/    Cell: 603.252.2606
Twitter, etc.: bill_mcgonigle   Page: 603.442.1833
Email, IM, VOIP: bill at bfccomputing.com
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf
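P.S. A quick arithmetic check on the power-of-two observation, as a small Python sketch. It assumes 512-byte sectors and 4096-byte buffer blocks; neither size is stated in the logs, the 4096 figure is only inferred from the ratio 32768 / 4096 = 8 sectors per block. The helper names are made up for illustration.

```python
# Check which of the failing sector numbers quoted in the logs are
# powers of two, and relate sector 32768 to logical block 4096.
# ASSUMPTIONS: 512-byte sectors and 4096-byte buffer blocks.

SECTOR_BYTES = 512
BLOCK_BYTES = 4096


def is_power_of_two(n):
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def sector_to_block(sector, block_bytes=BLOCK_BYTES, sector_bytes=SECTOR_BYTES):
    """Map a sector number to the buffer block that contains it."""
    return (sector * sector_bytes) // block_bytes


if __name__ == "__main__":
    # Sector numbers taken from the end_request lines in the logs.
    for sector in (37319, 29792137, 29792313, 32768):
        print(
            "sector %d: power of two=%s, block=%d, byte offset=%d"
            % (
                sector,
                is_power_of_two(sector),
                sector_to_block(sector),
                sector * SECTOR_BYTES,
            )
        )
```

Under those assumptions, sector 32768 and logical block 4096 are the same spot on the disk, 16 MiB in, so the two messages describe one failure rather than two. The sectors from the earlier timeouts (37319, 29792137, 29792313) are not powers of two.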