On 2016-04-14 07:10 AM, Hans Loots wrote:
> Hello Nathan, dear all,
>
> > We were attempting to use scsi-target-utils, hosted on a xen dom0 vm
> > using localhost, and running into some problems. I was not able to
> > reproduce this on a centos 7.2 server using the default kernel.
>
> I am seeing comparable things on our centos6 xen servers running 3.18
> kernels. We have about 20 of those machines running and started
> upgrading them from 3.10.68 to 3.18 a couple of weeks ago. But
> currently, with 3/4 of them done, I'm having second thoughts and am
> thinking about rolling back because of reliability issues.
>
> What I've tried so far is making sure that all machines run the latest
> BIOSes and ethernet firmware. The servers in question are Dell
> PowerEdges from different generations, talking to an Equallogic
> disk array over 1Gbit copper. Dell's toolset is installed, OMSA as
> well as hitkit.
>
> The errors I'm seeing look like this:
>
> Apr 13 23:03:43 xen15-2 iscsid: Kernel reported iSCSI connection 25:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (1)
> Apr 13 23:03:43 xen15-2 iscsid: Connection25:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is operational now
> Apr 13 23:03:48 xen15-2 iscsid: Connection9:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is shutdown.

I did not have interface shutdowns in my tests (well, Network Manager
was doing something there, but I disabled it for my tests). The hardware
is an old Tyan S2882D motherboard, 8GB RAM, and 2x Opteron 275
processors (dual core).
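
To compare error patterns between hosts, it may help to tally the iscsid
"Kernel reported iSCSI connection ... error" lines per connection and
error code. The following is only a rough sketch: the regular expression
is written against the message format seen in the logs quoted in this
thread, and the function name tally_iscsi_errors is made up for
illustration.

```python
import re
from collections import Counter

# Pattern modelled on the iscsid lines quoted in this thread (an assumption:
# other iscsid versions may word the message differently).
ISCSID_ERR = re.compile(
    r"iscsid: Kernel reported iSCSI connection (?P<conn>\d+:\d+) "
    r"error \((?P<code>\d+) - (?P<desc>[^)]*)\) state \((?P<state>\d+)\)"
)


def tally_iscsi_errors(lines):
    """Count iscsid connection errors per (connection, error code)."""
    counts = Counter()
    for line in lines:
        match = ISCSID_ERR.search(line)
        if match:
            counts[(match.group("conn"), int(match.group("code")))] += 1
    return counts


# Sample lines copied from the logs in this thread (re-joined onto one line).
sample = [
    "Apr 13 23:03:43 xen15-2 iscsid: Kernel reported iSCSI connection 25:0 "
    "error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (1)",
    "Apr 13 23:03:43 xen15-2 iscsid: Connection25:0 to [target: iqn.xxxxx, "
    "portal: a.b.c.d,3260] through [iface: eql.em2] is operational now",
    "Apr 4 11:18:42 funk iscsid: Kernel reported iSCSI connection 2:0 "
    "error (1022 - Invalid or unknown error code) state (3)",
]

if __name__ == "__main__":
    for (conn, code), n in sorted(tally_iscsi_errors(sample).items()):
        print(f"connection {conn}: error {code} seen {n} time(s)")
```

In practice the lines would come from /var/log/messages (or wherever
syslog writes on the host) rather than from a hard-coded list.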
> While the only noticeable difference in dmesg output is stuff like
> this:
> (on 3.18)
> pci 0000:02:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window
> pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window
> pci 0000:01:00.0: BAR 6: assigned [mem 0x91e80000-0x91efffff pref]
> pci 0000:01:00.1: BAR 6: no space for [mem size 0x00080000 pref]
> pci 0000:01:00.1: BAR 6: failed to assign [mem size 0x00080000 pref]
> pci 0000:01:00.2: BAR 6: no space for [mem size 0x00080000 pref]
> pci 0000:01:00.2: BAR 6: failed to assign [mem size 0x00080000 pref]
> pci 0000:01:00.3: BAR 6: no space for [mem size 0x00080000 pref]
> pci 0000:01:00.3: BAR 6: failed to assign [mem size 0x00080000 pref]
> (and on 3.10)
> pci 0000:00:03.0: BAR 15: assigned [mem 0xd5200000-0xd53fffff pref]
> pci 0000:01:00.1: BAR 6: assigned [mem 0xd5000000-0xd507ffff pref]
> pci 0000:01:00.2: BAR 6: assigned [mem 0xd5080000-0xd50fffff pref]
> pci 0000:01:00.3: BAR 6: assigned [mem 0xd5100000-0xd517ffff pref]
> pci 0000:00:01.0: PCI bridge to [bus 01]
> pci 0000:00:01.0: bridge window [mem 0xd8000000-0xd8ffffff]
> pci 0000:00:01.0: bridge window [mem 0xd5000000-0xd51fffff pref]
>
> But to be honest, my knowledge as to the possible cause of this is
> lacking. Is this just a small ACPI related glitch or is it a sign
> that the ethernet cards are misbehaving somehow?
>
> Are more people seeing errors in this area?
>
> Thx and regards,
> -- Hans (just trying to make sense of it all)
>
>
> 2016-04-11 22:14 GMT+02:00 Nathan Coulson <nathan at bravenet.com>:
>
> > Hello
> >
> > We were attempting to use scsi-target-utils, hosted on a xen dom0
> > vm using localhost, and running into some problems. I was not
> > able to reproduce this on a centos 7.2 server using the default
> > kernel.
> >
> > (From dmesg)
> > Apr 4 11:18:42 funk kernel: [ 596.511204] connection2:0: detected conn error (1022)
> > Apr 4 11:18:42 funk kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295253788, last ping 4295258790, now 4295263808
> > Apr 4 11:18:42 funk kernel: connection2:0: detected conn error (1022)
> > Apr 4 11:18:42 funk iscsid: Kernel reported iSCSI connection 2:0 error (1022 - Invalid or unknown error code) state (3)
> > Apr 4 11:18:44 funk iscsid: connection2:0 is operational after recovery (1 attempts)
> >
> > Repeated a few times, until eventually
> >
> > Apr 4 11:19:44 funk kernel: Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
> > Apr 4 11:19:44 funk kernel: Write(10): 2a 00 01 df c7 e8 00 00 18 00
> > Apr 4 11:19:44 funk kernel: blk_update_request: I/O error, dev sdd, sector 31442920
> > Apr 4 11:19:44 funk kernel: [ 658.127596] sd 7:0:0:1: [sdd]
> > Apr 4 11:19:44 funk kernel: [ 658.127688] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Apr 4 11:19:44 funk kernel: [ 658.127761] sd 7:0:0:1: [sdd] CDB:
> > Apr 4 11:19:44 funk kernel: [ 658.127826] Write(10): 2a 00 01 df c7 e8 00 00 18 00
> > Apr 4 11:19:44 funk kernel: [ 658.127927] blk_update_request: I/O error, dev sdd, sector 31442920
> > Apr 4 11:19:44 funk kernel: [ 658.128040] sd 7:0:0:1: [sdd]
> > Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd]
> > Apr 4 11:19:44 funk kernel: [ 658.128105] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Apr 4 11:19:44 funk kernel: [ 658.128177] sd 7:0:0:1: [sdd] CDB:
> > Apr 4 11:19:44 funk kernel: [ 658.128241] Write(10): 2a 00 00 00 08 00 00 00 18 00
> > Apr 4 11:19:44 funk kernel: [ 658.128339] blk_update_request: I/O error, dev sdd, sector 2048
> > Apr 4 11:19:44 funk kernel: Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
> > Apr 4 11:19:44 funk kernel: Write(10): 2a 00 00 00 08 00 00 00 18 00
> > Apr 4 11:19:44 funk kernel: blk_update_request: I/O error, dev sdd, sector 2048
> >
> >
> > (Test Setup)
> > scsi-target-utils installed via yum, default config
> > /etc/tgt/conf.d/xenguests.conf
> > <target iqn.2016-02.com.bravenet:test>
> >     backing-store //mnt/vmdisk/test # vm image
> > </target>
> >
> > systemctl restart tgtd
> >
> > iscsiadm -m discovery -t sendtargets -p localhost
> >
> > iscsiadm -m node -T iqn.2016-02.com.bravenet:test -l
> >
> > add it to lvm (pvcreate, vgcreate), let's call it
> > /dev/vmdisk.vg/test.lv
> >
> > and then use libvirt to attempt to install an os on
> > /dev/vmdisk.vg/test.lv (using anaconda)
> >
> > The conn errors start around the time it tries to create the disk
> > label, and continue until it eventually gives up trying to create
> > the disk label.
> >
> > We tested a similar setup on a centos 7.2 host we use for kvm based
> > virtual machine hosting (default 3.10 kernel), and it worked
> > fine. It may be similar to what was reported on
> > https://bugzilla.redhat.com/show_bug.cgi?id=1245990, but I never
> > saw a resolution on what they discovered (other than a reference
> > to comment18 which does not appear to exist).
> >
> > Testing over the network also appears to work (where another
> > machine connects to scsi-target-utils on the funk server above).
> >
> > The long-term purpose of the above setup was to get direct access
> > to a filesystem image hosted on a gluster setup, using bs-type glfs
> > on scsi-target-utils.
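
As a sanity check on the log above, the failing commands and the ping
timeout can be decoded by hand. The sketch below reads the WRITE(10) CDB
bytes (big-endian logical block address in bytes 2-5, transfer length in
bytes 7-8) and converts the jiffies values from the "ping timeout" line
into seconds. Two assumptions that are not stated in the log: the kernel
tick rate is HZ=1000, and the device uses 512-byte logical blocks (so
the LBA equals the sector number the kernel prints). The function names
are made up for illustration.

```python
def decode_write10(cdb_hex):
    """Return (lba, block_count) from a WRITE(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return lba, blocks


def ping_gaps_seconds(last_rx, last_ping, now, hz=1000):
    """Convert the jiffies in the 'ping timeout' line to seconds.

    hz=1000 is an assumption about the kernel configuration.
    Returns (time from last rx to the ping, time from the ping to now).
    """
    return (last_ping - last_rx) / hz, (now - last_ping) / hz


if __name__ == "__main__":
    # CDBs copied from the kernel log above.
    for cdb in ("2a 00 01 df c7 e8 00 00 18 00",
                "2a 00 00 00 08 00 00 00 18 00"):
        lba, blocks = decode_write10(cdb)
        print(f"{cdb}: LBA {lba}, {blocks} blocks")

    # Jiffies copied from the 'ping timeout of 5 secs expired' line.
    idle, waited = ping_gaps_seconds(4295253788, 4295258790, 4295263808)
    print(f"rx idle before ping: {idle:.3f}s, waited for reply: {waited:.3f}s")
```

The decoded LBAs (31442920 and 2048) match the sectors in the
blk_update_request lines, and each write covers 24 blocks. Under the
HZ=1000 assumption, the ping went out about 5.0 s after the last
received data and the timeout fired about 5.0 s after that, which is
consistent with the 5-second recv and ping timeouts in the message.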
> >
> > --
> > Nathan Coulson
> > www.bravenet.com
> > nathan at bravenet.com
> > _______________________________________________
> > CentOS-virt mailing list
> > CentOS-virt at centos.org
> > https://lists.centos.org/mailman/listinfo/centos-virt

-- 
Nathan Coulson
System Administrator for Bravenet
www.bravenet.com
nathan at bravenet.com