[CentOS-virt] Problems with scsi-target-utils when hosted on dom0 centos 7 xen box
Nathan Coulson
nathan at bravenet.com
Fri Apr 15 20:27:08 UTC 2016
On 2016-04-14 07:10 AM, Hans Loots wrote:
> Hello Nathan, dear all,
>
> > We were attempting to use scsi-target-utils, hosted on a xen dom0 vm
> > using localhost, and running into some problems. I was not able to
> > reproduce this on a centos 7.2 server using the default kernel.
>
> I am seeing comparable things on our centos6 xen servers running 3.18
> kernels. We have about 20 of those machines and started upgrading
> them from 3.10.68 to 3.18 a couple of weeks ago. Now, about three
> quarters of the way through, I'm having second thoughts and am
> thinking about rolling back because of reliability issues.
>
> So far I have made sure that all machines run the latest BIOS and
> Ethernet firmware. The servers in question are Dell PowerEdges from
> different generations, talking to an EqualLogic disk array over 1Gbit
> copper. Dell's toolset is installed, OMSA as well as the HIT kit.
>
> The errors I'm seeing look like this:
>
> Apr 13 23:03:43 xen15-2 iscsid: Kernel reported iSCSI connection 25:0
> error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (1)
> Apr 13 23:03:43 xen15-2 iscsid: Connection25:0 to [target: iqn.xxxxx,
> portal: a.b.c.d,3260] through [iface: eql.em2] is operational now
> Apr 13 23:03:48 xen15-2 iscsid: Connection9:0 to [target: iqn.xxxxx,
> portal: a.b.c.d,3260] through [iface: eql.em2] is shutdown.
>
I did not have interface shutdowns in my tests (well, NetworkManager
was doing something there, but I disabled it for my tests). The hardware
is an old Tyan S2882D motherboard, 8GB RAM, and 2x Opteron 275
processors (dual core).
> The only noticeable difference in dmesg output is stuff like
> this:
> (on 3.18)
> pci 0000:02:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]:
> no compatible bridge window
> pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]:
> no compatible bridge window
> pci 0000:01:00.0: BAR 6: assigned [mem 0x91e80000-0x91efffff pref]
> pci 0000:01:00.1: BAR 6: no space for [mem size 0x00080000 pref]
> pci 0000:01:00.1: BAR 6: failed to assign [mem size 0x00080000 pref]
> pci 0000:01:00.2: BAR 6: no space for [mem size 0x00080000 pref]
> pci 0000:01:00.2: BAR 6: failed to assign [mem size 0x00080000 pref]
> pci 0000:01:00.3: BAR 6: no space for [mem size 0x00080000 pref]
> pci 0000:01:00.3: BAR 6: failed to assign [mem size 0x00080000 pref]
> (and on 3.10)
> pci 0000:00:03.0: BAR 15: assigned [mem 0xd5200000-0xd53fffff pref]
> pci 0000:01:00.1: BAR 6: assigned [mem 0xd5000000-0xd507ffff pref]
> pci 0000:01:00.2: BAR 6: assigned [mem 0xd5080000-0xd50fffff pref]
> pci 0000:01:00.3: BAR 6: assigned [mem 0xd5100000-0xd517ffff pref]
> pci 0000:00:01.0: PCI bridge to [bus 01]
> pci 0000:00:01.0: bridge window [mem 0xd8000000-0xd8ffffff]
> pci 0000:00:01.0: bridge window [mem 0xd5000000-0xd51fffff pref]
>
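For reference, "BAR 6" in these messages is the expansion ROM resource of
each PCI function. A minimal sketch of the arithmetic, using only the
addresses quoted in the 3.10 output above (the function and variable names
are illustrative): it checks that the three 0x80000-byte ROM BARs fit
inside the prefetchable bridge window that 3.10 reports. It does not
explain why 3.18 finds "no space"; it only makes the sizes explicit.

```python
def window_size(start, end):
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1


# Prefetchable bridge window on 0000:00:01.0 (3.10 dmesg above).
pref_window = window_size(0xD5000000, 0xD51FFFFF)

# BAR 6 assignments for functions 01:00.1, .2 and .3 (3.10 dmesg above).
rom_bars = [
    window_size(0xD5000000, 0xD507FFFF),
    window_size(0xD5080000, 0xD50FFFFF),
    window_size(0xD5100000, 0xD517FFFF),
]

print(f"bridge window: {pref_window // 1024} KiB")
print(f"ROM BARs:      {[s // 1024 for s in rom_bars]} KiB each")
print(f"fits:          {sum(rom_bars) <= pref_window}")
```

Each ROM BAR is 512 KiB and the window is 2 MiB, so on 3.10 all three fit
with room to spare.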
> But to be honest, my knowledge as to the possible cause of this is
> lacking. Is this just a small ACPI-related glitch, or is it a sign
> that the ethernet cards are misbehaving somehow?
>
> Are more people seeing errors in this area?
>
> Thx and regards,
> -- Hans (just trying to make sense of it all)
>
>
> 2016-04-11 22:14 GMT+02:00 Nathan Coulson <nathan at bravenet.com>:
>
> Hello
>
> We were attempting to use scsi-target-utils, hosted on a xen dom0
> vm using localhost, and running into some problems. I was not
> able to reproduce this on a centos 7.2 server using the default
> kernel.
>
>
> (From dmesg)
> Apr 4 11:18:42 funk kernel: [ 596.511204] connection2:0:
> detected conn error (1022)
> Apr 4 11:18:42 funk kernel: connection2:0: ping timeout of 5 secs
> expired, recv timeout 5, last rx 4295253788, last ping 4295258790,
> now 4295263808
> Apr 4 11:18:42 funk kernel: connection2:0: detected conn error (1022)
> Apr 4 11:18:42 funk iscsid: Kernel reported iSCSI connection 2:0
> error (1022 - Invalid or unknown error code) state (3)
> Apr 4 11:18:44 funk iscsid: connection2:0 is operational after
> recovery (1 attempts)
>
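The three counters in the "ping timeout" line above are kernel jiffies, so
the gaps between them can be turned into seconds. A small sketch, assuming
a tick rate of HZ=1000 (an assumption; the real value depends on the
kernel's CONFIG_HZ):

```python
# Jiffies values copied from the "ping timeout" log line above.
LAST_RX = 4295253788
LAST_PING = 4295258790
NOW = 4295263808

# Assumption: kernel tick rate. Check CONFIG_HZ on the actual kernel.
ASSUMED_HZ = 1000


def jiffies_to_seconds(delta, hz=ASSUMED_HZ):
    """Convert a jiffies difference to seconds for a given tick rate."""
    return delta / hz


rx_to_ping = jiffies_to_seconds(LAST_PING - LAST_RX)
ping_to_now = jiffies_to_seconds(NOW - LAST_PING)

print(f"last rx   -> last ping: {rx_to_ping:.3f} s")
print(f"last ping -> now:       {ping_to_now:.3f} s")
```

Under that assumption both gaps come out at roughly 5 seconds, which would
be consistent with the 5 second recv timeout and ping timeout named in the
log: nothing was received for about 5 s, a ping was sent, and no reply
arrived for another 5 s or so.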
> Repeated a few times, until eventually
>
>
> Apr 4 11:19:44 funk kernel: Result:
> hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
> Apr 4 11:19:44 funk kernel: Write(10): 2a 00 01 df c7 e8 00 00 18 00
> Apr 4 11:19:44 funk kernel: blk_update_request: I/O error, dev
> sdd, sector 31442920
> Apr 4 11:19:44 funk kernel: [ 658.127596] sd 7:0:0:1: [sdd]
> Apr 4 11:19:44 funk kernel: [ 658.127688] Result:
> hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Apr 4 11:19:44 funk kernel: [ 658.127761] sd 7:0:0:1: [sdd] CDB:
> Apr 4 11:19:44 funk kernel: [ 658.127826] Write(10): 2a 00 01 df
> c7 e8 00 00 18 00
> Apr 4 11:19:44 funk kernel: [ 658.127927] blk_update_request:
> I/O error, dev sdd, sector 31442920
> Apr 4 11:19:44 funk kernel: [ 658.128040] sd 7:0:0:1: [sdd]
> Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd]
> Apr 4 11:19:44 funk kernel: [ 658.128105] Result:
> hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Apr 4 11:19:44 funk kernel: [ 658.128177] sd 7:0:0:1: [sdd] CDB:
> Apr 4 11:19:44 funk kernel: [ 658.128241] Write(10): 2a 00 00 00
> 08 00 00 00 18 00
> Apr 4 11:19:44 funk kernel: [ 658.128339] blk_update_request:
> I/O error, dev sdd, sector 2048
> Apr 4 11:19:44 funk kernel: Result:
> hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
> Apr 4 11:19:44 funk kernel: Write(10): 2a 00 00 00 08 00 00 00 18 00
> Apr 4 11:19:44 funk kernel: blk_update_request: I/O error, dev
> sdd, sector 2048
>
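The CDB bytes in the log above can be decoded to confirm which blocks the
failed writes targeted. A minimal sketch (the helper name is illustrative)
that unpacks the standard 10-byte WRITE(10) layout: opcode 0x2a, a flags
byte, a 4-byte big-endian LBA, a group byte, a 2-byte transfer length, and
a control byte.

```python
import struct


def decode_write10(cdb_hex):
    """Decode a WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, transfer_length_in_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    # opcode, flags, LBA (u32), group number, transfer length (u16), control
    _op, _flags, lba, _group, length, _ctrl = struct.unpack(">BBIBHB", cdb)
    return lba, length


# The two distinct CDBs that appear in the log above.
for cdb in ("2a 00 01 df c7 e8 00 00 18 00",
            "2a 00 00 00 08 00 00 00 18 00"):
    lba, nblocks = decode_write10(cdb)
    print(f"{cdb}: LBA {lba}, {nblocks} blocks")
```

The decoded LBAs are 31442920 and 2048, matching the sectors in the
blk_update_request lines, and both writes are 24 blocks long (12 KiB if
the logical block size is 512 bytes, which is an assumption here).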
>
> (Test Setup)
> scsi-target-utils installed via yum, default config
> /etc/tgt/conf.d/xenguests.conf
> <target iqn.2016-02.com.bravenet:test>
> backing-store /mnt/vmdisk/test # vm image
> </target>
>
> systemctl restart tgtd
>
> iscsiadm -m discovery -t sendtargets -p localhost
>
> iscsiadm -m node -T iqn.2016-02.com.bravenet:test -l
>
>
> add it to lvm (pvcreate, vgcreate), let's call it
> /dev/vmdisk.vg/test.lv
>
> and then use libvirt to attempt to install an os on
> /dev/vmdisk.vg/test.lv (using anaconda)
>
>
>
>
> The conn errors start around the time it tries to create the disk
> label, and continue until it eventually gives up trying to create
> the disk label.
>
>
>
> We tested a similar setup on a centos 7.2 host that we use for
> kvm-based virtual machine hosting (default 3.10 kernel), and it worked
> fine. It may be similar to what was reported at
> https://bugzilla.redhat.com/show_bug.cgi?id=1245990, but I never
> saw a resolution of what they discovered (other than a reference
> to comment 18, which does not appear to exist).
>
> Testing over the network also appears to work (where another
> machine connects to scsi-target-utils on the funk server
> above).
>
>
>
>
>
> The long-term purpose of the above setup was to get direct access to a
> filesystem image hosted on a gluster setup, using bs-type glfs on
> scsi-target-utils.
>
> --
> Nathan Coulson
> www.bravenet.com
> nathan at bravenet.com
> _______________________________________________
> CentOS-virt mailing list
> CentOS-virt at centos.org
> https://lists.centos.org/mailman/listinfo/centos-virt
--
Nathan Coulson
System Administrator for Bravenet
www.bravenet.com
nathan at bravenet.com