Problems with scsi-target-utils when hosted on dom0 centos 7 xen box

Nathan Coulson

11 Apr 2016

8:14 p.m.

Hello

We were attempting to use scsi-target-utils hosted on a Xen dom0 and accessed over localhost, and ran into some problems. I was not able to reproduce this on a CentOS 7.2 server using the default kernel.

(From dmesg)
Apr 4 11:18:42 funk kernel: [ 596.511204] connection2:0: detected conn error (1022)
Apr 4 11:18:42 funk kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295253788, last ping 4295258790, now 4295263808
Apr 4 11:18:42 funk kernel: connection2:0: detected conn error (1022)
Apr 4 11:18:42 funk iscsid: Kernel reported iSCSI connection 2:0 error (1022 - Invalid or unknown error code) state (3)
Apr 4 11:18:44 funk iscsid: connection2:0 is operational after recovery (1 attempts)
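As a rough cross-check of the ping-timeout line above: the "last rx", "last ping" and "now" values are kernel jiffies counters. The sketch below assumes a tick rate of HZ=1000, which the thread does not state; the helper name jiffies_to_ms is made up for illustration. Under that assumption both gaps come out at roughly 5 seconds, which lines up with the "ping timeout of 5 secs" and "recv timeout 5" in the message.

```shell
#!/bin/sh
# Convert a jiffies delta to whole milliseconds.
# HZ=1000 is an assumption about this kernel's tick rate; pass another value if needed.
jiffies_to_ms() {
    # $1 = later jiffies value, $2 = earlier jiffies value, $3 = HZ (default 1000)
    hz=${3:-1000}
    echo $(( ($1 - $2) * 1000 / hz ))
}

# Values copied from the "ping timeout" line above.
last_rx=4295253788
last_ping=4295258790
now=4295263808

echo "rx -> ping:  $(jiffies_to_ms "$last_ping" "$last_rx") ms"
echo "ping -> now: $(jiffies_to_ms "$now" "$last_ping") ms"
```

In other words, nothing was received for about 5 s, a ping was sent, and another 5 s passed without a reply before the connection was declared failed.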

Repeated a few times, until eventually

Apr 4 11:19:44 funk kernel: Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
Apr 4 11:19:44 funk kernel: Write(10): 2a 00 01 df c7 e8 00 00 18 00
Apr 4 11:19:44 funk kernel: blk_update_request: I/O error, dev sdd, sector 31442920
Apr 4 11:19:44 funk kernel: [ 658.127596] sd 7:0:0:1: [sdd]
Apr 4 11:19:44 funk kernel: [ 658.127688] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr 4 11:19:44 funk kernel: [ 658.127761] sd 7:0:0:1: [sdd] CDB:
Apr 4 11:19:44 funk kernel: [ 658.127826] Write(10): 2a 00 01 df c7 e8 00 00 18 00
Apr 4 11:19:44 funk kernel: [ 658.127927] blk_update_request: I/O error, dev sdd, sector 31442920
Apr 4 11:19:44 funk kernel: [ 658.128040] sd 7:0:0:1: [sdd]
Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd]
Apr 4 11:19:44 funk kernel: [ 658.128105] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr 4 11:19:44 funk kernel: [ 658.128177] sd 7:0:0:1: [sdd] CDB:
Apr 4 11:19:44 funk kernel: [ 658.128241] Write(10): 2a 00 00 00 08 00 00 00 18 00
Apr 4 11:19:44 funk kernel: [ 658.128339] blk_update_request: I/O error, dev sdd, sector 2048
Apr 4 11:19:44 funk kernel: Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr 4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
Apr 4 11:19:44 funk kernel: Write(10): 2a 00 00 00 08 00 00 00 18 00
Apr 4 11:19:44 funk kernel: blk_update_request: I/O error, dev sdd, sector 2048
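For readers matching the CDB bytes to the failed sectors: in a WRITE(10) command, bytes 2 to 5 carry the big-endian logical block address and bytes 7 to 8 the transfer length in blocks. The sketch below (decode_write10 is an illustrative helper, not part of any tool mentioned here) decodes the two CDBs from the log, and the resulting addresses agree with the sector numbers in the blk_update_request lines.

```shell
#!/bin/sh
# Decode a 10-byte WRITE(10) CDB given as space-separated hex bytes.
# Layout: byte 0 = opcode (0x2a), bytes 2-5 = LBA (big-endian),
# bytes 7-8 = transfer length in blocks (big-endian).
decode_write10() {
    # Intentional word splitting: one positional parameter per byte.
    set -- $1
    if [ "$#" -ne 10 ] || [ "$1" != "2a" ]; then
        echo "not a WRITE(10) CDB" >&2
        return 1
    fi
    lba=$(( 0x$3$4$5$6 ))
    blocks=$(( 0x$8$9 ))
    echo "lba=$lba blocks=$blocks"
}

decode_write10 "2a 00 01 df c7 e8 00 00 18 00"
decode_write10 "2a 00 00 00 08 00 00 00 18 00"
```

Both failed writes are 24 blocks long, one at block 31442920 and one at block 2048, which is consistent with the installer writing a partition table near the start of the disk.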

(Test Setup)
scsi-target-utils installed via yum, default config

/etc/tgt/conf.d/xenguests.conf:
<target iqn.2016-02.com.bravenet:test>
    backing-store //mnt/vmdisk/test # vm image
</target>

systemctl restart tgtd

iscsiadm -m discovery -t sendtargets -p localhost

iscsiadm -m node -T iqn.2016-02.com.bravenet:test -l

Add it to LVM (pvcreate, vgcreate); let's call it /dev/vmdisk.vg/test.lv

and then use libvirt to attempt to install an OS on /dev/vmdisk.vg/test.lv (using anaconda).

The conn errors start around the time it tries to create the disk label, and eventually it gives up trying to create the disk label.
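The setup steps above can be collected into one script. This is only a sketch: it prints the commands by default (dry run), the device name /dev/sdd is taken from the dmesg output and may differ on another host, and the lvcreate size is an assumption because the thread does not give one.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. By default it only prints the
# commands; set DRY_RUN=0 and run as root to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_steps() {
    target=iqn.2016-02.com.bravenet:test   # IQN from the tgt config above
    dev=/dev/sdd                           # assumed from the dmesg output; check with lsblk
    run systemctl restart tgtd
    run iscsiadm -m discovery -t sendtargets -p localhost
    run iscsiadm -m node -T "$target" -l
    run pvcreate "$dev"
    run vgcreate vmdisk.vg "$dev"
    # The LV size is not given in the thread; 100%FREE is an assumption.
    run lvcreate -l 100%FREE -n test.lv vmdisk.vg
}

repro_steps
```

The guest installation itself (libvirt plus anaconda) is left out, since the thread does not show the virt-install options that were used.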

We tested a similar setup on a CentOS 7.2 host that we use for KVM-based virtual machine hosting (default 3.10 kernel), and it worked fine. It may be similar to what was reported at https://bugzilla.redhat.com/show_bug.cgi?id=1245990, but I never saw a resolution to what they discovered (other than a reference to comment 18, which does not appear to exist).

Testing over the network also appears to work (where another machine connects to scsi-target-utils on the funk server above).

The long-term purpose of the above setup was to get direct access to a filesystem image hosted on a Gluster setup, using bs-type glfs on scsi-target-utils.

-- Nathan Coulson www.bravenet.com nathan@bravenet.com

George Dunlap

12 Apr

10:26 a.m.

On Mon, Apr 11, 2016 at 9:14 PM, Nathan Coulson nathan@bravenet.com wrote:

Hello

We were attempting to use scsi-target-utils, hosted on a xen dom0 vm using localhost, and running into some problems. I was not able to reproduce this on a centos 7.2 server using the default kernel.

Have you tried booting the Virt SIG kernel natively and seeing if you can reproduce the problem at all?

Thanks, -George

Nathan Coulson

4:43 p.m.

By natively, I take it you mean booting with kernel /vmlinuz (vs. kernel /xen)?

Not yet, but I am working on setting up such an environment.

(So far I have been using virt-install to reproduce the problem; the original server we are testing on does not support KVM, but the second server does.)

-- Nathan Coulson System Administrator for Bravenet www.bravenet.com nathan@bravenet.com

Nathan Coulson

13 Apr

12:18 a.m.

(Apologies for the earlier top post)

Running the kernel natively, on 3.10 or 3.18 (kernel from the Virt SIG):
* CentOS Linux (3.18.25-19.el7.x86_64) 7 (Core)
* CentOS Linux (3.10.0-327.13.1.el7.x86_64) 7 (Core)

it works as expected with no issues.

But when booting as dom0 using the Xen hypervisor:
* CentOS Linux, with Xen hypervisor

(tested with dom0_mem=2048M,max:2048M as well as 1024M)

the problems occur as I have described.
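When switching between boot entries like this, it helps to record which mode a given test run was actually in. A minimal sketch, assuming the usual Linux interfaces /sys/hypervisor/type and /proc/xen/capabilities are present when running under Xen; the function name boot_mode and its labels are made up for illustration.

```shell
#!/bin/sh
# Report whether the running kernel is under Xen, and if so whether it is dom0.
# The optional argument is a root prefix, used only so the function can be
# exercised against a fake tree; it defaults to the real filesystem.
boot_mode() {
    root=${1:-}
    if [ "$(cat "$root/sys/hypervisor/type" 2>/dev/null)" != "xen" ]; then
        echo "not-xen"
    elif grep -q control_d "$root/proc/xen/capabilities" 2>/dev/null; then
        echo "xen-dom0"
    else
        echo "xen-domU"
    fi
}

echo "kernel $(uname -r): $(boot_mode)"
```

Pasting that one line of output next to each dmesg excerpt makes it unambiguous which kernel and which boot mode produced it.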

-- Nathan Coulson www.bravenet.com nathan@bravenet.com

Sarah Newman

12:53 a.m.

This probably isn't it, but I once had a problem where the workaround was to give the dom0 4096M or more so that 64-bit references were used instead of 32-bit ones. You could try that and see what happens.
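To try a different dom0 memory size, the dom0_mem token on the Xen command line has to change. The sketch below only rewrites that token in text it reads from stdin; the variable GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub is an assumption about where this particular host keeps its Xen boot options, and set_dom0_mem is an illustrative helper name.

```shell
#!/bin/sh
# Rewrite the dom0_mem=... token in text read from stdin.
# $1 = new size, e.g. 4096M. Only the dom0_mem token is touched.
set_dom0_mem() {
    sed -E "s/dom0_mem=[^ \"]+/dom0_mem=$1,max:$1/"
}

# Example line; GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub is an assumption
# about where this host keeps its Xen boot options.
echo 'GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2048M,max:2048M"' | set_dom0_mem 4096M
```

After changing the real file, the GRUB configuration would need to be regenerated (on CentOS 7 typically with grub2-mkconfig) and the host rebooted for the new value to take effect.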

--Sarah

Nathan Coulson

1:31 a.m.

Unfortunately, no luck there (tested with 4608 MB), but thank you.

-- Nathan Coulson System Administrator for Bravenet www.bravenet.com nathan@bravenet.com

George Dunlap

14 Apr

2:44 p.m.

Thanks Nathan. It looks like this may be an issue with upstream Xen, then. Would you be willing to re-post this bug report (with the information that it works running the same kernel on native) to xen-users? I can make sure it's seen by the appropriate kernel-side developers if necessary.

Thanks, -George

Nathan Coulson

18 Apr

4:19 p.m.

Thank you George, I have sent a revised copy to them as well: http://lists.xen.org/archives/html/xen-users/2016-04/msg00058.html

-- Nathan Coulson System Administrator for Bravenet www.bravenet.com nathan@bravenet.com

Hans Loots

14 Apr

2:10 p.m.

Hello Nathan, dear all,

We were attempting to use scsi-target-utils, hosted on a xen dom0 vm using localhost, and running into some problems. I was not able to reproduce this on a centos 7.2 server using the default kernel.

I am seeing comparable things on our CentOS 6 Xen servers running 3.18 kernels. We have about 20 of those machines running and started upgrading them from 3.10.68 to 3.18 a couple of weeks ago. But currently, about three quarters of the way through, I'm having second thoughts and am thinking about rolling back because of reliability issues.

So far I have made sure that all machines run the latest BIOS and Ethernet firmware. The servers in question are Dell PowerEdges from different generations, talking to an EqualLogic disk array over 1 Gbit copper. Dell's toolset is installed, OMSA as well as the HIT Kit.

The errors I'm seeing look like this:

Apr 13 23:03:43 xen15-2 iscsid: Kernel reported iSCSI connection 25:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (1)
Apr 13 23:03:43 xen15-2 iscsid: Connection25:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is operational now
Apr 13 23:03:48 xen15-2 iscsid: Connection9:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is shutdown.
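Since the two reports in this thread show different error codes (1022 on the dom0 localhost setup, 1020 here), a per-code tally of the iscsid messages makes it easier to compare hosts. A minimal sketch that reads syslog text on stdin; count_iscsi_errors is an illustrative name and the sample lines are copied from the logs in this thread.

```shell
#!/bin/sh
# Count "Kernel reported iSCSI connection ... error (NNNN" lines per error code.
# Reads syslog text on stdin and prints "<count> <code>" per code.
count_iscsi_errors() {
    sed -n 's/.*iscsid: Kernel reported iSCSI connection [0-9:]* error (\([0-9]*\).*/\1/p' |
        sort | uniq -c | awk '{ print $1, $2 }'
}

# Sample lines taken from the two logs in this thread.
count_iscsi_errors <<'EOF'
Apr 4 11:18:42 funk iscsid: Kernel reported iSCSI connection 2:0 error (1022 - Invalid or unknown error code) state (3)
Apr 13 23:03:43 xen15-2 iscsid: Kernel reported iSCSI connection 25:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (1)
Apr 13 23:03:43 xen15-2 iscsid: Connection25:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is operational now
EOF
```

On a real host the input would more likely come from the system log, for example by piping the output of grep iscsid on the log file into the function.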

The only noticeable difference in dmesg output is stuff like this:

(on 3.18)
pci 0000:02:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window
pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window
pci 0000:01:00.0: BAR 6: assigned [mem 0x91e80000-0x91efffff pref]
pci 0000:01:00.1: BAR 6: no space for [mem size 0x00080000 pref]
pci 0000:01:00.1: BAR 6: failed to assign [mem size 0x00080000 pref]
pci 0000:01:00.2: BAR 6: no space for [mem size 0x00080000 pref]
pci 0000:01:00.2: BAR 6: failed to assign [mem size 0x00080000 pref]
pci 0000:01:00.3: BAR 6: no space for [mem size 0x00080000 pref]
pci 0000:01:00.3: BAR 6: failed to assign [mem size 0x00080000 pref]

(and on 3.10)
pci 0000:00:03.0: BAR 15: assigned [mem 0xd5200000-0xd53fffff pref]
pci 0000:01:00.1: BAR 6: assigned [mem 0xd5000000-0xd507ffff pref]
pci 0000:01:00.2: BAR 6: assigned [mem 0xd5080000-0xd50fffff pref]
pci 0000:01:00.3: BAR 6: assigned [mem 0xd5100000-0xd517ffff pref]
pci 0000:00:01.0: PCI bridge to [bus 01]
pci 0000:00:01.0: bridge window [mem 0xd8000000-0xd8ffffff]
pci 0000:00:01.0: bridge window [mem 0xd5000000-0xd51fffff pref]
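To compare the two kernels more systematically, the devices whose BAR assignment failed can be pulled out of each dmesg capture. A minimal sketch; failed_bars is an illustrative helper and the sample input is a subset of the 3.18 lines above. Note that in the Linux PCI resource numbering, BAR 6 normally refers to the expansion ROM rather than a regular memory or I/O BAR; whether these messages have anything to do with the iSCSI errors is not established in this thread.

```shell
#!/bin/sh
# Print "<pci address> BAR <n>" for every "failed to assign" line on stdin.
failed_bars() {
    sed -n 's/.*pci \([0-9a-f:.]*\): BAR \([0-9]*\): failed to assign.*/\1 BAR \2/p'
}

failed_bars <<'EOF'
pci 0000:01:00.0: BAR 6: assigned [mem 0x91e80000-0x91efffff pref]
pci 0000:01:00.1: BAR 6: no space for [mem size 0x00080000 pref]
pci 0000:01:00.1: BAR 6: failed to assign [mem size 0x00080000 pref]
pci 0000:01:00.2: BAR 6: failed to assign [mem size 0x00080000 pref]
EOF
```

Running the same filter over the dmesg output of each kernel gives two short lists that can be diffed directly.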

But to be honest, my knowledge as to the possible cause of this is lacking. Is this just a small ACPI-related glitch, or is it a sign that the Ethernet cards are misbehaving somehow?

Are more people seeing errors in this area?

Thx and regards, -- Hans (just trying to make sense of it all)

Nathan Coulson

15 Apr

8:27 p.m.

On 2016-04-14 07:10 AM, Hans Loots wrote:

The errors I'm seeing are looking like this:

Apr 13 23:03:43 xen15-2 iscsid: Kernel reported iSCSI connection 25:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (1)
Apr 13 23:03:43 xen15-2 iscsid: Connection25:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is operational now
Apr 13 23:03:48 xen15-2 iscsid: Connection9:0 to [target: iqn.xxxxx, portal: a.b.c.d,3260] through [iface: eql.em2] is shutdown.

I did not have interface shutdowns in my tests (well, NetworkManager was doing something there, but I disabled it for my tests). The hardware is an old Tyan S2882D motherboard with 8 GB RAM and 2x dual-core Opteron 275 processors.

...

While the the only noticeable difference in dmesg output is stuff like this: (on 3.18) pci 0000:02:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window pci 0000:01:00.0: BAR 6: assigned [mem 0x91e80000-0x91efffff pref] pci 0000:01:00.1: BAR 6: no space for [mem size 0x00080000 pref] pci 0000:01:00.1: BAR 6: failed to assign [mem size 0x00080000 pref] pci 0000:01:00.2: BAR 6: no space for [mem size 0x00080000 pref] pci 0000:01:00.2: BAR 6: failed to assign [mem size 0x00080000 pref] pci 0000:01:00.3: BAR 6: no space for [mem size 0x00080000 pref] pci 0000:01:00.3: BAR 6: failed to assign [mem size 0x00080000 pref] (and on 3.10) pci 0000:00:03.0: BAR 15: assigned [mem 0xd5200000-0xd53fffff pref] pci 0000:01:00.1: BAR 6: assigned [mem 0xd5000000-0xd507ffff pref] pci 0000:01:00.2: BAR 6: assigned [mem 0xd5080000-0xd50fffff pref] pci 0000:01:00.3: BAR 6: assigned [mem 0xd5100000-0xd517ffff pref] pci 0000:00:01.0: PCI bridge to [bus 01] pci 0000:00:01.0: bridge window [mem 0xd8000000-0xd8ffffff] pci 0000:00:01.0: bridge window [mem 0xd5000000-0xd51fffff pref]

But to be honest, I don't know enough to say what might cause this. Is this just a small ACPI-related glitch, or is it a sign that the Ethernet cards are misbehaving somehow?

Are more people seeing errors in this area?

Thx and regards, -- Hans (just trying to make sense of it all)

2016-04-11 22:14 GMT+02:00 Nathan Coulson <nathan@bravenet.com>:
Hello

We were attempting to use scsi-target-utils, hosted on a xen dom0
vm using localhost, and running into some problems.  I was not
able to reproduce this on a centos 7.2 server using the default
kernel.


(From dmesg)
Apr  4 11:18:42 funk kernel: [  596.511204]  connection2:0: detected conn error (1022)
Apr  4 11:18:42 funk kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295253788, last ping 4295258790, now 4295263808
Apr  4 11:18:42 funk kernel: connection2:0: detected conn error (1022)
Apr  4 11:18:42 funk iscsid: Kernel reported iSCSI connection 2:0 error (1022 - Invalid or unknown error code) state (3)
Apr  4 11:18:44 funk iscsid: connection2:0 is operational after recovery (1 attempts)
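
As a side note on the ping timeout line: the three counters in it can be subtracted to see how long the connection sat idle before the ping went out, and how long the ping then went unanswered. The sketch below is just that arithmetic. It assumes the counters are kernel jiffies and that the kernel ticks at 1000 Hz (the `HZ` value is an assumption, not something stated in the log).

```shell
# Values copied from the "ping timeout" log line above.
last_rx=4295253788
last_ping=4295258790
now=4295263808
HZ=1000   # assumption: 1000 jiffies per second on this kernel

idle_before_ping=$(( last_ping - last_rx ))   # nothing received, then a ping was sent
wait_after_ping=$(( now - last_ping ))        # ping outstanding until the timeout fired

echo "idle before ping:  ${idle_before_ping} jiffies (~$(( idle_before_ping / HZ )) s)"
echo "waited after ping: ${wait_after_ping} jiffies (~$(( wait_after_ping / HZ )) s)"
```

Under that assumption both gaps come out at roughly 5 seconds (5002 and 5018 jiffies), which lines up with the "ping timeout of 5 secs" and "recv timeout 5" values in the same message.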

Repeated a few times, until eventually


Apr  4 11:19:44 funk kernel: Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr  4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
Apr  4 11:19:44 funk kernel: Write(10): 2a 00 01 df c7 e8 00 00 18 00
Apr  4 11:19:44 funk kernel: blk_update_request: I/O error, dev sdd, sector 31442920
Apr  4 11:19:44 funk kernel: [  658.127596] sd 7:0:0:1: [sdd]
Apr  4 11:19:44 funk kernel: [  658.127688] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr  4 11:19:44 funk kernel: [  658.127761] sd 7:0:0:1: [sdd] CDB:
Apr  4 11:19:44 funk kernel: [  658.127826] Write(10): 2a 00 01 df c7 e8 00 00 18 00
Apr  4 11:19:44 funk kernel: [  658.127927] blk_update_request: I/O error, dev sdd, sector 31442920
Apr  4 11:19:44 funk kernel: [  658.128040] sd 7:0:0:1: [sdd]
Apr  4 11:19:44 funk kernel: sd 7:0:0:1: [sdd]
Apr  4 11:19:44 funk kernel: [  658.128105] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr  4 11:19:44 funk kernel: [  658.128177] sd 7:0:0:1: [sdd] CDB:
Apr  4 11:19:44 funk kernel: [  658.128241] Write(10): 2a 00 00 00 08 00 00 00 18 00
Apr  4 11:19:44 funk kernel: [  658.128339] blk_update_request: I/O error, dev sdd, sector 2048
Apr  4 11:19:44 funk kernel: Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Apr  4 11:19:44 funk kernel: sd 7:0:0:1: [sdd] CDB:
Apr  4 11:19:44 funk kernel: Write(10): 2a 00 00 00 08 00 00 00 18 00
Apr  4 11:19:44 funk kernel: blk_update_request: I/O error, dev sdd, sector 2048
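
The Write(10) lines above can be decoded by hand to confirm that they refer to the same sectors as the blk_update_request errors. A rough sketch follows. The helper name `decode_write10` is made up for illustration, and it relies on the standard WRITE(10) layout: byte 0 is the opcode (0x2a), bytes 2-5 are the big-endian logical block address, and bytes 7-8 are the transfer length in blocks.

```shell
# Hypothetical helper: decode a WRITE(10) CDB passed as ten hex bytes.
decode_write10() {
    if [ "$#" -ne 10 ] || [ "$1" != "2a" ]; then
        echo "not a WRITE(10) CDB" >&2
        return 1
    fi
    lba=$(( 0x$3$4$5$6 ))      # bytes 2-5: logical block address (big-endian)
    blocks=$(( 0x$8$9 ))       # bytes 7-8: transfer length in blocks
    echo "lba=$lba blocks=$blocks"
}

decode_write10 2a 00 01 df c7 e8 00 00 18 00   # prints: lba=31442920 blocks=24
decode_write10 2a 00 00 00 08 00 00 00 18 00   # prints: lba=2048 blocks=24
```

The decoded addresses match the "sector 31442920" and "sector 2048" values in the I/O error lines, so each failed request is a 24-block write at the start of those sectors.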


(Test Setup)
scsi-target-utils installed via yum, default config
/etc/tgt/conf.d/xenguests.conf
<target iqn.2016-02.com.bravenet:test>
    backing-store //mnt/vmdisk/test # vm image
</target>

systemctl restart tgtd

iscsiadm -m discovery -t sendtargets -p localhost

iscsiadm -m node -T iqn.2016-02.com.bravenet:test -l


add it to lvm (pvcreate, vgcreate), let's call it
/dev/vmdisk.vg/test.lv

and then use libvirt to attempt to install an os on
/dev/vmdisk.vg/test.lv (using anaconda)




The conn errors start around the time it tries to create the
disk label, and eventually it gives up trying to create the
disk label.



We tested a similar setup on a centos 7.2 host we use for kvm based
virtual machine hosting (default 3.10 kernel), and it worked
fine.  It may be similar to what was reported on
https://bugzilla.redhat.com/show_bug.cgi?id=1245990, but I never
saw a resolution on what they discovered (other than a reference
to comment 18, which does not appear to exist).

Testing over the network also appears to work (where
another machine connects to scsi-target-utils on the funk server
above).





The long-term purpose of the above setup was to get direct access
to a filesystem image hosted on a gluster setup, using bs-type glfs
on scsi-target-utils.

-- 
Nathan Coulson
www.bravenet.com
nathan@bravenet.com
_______________________________________________
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt

-- Nathan Coulson System Administrator for Bravenet www.bravenet.com nathan@bravenet.com

participants (4)

George Dunlap
Hans Loots
Nathan Coulson
Sarah Newman