[CentOS] kernel: blk_cloned_rq_check_limits: over max segments limit., Device Mapper Multipath, iBFT, iSCSI COMSTAR

Tue Dec 19 01:10:06 UTC 2017
Tom Robinson <tom.robinson at motec.com.au>

Hi,

WARNING: Long post ahead

I have an issue when starting multipathd. The kernel complains about "blk_cloned_rq_check_limits:
over max segments limit".

The server in question is configured for KVM hosting. It boots via iBFT to an iSCSI volume. Target
is COMSTAR and underlying that is a ZFS volume (100GB). The server also has two infiniband cards
providing four (4) more paths over SRP (SCSI RDMA Protocol). With multipathd in the initramfs and
enabled at boot, I get five paths (but also now the blk_cloned_rq_check_limits error) on boot.

I/O on the boot volume also causes the multipaths to fail one after the other until only one path
remains, issuing the kernel message "blk_cloned_rq_check_limits: over max segments limit" for each
failed path. The paths then recover one after the other, but this has also 'frozen' the system
(the final path did not recover!), at which point all I can do is reset the power. The I/O that
triggers this can be KVM guest read/write, running "dracut -f", copying an ISO to
/var/lib/libvirt/images or 'dd'ing a 1GB file to /tmp.

I have read:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/pdf/dm_multipath/Red_Hat_Enterprise_Linux-7-DM_Multipath-en-US.pdf

I have read: https://access.redhat.com/solutions/2437991

I don't use GPFS: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1009622

I actually have two servers with the exact same hardware configuration. Originally they were both
CentOS 7 with the default install LVM partitioning structure. For the above host, I have
re-installed and removed LVM as it complicated the issue further. It does appear to be somewhat more
stable but results on I/O testing are now inconsistent. Things that have changed are:

Not using LVM on the host.
Freed up some storage reservations (removed snapshots) on the SAN.

ENVIRONMENT:

CentOS 7.4
System is iBFT boot (boots from iSCSI on 10G card to COMSTAR zfs volume)
System also has InfiniBand, which provides another four (4) paths
multipath is initialised in initramfs
Kernel is 3.10.0-693.11.1.el7.x86_64

With respect to https://access.redhat.com/solutions/2437991, there is a recommendation to adjust
'max_sectors_kb' for devices when seeing the kernel error I am encountering. To that end I have
queried the /sys/block area for that information, shown below.

For debugging, I have disabled the multipathd service at boot.

I wrote the below script to query /sys/block for block device information:
 
# cat bin/max_sectors_kb
#!/bin/bash
# Print max_sectors_kb and max_hw_sectors_kb for every node under /sys/block.

fmt="%-18s: %-37s %-15s %-18s\n"
printf "$fmt" "Sys Block Node" "Device" "max_sectors_kb" "max_hw_sectors_kb"
for sysblk in /sys/block/*
do
        if [ -r "$sysblk/dm/name" ]; then
                # Device-mapper node: report the map name
                name=$(cat "$sysblk/dm/name")
        else
                # Plain disk: report vendor and model
                vendor=$(cat "$sysblk/device/vendor" 2>/dev/null)
                model=$(cat "$sysblk/device/model" 2>/dev/null)
                name="$vendor $model"
        fi
        max_sectors_kb=$(cat "$sysblk/queue/max_sectors_kb")
        max_hw_sectors_kb=$(cat "$sysblk/queue/max_hw_sectors_kb")

        printf "$fmt" "$sysblk" "$name" "$max_sectors_kb" "$max_hw_sectors_kb"
done

The boot volume is identified by the multipath wwid 3600144f00000000000005a2769c70001. Its
underlying device nodes are: sda, sdd, sde, sdk and sdj.

Running the script after boot for the boot device I get this:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             32767           
/sys/block/dm-5   : 3600144f00000000000005a2769c70001p1   512             32767           
/sys/block/dm-6   : 3600144f00000000000005a2769c70001p2   512             32767           
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p3   512             32767           
/sys/block/sda    : SUN      COMSTAR                      512             32767           
/sys/block/sdd    : SUN      COMSTAR                      512             512             
/sys/block/sde    : SUN      COMSTAR                      512             512             
/sys/block/sdj    : SUN      COMSTAR                      512             512             
/sys/block/sdk    : SUN      COMSTAR                      512             512             

On starting multipathd service manually I get:
Dec 18 12:46:41 lemans systemd: Starting Availability of block devices...
Dec 18 12:46:41 lemans systemd: Starting Device-Mapper Multipath Device Controller...
Dec 18 12:46:41 lemans systemd: Started Availability of block devices.
Dec 18 12:46:41 lemans systemd: Started Device-Mapper Multipath Device Controller.
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: load table [0 209715200 multipath 1 queue_if_no_path 0 1 1 service-time 0 5 1 8:64 1 8:144 1 8:48 1 8:160 1 8:0 1]
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a93b80002: load table [0 41943040 multipath 1 queue_if_no_path 0 1 1 service-time 0 5 1 8:80 1 8:176 1 8:96 1 8:192 1 8:16 1]
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a94510003: load table [0 41943040 multipath 1 queue_if_no_path 0 1 1 service-time 0 5 1 8:112 1 8:208 1 8:128 1 8:224 1 8:32 1]
Dec 18 12:46:41 lemans kernel: blk_cloned_rq_check_limits: over max segments limit.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Failing path 8:48.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Reinstating path 8:48.
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: event checker started
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a93b80002: event checker started
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a94510003: event checker started
Dec 18 12:46:41 lemans multipathd: path checkers start up
Dec 18 12:46:41 lemans kernel: blk_cloned_rq_check_limits: over max segments limit.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Failing path 8:64.
Dec 18 12:46:41 lemans kernel: blk_cloned_rq_check_limits: over max segments limit.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Failing path 8:160.
Dec 18 12:46:41 lemans multipathd: sde: mark as failed
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 4
Dec 18 12:46:41 lemans multipathd: sdk: mark as failed
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 3
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: sde - directio checker reports path is up
Dec 18 12:46:42 lemans kernel: device-mapper: multipath: Reinstating path 8:64.
Dec 18 12:46:42 lemans multipathd: 8:64: reinstated
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 4
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: sdk - directio checker reports path is up
Dec 18 12:46:42 lemans kernel: device-mapper: multipath: Reinstating path 8:160.
Dec 18 12:46:42 lemans multipathd: 8:160: reinstated
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 5
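
The kernel lines above name paths by major:minor (8:48, 8:64, 8:160) while multipathd names them
sde, sdk and so on. A minimal sketch for translating between the two, assuming only SCSI disks on
major 8 (sda to sdp), where each disk reserves 16 minors. The function name devno_to_sd is made up
for illustration:

```shell
#!/bin/bash
# Translate a "major:minor" pair for a SCSI disk (major 8) into its sdX name.
# Assumption: only major 8 is handled, i.e. sda..sdp and their partitions.
devno_to_sd() {
    local major=${1%%:*} minor=${1##*:}
    if [ "$major" -ne 8 ]; then
        echo "unsupported major: $major" >&2
        return 1
    fi
    local index=$(( minor / 16 ))   # 16 minors are reserved per disk
    local part=$(( minor % 16 ))    # 0 means the whole disk
    local letter
    # Build the drive letter from its ASCII code (97 is "a")
    letter=$(printf "\\$(printf '%03o' $(( 97 + index )))")
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}
```

For example, "devno_to_sd 8:48" prints sdd and "devno_to_sd 8:160" prints sdk. On a live system the
symlink /sys/dev/block/8:48 gives the same answer without any arithmetic.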
 
AND also the max_hw_sectors_kb is reduced to 512 for all but one device:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             512             
/sys/block/dm-5   : 3600144f00000000000005a2769c70001p1   512             512             
/sys/block/dm-6   : 3600144f00000000000005a2769c70001p2   512             512             
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p3   512             512             
/sys/block/sda    : SUN      COMSTAR                      512             32767           
/sys/block/sdd    : SUN      COMSTAR                      512             512             
/sys/block/sde    : SUN      COMSTAR                      512             512             
/sys/block/sdj    : SUN      COMSTAR                      512             512             
/sys/block/sdk    : SUN      COMSTAR                      512             512
 
I don't understand the inconsistent manipulation of the max_hw_sectors_kb...

ENABLING multipathd AT BOOT:
Enabling the multipathd service at boot, I get lots of path failures and the
blk_cloned_rq_check_limits error. From dmesg:
[   48.486407] sd 4:0:0:7: [sdo] Attached SCSI disk
[   49.460048] blk_cloned_rq_check_limits: over max segments limit.
[   49.461199] device-mapper: multipath: Failing path 8:48.
[   49.462284] blk_cloned_rq_check_limits: over max segments limit.
[   49.463392] device-mapper: multipath: Failing path 8:144.
[   49.469282] blk_cloned_rq_check_limits: over max segments limit.
[   49.470886] device-mapper: multipath: Failing path 8:176.
[   49.472560] blk_cloned_rq_check_limits: over max segments limit.
[   49.473893] device-mapper: multipath: Failing path 8:64.
[   49.847469] device-mapper: multipath: Reinstating path 8:48.
[   49.850403] device-mapper: multipath: Reinstating path 8:144.
[   49.853143] device-mapper: multipath: Reinstating path 8:176.
[   50.626291] blk_cloned_rq_check_limits: over max segments limit.
[   50.627539] device-mapper: multipath: Failing path 8:144.
[   50.628533] blk_cloned_rq_check_limits: over max segments limit.
[   50.629467] device-mapper: multipath: Failing path 8:176.
[   50.630441] blk_cloned_rq_check_limits: over max segments limit.
[   50.631379] device-mapper: multipath: Failing path 8:48.
[   54.859526] device-mapper: multipath: Reinstating path 8:64.
[   54.860050] device-mapper: multipath: Reinstating path 8:144.
[   54.860463] device-mapper: multipath: Reinstating path 8:176.
[   55.861034] device-mapper: multipath: Reinstating path 8:48.
[   59.811712] UDP: bad checksum. From 192.168.0.124:55913 to 255.255.255.255:15000 ulen 58
[   59.999563] blk_cloned_rq_check_limits: over max segments limit.
[   59.999591] device-mapper: multipath: Failing path 8:48.
[   59.999650] blk_cloned_rq_check_limits: over max segments limit.
[   59.999675] device-mapper: multipath: Failing path 8:144.
[   59.999711] blk_cloned_rq_check_limits: over max segments limit.
[   59.999736] device-mapper: multipath: Failing path 8:64.
[   59.999773] blk_cloned_rq_check_limits: over max segments limit.
[   59.999798] device-mapper: multipath: Failing path 8:176.
[   61.055033] device-mapper: multipath: Reinstating path 8:64.
[   61.087458] blk_cloned_rq_check_limits: over max segments limit.
[   61.087495] device-mapper: multipath: Failing path 8:64.
[   62.077462] device-mapper: multipath: Reinstating path 8:48.
[   66.078675] device-mapper: multipath: Reinstating path 8:144.
[   66.079036] device-mapper: multipath: Reinstating path 8:176.
[   67.079827] device-mapper: multipath: Reinstating path 8:64.
[  115.197811] blk_cloned_rq_check_limits: over max segments limit.
[  115.197876] device-mapper: multipath: Failing path 8:176.
[  115.197952] blk_cloned_rq_check_limits: over max segments limit.
[  115.198007] device-mapper: multipath: Failing path 8:48.
[  115.198063] blk_cloned_rq_check_limits: over max segments limit.
[  115.198117] device-mapper: multipath: Failing path 8:144.
[  121.101670] device-mapper: multipath: Reinstating path 8:48.
[  121.102270] device-mapper: multipath: Reinstating path 8:144.
[  121.102645] device-mapper: multipath: Reinstating path 8:176.
[  205.275512] blk_cloned_rq_check_limits: over max segments limit.
[  205.275577] device-mapper: multipath: Failing path 8:48.
[  205.275695] blk_cloned_rq_check_limits: over max segments limit.
[  205.275752] device-mapper: multipath: Failing path 8:176.
[  205.277268] blk_cloned_rq_check_limits: over max segments limit.
[  205.277333] device-mapper: multipath: Failing path 8:144.
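
To see at a glance which paths fail most often, the dmesg output above can be summarised per
path. This is only a rough sketch: it assumes the message format shown above ("multipath: Failing
path MAJ:MIN.") and the function name count_path_failures is made up for illustration:

```shell
#!/bin/bash
# Count "Failing path" events per major:minor from kernel log text on stdin.
count_path_failures() {
    awk '/multipath: Failing path/ {
             path = $NF
             sub(/\.$/, "", path)      # drop the trailing full stop
             count[path]++
         }
         END {
             for (p in count) printf "%s %d\n", p, count[p]
         }' | LC_ALL=C sort
}
```

Usage would be "dmesg | count_path_failures", which prints one "MAJ:MIN count" line per path that
failed at least once. Paths that never appear, such as 8:0 (sda) in the log above, never failed.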
 
Other information at this boot point: Multipath
# multipath -ll 3600144f00000000000005a2769c70001
3600144f00000000000005a2769c70001 dm-1 SUN     ,COMSTAR       
size=100G features='3 queue_if_no_path queue_mode mq' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:0 sda 8:0   active ready running
  |- 2:0:0:0 sdd 8:48  active ready running
  |- 1:0:0:0 sde 8:64  active ready running
  |- 3:0:0:0 sdj 8:144 active ready running
  `- 4:0:0:0 sdl 8:176 active ready running
 
Other information at this boot point: /sys/block info:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             512             
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p1   512             32767           
/sys/block/dm-8   : 3600144f00000000000005a2769c70001p2   512             32767           
/sys/block/dm-9   : 3600144f00000000000005a2769c70001p3   512             32767           
/sys/block/sda    : SUN      COMSTAR                      512             32767           
/sys/block/sdd    : SUN      COMSTAR                      512             512             
/sys/block/sde    : SUN      COMSTAR                      512             512             
/sys/block/sdj    : SUN      COMSTAR                      512             512             
/sys/block/sdk    : SUN      COMSTAR                      512             512             
 
AND when I restart multipathd the max_hw_sectors_kb resets to 512 on all but one device:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             512             
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p1   512             512             
/sys/block/dm-8   : 3600144f00000000000005a2769c70001p2   512             512             
/sys/block/dm-9   : 3600144f00000000000005a2769c70001p3   512             512             
/sys/block/sda    : SUN      COMSTAR                      512             32767           
/sys/block/sdd    : SUN      COMSTAR                      512             512             
/sys/block/sde    : SUN      COMSTAR                      512             512             
/sys/block/sdj    : SUN      COMSTAR                      512             512             
/sys/block/sdk    : SUN      COMSTAR                      512             512             

Setting max_sectors_kb in /etc/multipath.conf gives very inconsistent results. It's not actually
possible to increase the setting past 512 consistently across all available paths as they are
established with different hardware maximums (by kernel? driver?) on boot. And starting multipathd
seems to change the (read only) value of max_hw_sectors_kb. I get alignment issues and unmountable
volumes.
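
Since max_sectors_kb cannot be set above a device's max_hw_sectors_kb, the largest value that
could apply to every path is the smallest hardware maximum among them. A minimal sketch for
finding that value follows. The function name min_hw_sectors_kb is made up for illustration, and
the sysfs directory is passed as a parameter so it can be tried against a copy. Note that it says
nothing about the segment limit that the kernel message actually complains about:

```shell
#!/bin/bash
# Print the smallest max_hw_sectors_kb among the given block devices.
# Usage: min_hw_sectors_kb SYSFS_BLOCK_DIR DEVICE...
# SYSFS_BLOCK_DIR is normally /sys/block.
min_hw_sectors_kb() {
    local root=$1 dev value min=
    shift
    for dev in "$@"; do
        # Give up if a device has no queue limits to read
        value=$(cat "$root/$dev/queue/max_hw_sectors_kb") || return 1
        if [ -z "$min" ] || [ "$value" -lt "$min" ]; then
            min=$value
        fi
    done
    echo "$min"
}
```

For the boot volume this would be run as "min_hw_sectors_kb /sys/block sda sdd sde sdj sdk". With
the values listed above (32767 on sda, 512 on the other four paths) the result is 512.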

Removing LVM and tidying up SAN storage appears to have solved some issues, but I still get the
errors on boot. The inconsistencies in max_hw_sectors_kb for each path before and after starting
multipathd (both in initramfs and during/after boot) still baffle me.

All comments welcome. Any clues appreciated.

Kind regards,
Tom

-- 

Tom Robinson
IT Manager/System Administrator

MoTeC Pty Ltd

121 Merrindale Drive
Croydon South
3136 Victoria
Australia

T: +61 3 9761 5050
F: +61 3 9761 5051   
E: tom.robinson at motec.com.au
