Hi,
WARNING: Long post ahead
I have an issue when starting multipathd. The kernel complains about "blk_cloned_rq_check_limits: over max segments limit".
The server in question is configured for KVM hosting. It boots via iBFT from an iSCSI volume. The target is COMSTAR, backed by a 100GB ZFS volume. The server also has two InfiniBand cards providing four (4) more paths over SRP (SCSI RDMA Protocol). With multipathd in the initramfs and enabled at boot, I get five paths, but also the blk_cloned_rq_check_limits error on boot.
I/O on the boot volume also causes the paths to fail one after another until only one remains, with the kernel message "blk_cloned_rq_check_limits: over max segments limit" logged for each failed path. The paths then recover one after another, but by then the system has 'frozen' (the final path did not recover!) and all I can do is reset the power. The I/O can be KVM guest reads/writes, running "dracut -f", copying an ISO to /var/lib/libvirt/images, or 'dd'ing a 1GB file to /tmp.
I have read: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/pdf...
I have read: https://access.redhat.com/solutions/2437991
I don't use GPFS: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1009622
I actually have two servers with the exact same hardware configuration. Originally they were both CentOS 7 with the default install LVM partitioning structure. For the above host, I have re-installed and removed LVM as it complicated the issue further. It does appear to be somewhat more stable but results on I/O testing are now inconsistent. Things that have changed are:
Not using LVM on the host.
Freed up some storage reservations (removed snapshots) on the SAN.
ENVIRONMENT:
CentOS 7.4
System is iBFT boot (boots from iSCSI on a 10G card to a COMSTAR ZFS volume)
System also has InfiniBand, which provides another four (4) paths
multipath is initialised in the initramfs
Kernel is 3.10.0-693.11.1.el7.x86_64
https://access.redhat.com/solutions/2437991 recommends adjusting 'max_sectors_kb' for devices when seeing the kernel error I am encountering. To that end I have queried /sys/block for that information, as shown below.
For debugging, I have disabled the multipathd service at boot.
I wrote the below script to query /sys/block for block device information:
# cat bin/max_sectors_kb
#!/bin/bash
# Print max_sectors_kb and max_hw_sectors_kb for every device in /sys/block.
fmt="%-18s: %-37s %-15s %-18s\n"
printf "$fmt" "Sys Block Node" "Device" "max_sectors_kb" "max_hw_sectors_kb"
for sysblk in /sys/block/*
do
    if [ -r "$sysblk/dm/name" ]; then
        # device-mapper device: use its map name
        name=$(cat "$sysblk/dm/name")
    else
        # plain disk: use vendor and model
        vendor=$(cat "$sysblk/device/vendor")
        model=$(cat "$sysblk/device/model")
        name="$vendor $model"
    fi
    max_sectors_kb=$(cat "$sysblk/queue/max_sectors_kb")
    max_hw_sectors_kb=$(cat "$sysblk/queue/max_hw_sectors_kb")
    printf "$fmt" "$sysblk" "$name" "$max_sectors_kb" "$max_hw_sectors_kb"
done
The boot volume is identified by the multipath wwid 3600144f00000000000005a2769c70001. Its underlying device nodes are: sda, sdd, sde, sdk and sdj.
Running the script after boot for the boot device I get this:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             32767
/sys/block/dm-5   : 3600144f00000000000005a2769c70001p1   512             32767
/sys/block/dm-6   : 3600144f00000000000005a2769c70001p2   512             32767
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p3   512             32767
/sys/block/sda    : SUN COMSTAR                           512             32767
/sys/block/sdd    : SUN COMSTAR                           512             512
/sys/block/sde    : SUN COMSTAR                           512             512
/sys/block/sdj    : SUN COMSTAR                           512             512
/sys/block/sdk    : SUN COMSTAR                           512             512
On starting the multipathd service manually I get:
Dec 18 12:46:41 lemans systemd: Starting Availability of block devices...
Dec 18 12:46:41 lemans systemd: Starting Device-Mapper Multipath Device Controller...
Dec 18 12:46:41 lemans systemd: Started Availability of block devices.
Dec 18 12:46:41 lemans systemd: Started Device-Mapper Multipath Device Controller.
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: load table [0 209715200 multipath 1 queue_if_no_path 0 1 1 service-time 0 5 1 8:64 1 8:144 1 8:48 1 8:160 1 8:0 1]
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a93b80002: load table [0 41943040 multipath 1 queue_if_no_path 0 1 1 service-time 0 5 1 8:80 1 8:176 1 8:96 1 8:192 1 8:16 1]
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a94510003: load table [0 41943040 multipath 1 queue_if_no_path 0 1 1 service-time 0 5 1 8:112 1 8:208 1 8:128 1 8:224 1 8:32 1]
Dec 18 12:46:41 lemans kernel: blk_cloned_rq_check_limits: over max segments limit.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Failing path 8:48.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Reinstating path 8:48.
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: event checker started
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a93b80002: event checker started
Dec 18 12:46:41 lemans multipathd: 3600144f0000000000000524a94510003: event checker started
Dec 18 12:46:41 lemans multipathd: path checkers start up
Dec 18 12:46:41 lemans kernel: blk_cloned_rq_check_limits: over max segments limit.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Failing path 8:64.
Dec 18 12:46:41 lemans kernel: blk_cloned_rq_check_limits: over max segments limit.
Dec 18 12:46:41 lemans kernel: device-mapper: multipath: Failing path 8:160.
Dec 18 12:46:41 lemans multipathd: sde: mark as failed
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 4
Dec 18 12:46:41 lemans multipathd: sdk: mark as failed
Dec 18 12:46:41 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 3
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: sde - directio checker reports path is up
Dec 18 12:46:42 lemans kernel: device-mapper: multipath: Reinstating path 8:64.
Dec 18 12:46:42 lemans multipathd: 8:64: reinstated
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 4
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: sdk - directio checker reports path is up
Dec 18 12:46:42 lemans kernel: device-mapper: multipath: Reinstating path 8:160.
Dec 18 12:46:42 lemans multipathd: 8:160: reinstated
Dec 18 12:46:42 lemans multipathd: 3600144f00000000000005a2769c70001: remaining active paths: 5
AND the max_hw_sectors_kb is also reduced to 512 on all but one device (sda):
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             512
/sys/block/dm-5   : 3600144f00000000000005a2769c70001p1   512             512
/sys/block/dm-6   : 3600144f00000000000005a2769c70001p2   512             512
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p3   512             512
/sys/block/sda    : SUN COMSTAR                           512             32767
/sys/block/sdd    : SUN COMSTAR                           512             512
/sys/block/sde    : SUN COMSTAR                           512             512
/sys/block/sdj    : SUN COMSTAR                           512             512
/sys/block/sdk    : SUN COMSTAR                           512             512
I don't understand the inconsistent manipulation of the max_hw_sectors_kb...
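The kernel lines above name paths by major:minor number while multipathd uses sd names. A minimal sketch of the translation, assuming the standard Linux numbering for SCSI disks under major 8 (16 minors per disk, so sda to sdp only); the function name devt_to_sd is made up for illustration:

```shell
#!/bin/sh
# Translate a "major:minor" pair from the device-mapper log lines into an
# sd device name. Assumption: major 8 with 16 minors per disk, which only
# covers sda..sdp. devt_to_sd is a hypothetical helper name.
devt_to_sd() {
    major=${1%%:*}
    minor=${1##*:}
    if [ "$major" != "8" ]; then
        echo "unsupported major: $major" >&2
        return 1
    fi
    index=$((minor / 16))    # whole-disk index: 0 -> sda, 1 -> sdb, ...
    part=$((minor % 16))     # 0 for the whole disk, else the partition number
    letter=$(printf 'abcdefghijklmnop' | cut -c $((index + 1)))
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}
```

For example, "devt_to_sd 8:160" prints sdk, which matches the "sdk: mark as failed" line logged after path 8:160 fails. On a live system the symlink /sys/dev/block/8:160 should point at the same device.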
ENABLING multipathd AT BOOT:
Enabling the multipathd service at boot, I get lots of path failures and the blk_cloned_rq_check_limits error. From dmesg:
[ 48.486407] sd 4:0:0:7: [sdo] Attached SCSI disk
[ 49.460048] blk_cloned_rq_check_limits: over max segments limit.
[ 49.461199] device-mapper: multipath: Failing path 8:48.
[ 49.462284] blk_cloned_rq_check_limits: over max segments limit.
[ 49.463392] device-mapper: multipath: Failing path 8:144.
[ 49.469282] blk_cloned_rq_check_limits: over max segments limit.
[ 49.470886] device-mapper: multipath: Failing path 8:176.
[ 49.472560] blk_cloned_rq_check_limits: over max segments limit.
[ 49.473893] device-mapper: multipath: Failing path 8:64.
[ 49.847469] device-mapper: multipath: Reinstating path 8:48.
[ 49.850403] device-mapper: multipath: Reinstating path 8:144.
[ 49.853143] device-mapper: multipath: Reinstating path 8:176.
[ 50.626291] blk_cloned_rq_check_limits: over max segments limit.
[ 50.627539] device-mapper: multipath: Failing path 8:144.
[ 50.628533] blk_cloned_rq_check_limits: over max segments limit.
[ 50.629467] device-mapper: multipath: Failing path 8:176.
[ 50.630441] blk_cloned_rq_check_limits: over max segments limit.
[ 50.631379] device-mapper: multipath: Failing path 8:48.
[ 54.859526] device-mapper: multipath: Reinstating path 8:64.
[ 54.860050] device-mapper: multipath: Reinstating path 8:144.
[ 54.860463] device-mapper: multipath: Reinstating path 8:176.
[ 55.861034] device-mapper: multipath: Reinstating path 8:48.
[ 59.811712] UDP: bad checksum. From 192.168.0.124:55913 to 255.255.255.255:15000 ulen 58
[ 59.999563] blk_cloned_rq_check_limits: over max segments limit.
[ 59.999591] device-mapper: multipath: Failing path 8:48.
[ 59.999650] blk_cloned_rq_check_limits: over max segments limit.
[ 59.999675] device-mapper: multipath: Failing path 8:144.
[ 59.999711] blk_cloned_rq_check_limits: over max segments limit.
[ 59.999736] device-mapper: multipath: Failing path 8:64.
[ 59.999773] blk_cloned_rq_check_limits: over max segments limit.
[ 59.999798] device-mapper: multipath: Failing path 8:176.
[ 61.055033] device-mapper: multipath: Reinstating path 8:64.
[ 61.087458] blk_cloned_rq_check_limits: over max segments limit.
[ 61.087495] device-mapper: multipath: Failing path 8:64.
[ 62.077462] device-mapper: multipath: Reinstating path 8:48.
[ 66.078675] device-mapper: multipath: Reinstating path 8:144.
[ 66.079036] device-mapper: multipath: Reinstating path 8:176.
[ 67.079827] device-mapper: multipath: Reinstating path 8:64.
[ 115.197811] blk_cloned_rq_check_limits: over max segments limit.
[ 115.197876] device-mapper: multipath: Failing path 8:176.
[ 115.197952] blk_cloned_rq_check_limits: over max segments limit.
[ 115.198007] device-mapper: multipath: Failing path 8:48.
[ 115.198063] blk_cloned_rq_check_limits: over max segments limit.
[ 115.198117] device-mapper: multipath: Failing path 8:144.
[ 121.101670] device-mapper: multipath: Reinstating path 8:48.
[ 121.102270] device-mapper: multipath: Reinstating path 8:144.
[ 121.102645] device-mapper: multipath: Reinstating path 8:176.
[ 205.275512] blk_cloned_rq_check_limits: over max segments limit.
[ 205.275577] device-mapper: multipath: Failing path 8:48.
[ 205.275695] blk_cloned_rq_check_limits: over max segments limit.
[ 205.275752] device-mapper: multipath: Failing path 8:176.
[ 205.277268] blk_cloned_rq_check_limits: over max segments limit.
[ 205.277333] device-mapper: multipath: Failing path 8:144.
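To see at a glance which paths fail most often in the dmesg output above, the failures can be tallied per major:minor pair. This is only a sketch: it assumes the exact "Failing path" message format shown in this post, and count_path_failures is a made-up name:

```shell
#!/bin/sh
# Count "Failing path" events per major:minor pair from kernel log text
# read on stdin. Assumption: lines look like
#   device-mapper: multipath: Failing path 8:48.
# count_path_failures is a hypothetical helper name.
count_path_failures() {
    awk '/device-mapper: multipath: Failing path/ {
             devt = $NF
             sub(/\.$/, "", devt)    # strip the trailing full stop
             count[devt]++
         }
         END {
             for (d in count) print d, count[d]
         }' | LC_ALL=C sort
}
```

Typical use would be "dmesg | count_path_failures", which prints one line per path with its failure count. In the output above only 8:48, 8:64, 8:144 and 8:176 ever fail; 8:0 (sda, the iSCSI path) never appears.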
Other information at this boot point: Multipath
# multipath -ll 3600144f00000000000005a2769c70001
3600144f00000000000005a2769c70001 dm-1 SUN ,COMSTAR
size=100G features='3 queue_if_no_path queue_mode mq' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:0 sda 8:0 active ready running
  |- 2:0:0:0 sdd 8:48 active ready running
  |- 1:0:0:0 sde 8:64 active ready running
  |- 3:0:0:0 sdj 8:144 active ready running
  `- 4:0:0:0 sdl 8:176 active ready running
Other information at this boot point: /sys/block info:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             512
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p1   512             32767
/sys/block/dm-8   : 3600144f00000000000005a2769c70001p2   512             32767
/sys/block/dm-9   : 3600144f00000000000005a2769c70001p3   512             32767
/sys/block/sda    : SUN COMSTAR                           512             32767
/sys/block/sdd    : SUN COMSTAR                           512             512
/sys/block/sde    : SUN COMSTAR                           512             512
/sys/block/sdj    : SUN COMSTAR                           512             512
/sys/block/sdk    : SUN COMSTAR                           512             512
AND when I restart multipathd the max_hw_sectors_kb resets to 512 on all but one device:
# max_sectors_kb | grep -e 3600144f00000000000005a2769c70001 -e sda -e sdd -e sde -e sdk -e sdj -e max
Sys Block Node    : Device                                max_sectors_kb  max_hw_sectors_kb
/sys/block/dm-1   : 3600144f00000000000005a2769c70001     512             512
/sys/block/dm-7   : 3600144f00000000000005a2769c70001p1   512             512
/sys/block/dm-8   : 3600144f00000000000005a2769c70001p2   512             512
/sys/block/dm-9   : 3600144f00000000000005a2769c70001p3   512             512
/sys/block/sda    : SUN COMSTAR                           512             32767
/sys/block/sdd    : SUN COMSTAR                           512             512
/sys/block/sde    : SUN COMSTAR                           512             512
/sys/block/sdj    : SUN COMSTAR                           512             512
/sys/block/sdk    : SUN COMSTAR                           512             512
Setting max_sectors_kb in /etc/multipath.conf gives very inconsistent results. It is not possible to raise the setting above 512 consistently across all available paths, because the paths are established with different hardware maximums (by the kernel? the driver?) at boot. Starting multipathd also seems to change the (read-only) value of max_hw_sectors_kb. I get alignment issues and unmountable volumes.
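Since max_sectors_kb cannot exceed max_hw_sectors_kb on any given device, the lowest hardware maximum among the paths is the ceiling for a value that applies to all of them. A sketch of how that ceiling could be read from sysfs, assuming the paths are listed under the dm device's slaves directory; min_hw_sectors_kb and its optional second argument (an alternative sysfs root, handy for testing) are made up for illustration:

```shell
#!/bin/sh
# Print the lowest max_hw_sectors_kb among the path devices of a dm device.
# Usage: min_hw_sectors_kb dm-1 [block root, default /sys/block]
# min_hw_sectors_kb is a hypothetical helper name.
min_hw_sectors_kb() {
    dm=$1
    root=${2:-/sys/block}
    min=""
    for slave in "$root/$dm"/slaves/*; do
        f="$root/$(basename "$slave")/queue/max_hw_sectors_kb"
        [ -r "$f" ] || continue      # skip missing entries or an unmatched glob
        v=$(cat "$f")
        if [ -z "$min" ] || [ "$v" -lt "$min" ]; then
            min=$v
        fi
    done
    # Prints nothing and returns non-zero when no path was found.
    [ -n "$min" ] && echo "$min"
}
```

With the values listed in this post (32767 on sda, 512 on the four SRP paths), "min_hw_sectors_kb dm-1" would print 512.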
Removing LVM and tidying up the SAN storage looks to have solved some issues, but I still get the errors on boot. The inconsistencies in max_hw_sectors_kb for each path before and after starting multipathd (both in the initramfs and during/after boot) still baffle me.
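The kernel message mentions segments rather than sectors, so it may also be worth comparing the max_segments and max_segment_size attributes under /sys/block/<dev>/queue for the multipath device and each path. Whether a mismatch there explains the failures is an assumption, not something confirmed in this post. A sketch, with segment_limits as a made-up name and an optional second argument for an alternative sysfs root:

```shell
#!/bin/sh
# Print max_segments and max_segment_size for a dm device and each of its
# paths (the entries under its slaves directory).
# Usage: segment_limits dm-1 [block root, default /sys/block]
# segment_limits is a hypothetical helper name.
segment_limits() {
    dm=$1
    root=${2:-/sys/block}
    printf '%-8s %-14s %s\n' "device" "max_segments" "max_segment_size"
    for dev in "$dm" $(ls "$root/$dm/slaves" 2>/dev/null); do
        q="$root/$dev/queue"
        # A missing attribute is printed as an empty column.
        printf '%-8s %-14s %s\n' "$dev" \
            "$(cat "$q/max_segments" 2>/dev/null)" \
            "$(cat "$q/max_segment_size" 2>/dev/null)"
    done
}
```

Running "segment_limits dm-1" before and after starting multipathd would show whether the dm device ends up with a larger max_segments than any of its paths.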
All comments welcome. Any clues appreciated.
Kind regards, Tom