Dear list,
we have a relatively new Sun Storage 7310, to which we connect CentOS 5.5 servers (IBM LS21/LS41 blades) via Brocade switches over 4 GBit FC. The blades boot from SAN via qla2xxx and have no hard disks at all. We want them to use multipathing from the very beginning, so /boot and / are already seen by multipathd. The problem is that the Sun 7310 has two storage heads which run in active/passive mode, BUT multipathd thinks they are active/active and therefore shows half the available paths as faulty (multipath -ll below). While this probably still gives me the desired redundancy, it is a rather messy situation: it will be unnecessarily hard to detect real path failures, and the OS keeps complaining "readsector0 checker reports path is down", which gives me >40M per 24h of /var/log/messages garbage. Any hints for a reasonable configuration? Unfortunately the Sun 7310 is rather new, so almost nothing shows up on Google... even less for RHEL/CentOS :-(
regards from Berlin Jens
[root@dev-db1 tmp]# multipath -ll
sdaa: checker msg is "readsector0 checker reports path is down"
sdab: checker msg is "readsector0 checker reports path is down"
sdac: checker msg is "readsector0 checker reports path is down"
sdad: checker msg is "readsector0 checker reports path is down"
sdd: checker msg is "readsector0 checker reports path is down"
sdh: checker msg is "readsector0 checker reports path is down"
sdl: checker msg is "readsector0 checker reports path is down"
sdp: checker msg is "readsector0 checker reports path is down"
sdq: checker msg is "readsector0 checker reports path is down"
sdr: checker msg is "readsector0 checker reports path is down"
sds: checker msg is "readsector0 checker reports path is down"
sdt: checker msg is "readsector0 checker reports path is down"
sdu: checker msg is "readsector0 checker reports path is down"
sdv: checker msg is "readsector0 checker reports path is down"
sdx: checker msg is "readsector0 checker reports path is down"
sdz: checker msg is "readsector0 checker reports path is down"
mpath0 (3600144f0fdf58b5c00004bc738070001) dm-0 SUN,Sun Storage 7310
[size=50G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 0:0:1:0 sda 8:0   [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 1:0:0:0 sde 8:64  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 0:0:2:0 sdi 8:128 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 1:0:1:0 sdm 8:192 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 0:0:3:0 sdq 65:0  [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:2:0 sdr 65:16 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
 \_ 0:0:4:0 sdx 65:112 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:3:0 sdz 65:144 [failed][faulty]
Hello Jens,
It looks like your multipathing setup is using the multibus path_grouping_policy. Did you create a custom /etc/multipath.conf? You should have: as your device is new and not yet known to device-mapper-multipath with proper defaults, you will have to set the path_grouping_policy explicitly.
devices {
    device {
        vendor                "SUN"
        product               "Sun Storage 7310"   <== REMARK
        path_grouping_policy  failover
        getuid_callout        "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout          "/sbin/mpath_prio_rdac /dev/%n"
        features              "0"
        hardware_handler      "1 rdac"
        path_grouping_policy  group_by_prio
        failback              immediate
        rr_weight             uniform
        no_path_retry         queue
        rr_min_io             1000
        path_checker          rdac
    }
}
REMARK: to detect the proper product name, issue for instance:
cat /sys/block/sda/device/model
Restart multipathd after editing the multipath.conf, and I expect `multipath -ll` or `multipathd -k"show paths"` will show you the paths correctly weighted.
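A minimal sketch of that sequence on CentOS 5 (note: flushing only works for maps that are not in use, which will not be the case for your root map):

```shell
# Reload the daemon, then rebuild the maps from the new configuration.
service multipathd restart
multipath -F      # flush existing multipath maps (refused for maps in use)
multipath -v2     # recreate maps from /etc/multipath.conf
multipath -ll     # verify the resulting path groups
```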
Regards
Alexander
Hi Alexander,
thanks for replying, here's my current multipath.conf:
defaults {
    user_friendly_names yes
}

blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^(hd|xvd|vd)[a-z]*"
    wwid "*"
}

blacklist_exceptions {
    wwid "3600144f0fdf58b5c00004bc738070001"
    devices {
        device {
            vendor  "SUN"
            product "Sun Storage 7310"
        }
    }
}

devices {
    device {
        vendor                "SUN"
        product               "Sun Storage 7310"
        path_grouping_policy  failover
        getuid_callout        "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout          "/sbin/mpath_prio_rdac /dev/%n"
        features              "0"
        failback              immediate
        rr_weight             uniform
        no_path_retry         queue
        rr_min_io             1000
    }
}
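Read literally, the blacklist above drops every wwid and then re-admits only the one LUN via blacklist_exceptions; the devnode patterns never touch the SAN paths. A rough glob rendering of those devnode rules, just as a sanity check (the helper name is mine, and multipathd itself evaluates the regexes, not this script):

```shell
# Approximate the two devnode blacklist regexes with shell globs.
is_blacklisted() {
    case "$1" in
        ram*|raw*|loop*|fd*|md*|dm-*|sr*|scd*|st*) echo yes ;;
        hd*|xvd*|vd*) echo yes ;;
        *) echo no ;;
    esac
}
is_blacklisted sda    # SAN path: matched by no devnode rule -> no
is_blacklisted loop0  # loop device -> yes
```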
I added "path_grouping_policy failover" because of your message. I also noticed that you have path_grouping_policy specified twice; is this on purpose? Also, when I activate 'hardware_handler "1 rdac"', the box does not boot any more, with some rdac driver error message that I can't catch since it scrolls by too fast...
with the above multipath.conf it gets even stranger:
[root@dev-db1 ~]# multipath -ll
sdd: checker msg is "readsector0 checker reports path is down"
sdh: checker msg is "readsector0 checker reports path is down"
sdi: checker msg is "readsector0 checker reports path is down"
sdj: checker msg is "readsector0 checker reports path is down"
C9 Inquiry of device </dev/sda> failed.
C9 Inquiry of device </dev/sde> failed.
mpath0 (3600144f0fdf58b5c00004bc738070001) dm-0 SUN,Sun Storage 7310
[size=50G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 0:0:1:0 sda 8:0  [active][ready]
 \_ 1:0:0:0 sde 8:64 [active][ready]
however, the system "lives happily", despite the error messages.
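Since the blades boot from SAN, I assume a change like the hardware_handler also has to land in the initrd, where multipath.conf and the handler module live at boot time; a sketch of what I would try (module name scsi_dh_rdac assumed from recent 5.x kernels, paths assumed standard):

```shell
# Rebuild the initrd so the updated /etc/multipath.conf and the rdac
# handler module are available before the root filesystem is mounted.
mkinitrd -f --with=scsi_dh_rdac /boot/initrd-$(uname -r).img $(uname -r)
```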
Jens Neu Health Services Network Administration
Phone: +49 (0) 30 68905-2412 Mail: jens.neu@biotronik.de
"Alexander Dalloz" <ad+lists@uni-x.org> wrote on 05/27/2010 04:47 PM
To: CentOS mailing list <centos@centos.org>
Subject: Re: [CentOS] Multipathing with Sun 7310
_______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos