[CentOS] Multipathing with Sun 7310

> Dear list,
>
> we have a relatively new Sun Storage 7310, where we connect CentOS 5.5
> Servers (IBM LS21/LS41 Blades) via Brocade Switches, 4GBit FC. The Blades
> boot from SAN via qla2xxx, and have no hard disks at all. We want them to
> use multipathing from the very beginning, so /boot and / are already seen
> by multipathd. The problem is that the Sun 7310 has two storage heads which
> run in active/passive mode, BUT multipathd thinks they are
> active/active and therefore shows half the available paths as faulty
> (multipath -ll below).
> While this probably gives me the redundancy that is desired, it is a
> relatively messy situation, since it will be unnecessarily hard to detect
> real path failures and the OS is complaining about "readsector0 checker
> reports path is down", which gives me >40M/24h of /var/log/messages garbage.
> Any hints for a reasonable configuration? Unfortunately the Sun 7310 is
> rather new, so almost nothing shows up on google... even less for
> RHEL/CentOS :-(
>
> regards from Berlin
> Jens
>
> [root@dev-db1 tmp]# multipath -ll
> sdaa: checker msg is "readsector0 checker reports path is down"
> sdab: checker msg is "readsector0 checker reports path is down"
> sdac: checker msg is "readsector0 checker reports path is down"
> sdad: checker msg is "readsector0 checker reports path is down"
> sdd: checker msg is "readsector0 checker reports path is down"
> sdh: checker msg is "readsector0 checker reports path is down"
> sdl: checker msg is "readsector0 checker reports path is down"
> sdp: checker msg is "readsector0 checker reports path is down"
> sdq: checker msg is "readsector0 checker reports path is down"
> sdr: checker msg is "readsector0 checker reports path is down"
> sds: checker msg is "readsector0 checker reports path is down"
> sdt: checker msg is "readsector0 checker reports path is down"
> sdu: checker msg is "readsector0 checker reports path is down"
> sdv: checker msg is "readsector0 checker reports path is down"
> sdx: checker msg is "readsector0 checker reports path is down"
> sdz: checker msg is "readsector0 checker reports path is down"
> mpath0 (3600144f0fdf58b5c00004bc738070001) dm-0 SUN,Sun Storage 7310
> [size=50G][features=0][hwhandler=0][rw]
> \_ round-robin 0 [prio=1][active]
>  \_ 0:0:1:0 sda  8:0    [active][ready]
> \_ round-robin 0 [prio=1][enabled]
>  \_ 1:0:0:0 sde  8:64   [active][ready]
> \_ round-robin 0 [prio=1][enabled]
>  \_ 0:0:2:0 sdi  8:128  [active][ready]
> \_ round-robin 0 [prio=1][enabled]
>  \_ 1:0:1:0 sdm  8:192  [active][ready]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 0:0:3:0 sdq  65:0   [failed][faulty]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 1:0:2:0 sdr  65:16  [failed][faulty]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 0:0:4:0 sdx  65:112 [failed][faulty]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 1:0:3:0 sdz  65:144 [failed][faulty]

Hello Jens,

It looks like your multipathing setup is using the multibus
path_grouping_policy. Did you create a custom /etc/multipath.conf? You
should. As your device is new, device-mapper-multipath does not yet ship
proper defaults for it, so you will have to set the path_grouping_policy
explicitly.

devices {
         device {
                 vendor "SUN"
                 # product string: see REMARK below
                 product "Sun Storage 7310"
                 getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                 prio_callout "/sbin/mpath_prio_rdac /dev/%n"
                 features "0"
                 hardware_handler "1 rdac"
                 path_grouping_policy group_by_prio
                 failback immediate
                 rr_weight uniform
                 no_path_retry queue
                 rr_min_io 1000
                 path_checker rdac
         }
}

REMARK: to determine the proper product name, run for instance:

          cat /sys/block/sda/device/model
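
With as many paths as you have, checking each sdX by hand gets tedious.
A rough sketch that walks all SCSI disks and prints the strings in a form
you can compare with the vendor/product lines in multipath.conf. The
function name list_scsi_models and its optional sysfs-root argument are
made up for this example, and it assumes the usual sysfs layout with
device/vendor and device/model files:

```shell
#!/bin/sh
# Print vendor and product strings for every sd* disk under a sysfs root.
# list_scsi_models and its argument are illustrative, not a standard tool.
list_scsi_models() {
    sysroot="${1:-/sys}"
    for dev in "$sysroot"/block/sd*; do
        # Skip when the glob matched nothing or there is no model file.
        [ -r "$dev/device/model" ] || continue
        vendor=""
        [ -r "$dev/device/vendor" ] && read -r vendor < "$dev/device/vendor"
        # read strips the space padding that sysfs appends to these strings.
        read -r model < "$dev/device/model"
        printf '%s: vendor="%s" product="%s"\n' "${dev##*/}" "$vendor" "$model"
    done
}
```

Called without an argument it reads from /sys; all paths to the same
array should report identical strings.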

Restart multipathd after editing multipath.conf; I expect
`multipath -ll' or `multipathd -k"show paths"' will then show you the
paths correctly weighted.
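
To compare before and after without reading the whole listing, something
like the following could tally the path states. It is only a sketch: the
function name count_path_states is invented here, and it assumes the
output format shown in your mail, where each path line ends in a pair
such as [active][ready] or [failed][faulty].

```shell
#!/bin/sh
# Tally path states from `multipath -ll` output supplied on stdin.
# count_path_states is an illustrative helper, not part of multipath-tools.
count_path_states() {
    awk '
        /\[active\]\[ready\]/  { ready++ }
        /\[failed\]\[faulty\]/ { faulty++ }
        END { printf "ready=%d faulty=%d\n", ready, faulty }
    '
}
```

Usage would be `multipath -ll | count_path_states'; the path group lines
such as [prio=1][active] are not counted, only the individual paths.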

Regards

Alexander