[CentOS] CentOS and Dell MD3200i / MD3220i iSCSI w/ multipath -- slightly OT

Peter Gillich pgillich at gmail.com
Fri Jan 28 21:38:59 UTC 2011


Oops, I missed a citation:

"We normally see these when the target has multiple controllers and one
of the controllers is in passive mode where it would normally be used to
fail over to, or when one controller is being upgraded or when some
management operation is being executed on a controller."
http://groups.google.com/group/open-iscsi/browse_thread/thread/db0ab3907819c48b
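
To see which paths currently go through the passive controller, the multipath listing can be filtered. This is only a minimal sketch: it assumes the path checker marks such paths with the word "ghost" (as multipath-tools does for active/passive arrays), and the helper name is made up for illustration.

```shell
#!/bin/sh
# count_ghost_paths: count the lines on stdin that mention a path in
# "ghost" state (assumed here to mean a path via the passive controller).
# Always prints a number; "|| true" hides grep's non-zero exit on 0 matches.
count_ghost_paths() {
    grep -c 'ghost' || true
}

# On a live initiator (as root, with device-mapper-multipath installed):
#   multipath -ll | count_ghost_paths
```

A non-zero count on an otherwise healthy setup would fit the explanation quoted above, but the exact listing layout differs between multipath versions, so check the raw output first.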

Peter

On Fri, Jan 28, 2011 at 22:32, Peter Gillich <pgillich at gmail.com> wrote:
> Hi Ed,
>
> Persistent reservation is a SCSI-3 feature. It's useful in a
> cluster environment, where the registered nodes are allowed to access
> a device while access from all other nodes is blocked.
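
The reservations on a LUN can be inspected with sg_persist from the sg3_utils package. A rough sketch only: /dev/sdX is a placeholder, and the wrapper function is hypothetical, not something from the packages mentioned in this thread.

```shell
#!/bin/sh
# show_reservations: print the registered keys and the current
# reservation for one SCSI device, using sg_persist (sg3_utils).
show_reservations() {
    dev="$1"
    # Refuse anything that is not a block device before calling sg_persist.
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 2
    fi
    sg_persist --in --read-keys --device="$dev" &&
    sg_persist --in --read-reservation --device="$dev"
}

# Example (placeholder device name, run as root):
#   show_reservations /dev/sdX
```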
>
> To disable the iSCSI offload feature, disable the Broadcom iSCSI driver
> (bnx2i), for example:
> echo "install bnx2i /bin/true" > /etc/modprobe.d/blacklist-broadcom
> In this case, only the bnx2 module will be loaded. You can check this
> with lsmod, modinfo and dmesg. Of course, the CPU load will
> increase.
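
The lsmod check can be scripted. A minimal sketch, assuming the usual lsmod layout with the module name in the first column; the function name is only illustrative.

```shell
#!/bin/sh
# module_loaded: succeed if the module named in $1 appears in the
# lsmod-style listing read from stdin (module name = first column).
# An exact match is used so that "bnx2" does not match "bnx2i".
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# On a live system, after a reboot or after unloading the module:
#   lsmod | module_loaded bnx2i && echo "bnx2i still loaded" \
#                               || echo "bnx2i not loaded"
#   lsmod | module_loaded bnx2  && echo "bnx2 loaded"
```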
>
> BR,
>
> Peter
>
> On Tue, Jan 25, 2011 at 05:37, Dr. Ed Morbius <dredmorbius at gmail.com> wrote:
>> on 07:48 Sun 23 Jan, Peter Gillich (pgillich at gmail.com) wrote:
>>> Hi,
>>> Last summer, I had the same problems with the Dell + CentOS +
>>> multipath combination, for example I/O errors and stability problems
>>> on the initiator machines. The initiator machines are (in a Pacemaker
>>> cluster):
>>> - Dell R310
>>> - Broadcom 5709 Gigabit Ethernet card (4-port)
>>> - CentOS 5.4
>>> - 2 Ethernet ports on initiator machines, 2 Ethernet ports in target
>>> machines --> 4 iSCSI paths per initiator
>>>
>>> Independently of iSCSI, we hit the Broadcom MSI-X interrupt problem
>>> (fixed in RHEL/CentOS 5.5). We hit more (iSCSI) problems with
>>> Broadcom cards, which are described on a Dell support page:
>>> http://support.dell.com/support/edocs/software/rhel_mn/rhel5_4/en/index.htm
>>
>> Not familiar with this, though we're using Broadcom NICs, four per host
>> for the most part:
>>
>>    01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>>            Subsystem: Dell PowerEdge R610 BCM5709 Gigabit Ethernet
>>            Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>>            Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>>            Latency: 0, Cache Line Size: 64 bytes
>>            Interrupt: pin A routed to IRQ 98
>>            Region 0: Memory at d6000000 (64-bit, non-prefetchable) [size=32M]
>>            Capabilities: [48] Power Management version 3
>>                    Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                    Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>>            Capabilities: [50] Vital Product Data
>>
>> We're bonding two NICs together on each of our core and management nets,
>> iSCSI traffic is on the management net.
>>
>> (VMs are set to use E1000, single interface per subnet).
>>
>>> Since CentOS is a recompiled RHEL, all RHEL problems and
>>> solutions apply to CentOS ;-)
>>> The Broadcom driver source code changes frequently. Red Hat follows
>>> the Broadcom kernel drivers and iscsi-initiator-utils with a lag of
>>> some months. CentOS follows Red Hat with a lag of some days/weeks/months.
>>>
>>> Maybe you can find a solution for your problem on a newer Dell support
>>> page: http://support.dell.com/support/edocs/software/rhel_mn/rhel5_5/en/index.htm
>>> Or here:
>>> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/DM_Multipath
>>> http://opensource.marshall.edu/papers/rhel5-iscsi-HOWTO.pdf
>>
>> That Marshall.edu doc looks pretty good.  I'll note that if you're
>> expecting to mount your network devices at boot, having the netdev
>> service running will help (we ran into this issue, repeatedly, thanks to
>> a puppet config ;-).
>>
>>> Some tips:
>>> - I've read somewhere that iSCSI multipath I/O errors can be
>>> normal behavior in a multipath environment at boot time. (?)
>>
>> That has been our experience to date.
>>
>>> - Persistent reservation might be useful against iSCSI multipath I/O errors.
>>
>> What's persistent reservation?
>>
>>> - Disabling the iSCSI offload feature (for example: iSCSI over Broadcom)
>>> and the TCP offload feature (for example: NFS over Intel) may help.
>>
>> How does one do this / check for this?
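
For the TCP side, ethtool -k lists the offload settings of an interface and ethtool -K changes them. A small sketch, assuming the interface is eth0 and that TCP segmentation offload (TSO) is the setting of interest; the helper name is made up.

```shell
#!/bin/sh
# tso_enabled: succeed if the `ethtool -k` output on stdin shows TCP
# segmentation offload switched on. Accepts both the older label
# "tcp segmentation offload" and the newer "tcp-segmentation-offload".
tso_enabled() {
    grep -Eiq '^[[:space:]]*tcp[- ]segmentation[- ]offload:[[:space:]]*on'
}

# On a live system (eth0 is a placeholder, run as root):
#   ethtool -k eth0 | tso_enabled && ethtool -K eth0 tso off
```

Settings changed with ethtool -K do not survive a reboot on their own, so they need to be reapplied from the interface configuration.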
>>
>>> - The iSCSI kernel drivers and iscsi-initiator-utils must be updated together.
>>
>> We'll keep this in mind.
>>
>>>
>>> Finally, some comments:
>>> - Never use a Broadcom GbE card. Intel might be better (mostly)
>>
>> I think we're stuck with 'em.  Dell seems to have been shipping with
>> Broadcom for some years.  Early experiences were horrible, lately it's
>> been getting better, but I'm still leery of the brand.
>>
>>> - Dell is a hardware manufacturer (supplier), not an
>>> OS/driver/utility developer. If you would like to get more support,
>>> you may buy RHEL licenses (with the Dell hardware or from Red Hat).
>>> Sometimes that's cheaper than spending days on a problem (but sometimes
>>> not).
>>
>> Yeah.  We've got a single RH license at this point and it does let us
>> into RH's knowledgebase, though there hasn't been a whole lot there
>> either.
>>
>>> - IBM compiles the latest Broadcom driver if required, see:
>>> http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5073130
>>> - Some Dell hardware has only x86_64 Red Hat certification. See:
>>> https://hardware.redhat.com/show.cgi?id=632145 (R310 + RHEL 6)
>>>
>>> BR,
>>>
>>> Peter
>>
>> Thanks, Peter, very helpful.
>>
>>> On Sat, Jan 22, 2011 at 18:36, Rajagopal Swaminathan
>>> <raju.rajsand at gmail.com> wrote:
>>> >
>>> > Greetings,
>>> >
>>> > On 1/22/11, Edward Morbius <dredmorbius at gmail.com> wrote:
>>> > > CentOS is not a Dell-supported configuration, and we've had little helpful
>>> > > advice from Dell.  There's been some amount of FUD in that Dell don't seem
>>> > > to know what Dell's own software installation (the md3
>>> > >
>>> > > Dell doesn't seem to have much OS experience generally.
>>> > >
>>> >
>>> > +1
>>> >
>>> > It is to be expected from Dell as they outsource support to non-"equal
>>> > opportunity" employers who do not hire support agents beyond 40 years
>>> > of age (per HR).
>>> >
>>> > Above fact. below imho
>>> >
>>> > Now, experience often helps reach the source of the problem much
>>> > faster than the fast-talking street-smart agents who have proliferated.
>>> >
>>> > It is sad that IT industry treats its early community members so callously.
>>> >
>>> > I don't know but Dell seems to be headed the Sun way -- open for
>>> > takeover by HP/IBM
>>> >
>>> > Above imho.
>>> >
>>> > Regards,
>>> >
>>> > Rajagopal
>>> > _______________________________________________
>>> > CentOS mailing list
>>> > CentOS at centos.org
>>> > http://lists.centos.org/mailman/listinfo/centos
>>
>> --
>> Dr. Ed Morbius
>> Chief Scientist
>> Krell Power Systems Unlimited
>>
>


