[CentOS] iSCSI best practices

Mon Dec 12 14:43:55 UTC 2011
Digimer <linux at alteeve.com>

On 12/12/2011 09:13 AM, Rudi Ahlers wrote:
> On Mon, Dec 12, 2011 at 3:52 PM, Reindl Harald <h.reindl at thelounge.net> wrote:
>>
>>
>> On 12.12.2011 14:49, lhecking at users.sourceforge.net wrote:
>>>
>>>> Outage is one thing, but having the disk volumes disappear mid-transaction can be detrimental to a file system's health.
>>>
>>>  To get this back on-topic and closer to the OP's requests, are there any
>>>  particular iscsi settings one should consider to increase resiliency and
>>>  minimise the impact of e.g. a rebooting switch? timeout settings? The
>>>  big disadvantage of iscsi is that you add another layer that can fail
>>>  (compared to having virtual machine images on a local disk).
>>
>> you should always have two links to your iSCSI device and two
>> different switches so that it does not matter if one switch
>> dies or reboots
>>
>>
> 
> 
> And then you still have the iSCSI appliance / server to worry about.
> It can fail as well. Even with redundant PSUs it could fail - the
> RAM, CPU, motherboard, controller card, expensive RAID card, etc. can
> fail as well.

I handle this by setting up two servers running DRBD in active/active
mode, with a simple two-node Red Hat cluster managing a floating IP
address. The storage network link uses a simple Active/Passive (mode=1)
bond, with each link going to a separate switch.
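
As a rough illustration of checking such a bond, here is a minimal
Python sketch that parses the status text the Linux bonding driver
exposes under /proc/net/bonding/ and reports whether both links are
still up. The function names, the SAMPLE text and the interface names
eth0/eth1 are assumptions made for the example, not part of my actual
setup.

```python
# Trimmed sample of the "Key: value" status text found in
# /proc/net/bonding/<bond>. Interface names are hypothetical.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
"""


def parse_bond_status(text):
    """Summarise bonding status text as a dict (mode, active slave, per-slave link)."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # slave section we are currently inside, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "Key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only record link state once inside a slave section;
            # the earlier "MII Status" line describes the bond itself.
            status["slaves"][current] = value
    return status


def is_redundant(status):
    """True if the bond is active-backup and at least two slaves report link up."""
    up = [name for name, mii in status["slaves"].items() if mii == "up"]
    return "active-backup" in (status["mode"] or "") and len(up) >= 2


if __name__ == "__main__":
    summary = parse_bond_status(SAMPLE)
    print(summary["active_slave"], is_redundant(summary))
```

On a real node one would pass in the contents of the bond's status file
instead of SAMPLE. A check like this only covers the link layer; it
says nothing about the health of the storage servers behind it.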

I've been able to export the SAN to a second cluster, managing the SAN
space using clustered LVM, to back VMs. I can live-migrate the VMs
between nodes and totally fail the primary SAN, and the backup will pick
up seamlessly; the VMs don't need to reboot.

This offers a substantially less expensive HA option than some
commercial HA SAN solutions and avoids the headaches of multipath (which
only makes the link redundant, not the SAN itself).

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron