On 11/17/2012 09:58 PM, Steven Crothers wrote:
> On Sat, Nov 17, 2012 at 6:23 PM, Digimer <lists at alteeve.ca
> <mailto:lists at alteeve.ca>> wrote:
>
>     You could take two nodes, setup DRBD to replicate the data
>     (synchronously), manage a floating/virtual IP in pacemaker or rgmanager
>     and export the DRBD storage as an iSCSI LUN using tgtd. Then you can
>     migrate to the backup node, take down the primary node for maintenance
>     and restore with minimal/no downtime. Run this over mode=1 bonding with
>     each leg on two different switches and you get network HA as well.
>
>
> There is nothing active/active about DRBD though, it also doesn't solve
> the problem of trying to utilize two heads.
>
> It's just failover. Nothing more.
>
> I'm looking for an active/active failover scenario, to utilize the
> multiple physical paths for additional throughput and bandwidth. Yes, I
> know I can add more nics. More nics doesn't provide failover of the
> physical node.

First, you can run DRBD in dual-primary (aka, Active/Active) just fine.
It will faithfully replicate in real time and in both directions. Of
course, then you need something to synchronize the data at the logical
level (DRBD is just a block device), and that is where GFS2 or OCFS2
comes in, though the performance hit will go counter to your goals.

You could do multi-path to both nodes, technically, but it's not wise
because the cache on the storage can cause problems[1].

Also, you will note that I suggested mode=1, which is Active/Passive
bonding, which provides no aggregated bandwidth. This was on purpose;
I've tested all modes and *only* mode=1 failed and recovered without
interruption reliably.

As for failover, if you run DRBD in dual-primary, but keep access
through one node at a time only, the only thing that is needed to
migrate after the failure of the node that had the IP is to fence the
node, take over the IP and start tgtd. This can happen quickly and, in
my tests, iSCSI on the clients recovered fine.
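To make the ordering of that takeover concrete, here is a rough sketch
in Python. It is a toy model only: the Node class, the fail_over()
function and the fence callback are names made up for illustration, and
nothing here talks to a real fence agent, pacemaker, rgmanager or tgtd.
The one point it tries to show is that the IP and the target move only
after fencing has been confirmed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Toy stand-in for a storage node; every field is hypothetical."""
    name: str
    has_ip: bool = False        # holds the floating/virtual IP
    tgtd_running: bool = False  # iSCSI target is exporting the LUN


def fail_over(failed: Node, survivor: Node,
              fence: Callable[[Node], bool]) -> List[str]:
    """Move the iSCSI service: fence, then take the IP, then start tgtd."""
    steps = []

    # 1. Fence the failed node. Without a confirmed fence we cannot know
    #    it has stopped serving the LUN, so nothing else may proceed.
    if not fence(failed):
        raise RuntimeError(f"fencing {failed.name} failed; not taking over")
    failed.has_ip = False
    failed.tgtd_running = False
    steps.append(f"fence {failed.name}")

    # 2. The survivor claims the floating/virtual IP.
    survivor.has_ip = True
    steps.append(f"take over IP on {survivor.name}")

    # 3. Start the target last, so clients reconnect to a ready export.
    survivor.tgtd_running = True
    steps.append(f"start tgtd on {survivor.name}")
    return steps


if __name__ == "__main__":
    node1 = Node("node1", has_ip=True, tgtd_running=True)
    node2 = Node("node2")
    # Pretend the fence call succeeded.
    for step in fail_over(node1, node2, fence=lambda n: True):
        print(step)
```

In a real cluster the resource manager drives these steps for you; the
sketch is only meant to show why a failed fence has to block the rest.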
In my case, I had the LUNs acting as PVs in a clustered LVM with each LV
backing a VM. None of the VMs failed or needed to be rebooted.

So from what I can gather of your needs, you can get everything you want
from open-source. The only caveat is that if you need more speed, you
need to beef up your network, not aggregate (for reasons not related to
HA). If this is not good enough, then there are plenty of commercial
products ready to lighten your wallet by a good measure.

Digimer

1.
http://fghaas.wordpress.com/2011/11/29/dual-primary-drbd-iscsi-and-multipath-dont-do-that/

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?