On Thu, Jul 5, 2018 at 7:10 PM, Digimer <lists@alteeve.ca> wrote:
First of all, thanks for all your answers, all useful in one way or another. I have yet to dig sufficiently deep into Warren's considerations, but I will do it, I promise! Very interesting arguments.

Alexander's concerns are true in an ideal world, but when your role is to be an IT consultant and you are not responsible for the budget and for the department, it is not so easy to convince people about emerging concepts that, due to their nature, are not so rock solid and accepted (yet). In my working life I have had the fortune to be on both sides of the IT chair, so I think I'm able to see all the points of view. E.g. in 2004 I was the IT manager of a small company (without responsibility for the budget, so I had to convince my CEO at that time; company revenue about 50 million euros) and I migrated the physical environment to VMware and a Dell CX300 SAN, but it was not so easy, believe me. I left the company at the end of 2007 and the same untouched 3-year-old environment ran for another 4 years without any modification or problems. And bear in mind that, at least in Italy in 2004, it wasn't such a common environment to set up for production.
> I always prioritize simplicity and isolation, so I vote for 2x 2-node.
> There is no effective benefit to 3+ nodes (quorum is arguably helpful, but proper stonith, which you need anyway, makes it mostly a moot point).
In this particular scenario I run several Oracle RDBMS instances. They are currently distributed as 3 big ones on the first cluster and the other 7 smaller ones on the second, with room to grow. So in my case I think I can spread the load better and get better high availability.
> Keep in mind: if your services are critical enough to justify an HA cluster, they're probably important enough that adding the complexity/overhead of larger clusters doesn't offset any hardware efficiency savings.
Probably true for the old RHCS stack based on CMAN/rgmanager, but from various tests it seems Corosync/Pacemaker is much smoother at managing clusters of more than 2 nodes.
> Lastly, with 2x 2-node, you could lose two nodes (one per cluster) and still be operational. If you lose 2 nodes of a four-node cluster, you're offline.
This is true with the default configuration, but you can configure Auto Tie Breaker (ATB), as you can see with "man votequorum", or on an example web page here: https://www.systutorials.com/docs/linux/man/5-votequorum/
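For reference, the quorum section of corosync.conf with ATB enabled should look more or less like this (a minimal sketch based on votequorum(5); expected_votes and the auto_tie_breaker_node value are only illustrative):

    quorum {
        provider: corosync_votequorum
        expected_votes: 4
        # on an even split, let one predetermined partition keep quorum;
        # by default the partition containing the lowest node ID wins
        auto_tie_breaker: 1
        auto_tie_breaker_node: lowest
    }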
I just tested and verified it on my virtual 4-node cluster based on CentOS 7.4, where I have:
- modified corosync.conf on all nodes
- pcs cluster stop --all
- pcs cluster start --all
- waited a few minutes for resources to start
- shut down cl3 and cl4
and this is the situation at the end, without downtime and with the cluster quorate:
[root@cl1 ~]# pcs status
Cluster name: clorarhv1
Stack: corosync
Current DC: intracl2 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Sat Jul 7 15:25:47 2018
Last change: Thu Jul 5 18:09:52 2018 by root via crm_resource on intracl2

4 nodes configured
15 resources configured

Online: [ intracl1 intracl2 ]
OFFLINE: [ intracl3 intracl4 ]
Full list of resources:
 Resource Group: DB1
     LV_DB1_APPL   (ocf::heartbeat:LVM):          Started intracl1
     DB1_APPL      (ocf::heartbeat:Filesystem):   Started intracl1
     LV_DB1_CTRL   (ocf::heartbeat:LVM):          Started intracl1
     LV_DB1_DATA   (ocf::heartbeat:LVM):          Started intracl1
     LV_DB1_RDOF   (ocf::heartbeat:LVM):          Started intracl1
     LV_DB1_REDO   (ocf::heartbeat:LVM):          Started intracl1
     LV_DB1_TEMP   (ocf::heartbeat:LVM):          Started intracl1
     DB1_CTRL      (ocf::heartbeat:Filesystem):   Started intracl1
     DB1_DATA      (ocf::heartbeat:Filesystem):   Started intracl1
     DB1_RDOF      (ocf::heartbeat:Filesystem):   Started intracl1
     DB1_REDO      (ocf::heartbeat:Filesystem):   Started intracl1
     DB1_TEMP      (ocf::heartbeat:Filesystem):   Started intracl1
     VIP_DB1       (ocf::heartbeat:IPaddr2):      Started intracl1
     oracledb_DB1  (ocf::heartbeat:oracle):       Started intracl1
     oralsnr_DB1   (ocf::heartbeat:oralsnr):      Started intracl1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@cl1 ~]#
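If you want to double-check the quorum side as well, corosync-quorumtool (or its pcs wrapper) should show the expected votes, the total votes and the flags in effect (e.g. AutoTieBreaker when ATB is enabled); I'm not pasting the output here, but something like:

    [root@cl1 ~]# corosync-quorumtool -s
    [root@cl1 ~]# pcs quorum status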
This 4-node scenario also seems suitable for my needs, because each of the current 2-node clusters is a stretched one, with one node in site A and the other in site B. The future scenario will see 2 nodes of the new cluster in site A and 2 nodes in site B, so that the failure of a site will take out 2 nodes; but with the setting above I can still provide all 10 RDBMS services, spread across the two surviving nodes, and decide where to put them instead of being forced onto a single node.
BTW: there is also the last_man_standing option I can set, so that I can also tolerate the loss of site B and, while that is not yet resolved, the loss of one of the two surviving nodes in site A (in this case I would possibly disable some less critical services or tolerate degraded performance).
In this case the configuration would be (not tested yet):
    quorum {
        provider: corosync_votequorum
        expected_votes: 4
        last_man_standing: 1
        auto_tie_breaker: 1
    }
Note that last_man_standing is applied only on node loss and not when a node leaves the cluster in a clean state, for which there is the "allow_downscale" option, which seems not to be fully supported at this moment.
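As far as I know, on CentOS 7 pcs should also be able to push these votequorum options for you instead of hand-editing corosync.conf on every node, something like the following (untested on my side, and if I remember correctly pcs wants the cluster stopped while the quorum options are changed; check "pcs quorum update --help"):

    pcs cluster stop --all
    pcs quorum update auto_tie_breaker=1 last_man_standing=1
    pcs cluster start --all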
Cheers, Gianluca