Hi,

up until now I've always deployed VMs with their storage located directly on the host system. As the number of VMs grows and the hardware becomes powerful enough to handle more virtual machines per host, I'm concerned about a single host failure taking down too many VMs in one go. As a result I'm now looking at moving to an infrastructure that uses shared storage instead, so I can live-migrate VMs or restart them quickly on another host if the one they are running on dies.

The problem is that I'm not sure how to go about this bandwidth-wise. What I'm aiming for as a starting point is a cluster of 3-4 hosts with about 10 VMs on each host, and a two-node DRBD-based cluster as a redundant storage backend.

The question that bugs me is how I can get enough bandwidth between the hosts and the storage to provide the VMs with reasonable I/O performance. If all 40 VMs start copying files at the same time, the bandwidth share for each VM would be tiny. Granted, this is a worst-case scenario, which is why I want to ask whether someone here has experience with such a setup and can give recommendations or comment on alternative setups. Would I maybe get away with 4 bonded Gbit Ethernet ports? Would I require Fibre Channel or 10 Gbit infrastructure?
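
To put a rough number on that worst case, here is a minimal back-of-the-envelope sketch. It assumes the storage backend's uplink is the only bottleneck, that all 40 VMs are active at once and split the link evenly, and that the link delivers its full nominal rate (no protocol overhead, no DRBD replication traffic on the same ports). The function and option names are made up for illustration.

```python
def per_vm_share_mb_s(link_gbit, links, active_vms):
    """Raw per-VM share in MB/s if active_vms split the storage uplink evenly.

    Assumes nominal line rate and decimal units (1 Gbit/s = 1000 Mbit/s).
    """
    total_mbit_s = link_gbit * 1000 * links  # aggregate nominal rate in Mbit/s
    total_mb_s = total_mbit_s / 8            # 8 bits per byte
    return total_mb_s / active_vms


# Hypothetical uplink options: (speed per link in Gbit/s, number of links)
OPTIONS = {
    "4 x 1 Gbit bonded": (1, 4),
    "1 x 10 Gbit": (10, 1),
}


def shares(active_vms=40):
    """Per-VM share for each uplink option, keyed by option name."""
    return {
        name: per_vm_share_mb_s(gbit, links, active_vms)
        for name, (gbit, links) in OPTIONS.items()
    }


if __name__ == "__main__":
    for name, share in shares().items():
        print(f"{name}: {share:.2f} MB/s per VM")
```

Under these assumptions the share works out to 12.5 MB/s per VM for four bonded Gbit ports and 31.25 MB/s per VM for a single 10 Gbit link. These are upper bounds: real throughput will be lower, and depending on the bonding mode a single connection may not be able to use more than one physical link's worth of bandwidth.
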
Regards, Dennis
PS: The Sheepdog project (http://www.osrg.net/sheepdog/) looks interesting in that regard, but it apparently is still far from production-ready.