Hi, What is the best way to deal with I/O load when running several VMs on a physical machine with local or remote storage?
What I'm primarily worried about is the case when several VMs cause disk I/O at the same time. One example would be the "updatedb" cronjob of the mlocate package. If you have, say, 5 VMs running on a physical system with a local software raid-1 as storage and they all run updatedb at the same time, all of them run really slowly because they starve each other fighting over the disk.
What is the best way to soften the impact of such a situation? Does it make sense to use a hardware raid instead? How would the raid type affect the performance in this case? Would the fact that the I/O load gets distributed across multiple spindles in, say, a 4 disk hardware raid-5 have a big impact on this?
My fear is that random disk I/O from too many VMs on one physical system could cripple their performance even though I have plenty of CPU cores and RAM left to run them.
Does anyone have experience with this problem, and maybe some data to shed light on this potential bottleneck for virtualization?
Regards, Dennis
Dennis J. wrote:
What I'm primarily worried about is the case when several VMs cause disk I/O at the same time. One example would be the "updatedb" cronjob of the mlocate package. If you have, say, 5 VMs running on a physical system with a local software raid-1 as storage and they all run updatedb at the same time, all of them run really slowly because they starve each other fighting over the disk.
In this particular case, I would edit /etc/crontab on each VM and move the update to another time instead of the traditional 4:02.
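As a rough illustration of the staggering idea, here is a minimal sketch that prints one /etc/crontab entry per guest. It assumes the stock CentOS 5 daily entry (run-parts /etc/cron.daily at 4:02); the 0-based guest index and the 15-minute spacing are made up for the example, and the function name is hypothetical.

```shell
#!/bin/sh
# Print a staggered /etc/crontab entry for the daily jobs of guest number N.
# Assumptions: guests are numbered from 0 and spaced 15 minutes apart,
# starting at the traditional 4:02.
staggered_daily_entry() {
    index=$1
    spacing=15
    total=$(( 4 * 60 + 2 + index * spacing ))   # minutes since midnight
    hour=$(( (total / 60) % 24 ))
    minute=$(( total % 60 ))
    printf '%02d %d * * * root run-parts /etc/cron.daily\n' "$minute" "$hour"
}

# Entries for five guests; each line would go into that guest's /etc/crontab.
for i in 0 1 2 3 4; do
    staggered_daily_entry "$i"
done
```

Guest 0 keeps the default slot and every further guest starts 15 minutes later, so the updatedb runs no longer begin at the same moment. Whether 15 minutes is enough depends on how long updatedb takes on the storage in question.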
On 07/27/2009 03:17 PM, Manuel Wolfshant wrote:
In this particular case, I would edit /etc/crontab on each VM and move the update to another time instead of the traditional 4:02.
My problem with that solution is that it puts assumptions about the host environment into the guest. If, for example, I migrate that guest to another physical host, I actually have to remember to modify that entry, and maybe others, to fit into an appropriate time slot on the new host machine.
I'm aware that you cannot make the pain of 5 VMs hammering the disk in parallel go away completely, but I'm interested in finding out what can be done to reduce that pain as much as possible without making the guest too aware of its hosting environment.
Regards, Dennis
I am certainly no expert on Xen. I have read through the docs and various threads concerning I/O demands and have the impression that there are a few primary factors to work with (please correct me if I'm wrong). My comprehension is far from complete.
1 - Select which scheduler, weight and cap to use. Some favor computation over I/O and vice versa. If your domUs are "fighting" over disk I/O, the scheduler is the referee and enforces the rules YOU choose to keep things as "fair" and efficient as possible.
Suggested read, document date apparently Jul-18-2009: http://cseweb.ucsd.edu/~dgupta/papers/per07-3sched-xen.pdf
Xen 3.3 User's Manual http://bits.xensource.com/Xen/docs/user.pdf
2 - Given that you can throw more RAM and CPU at each dom than it needs (not CPU or memory constrained), I/O hardware is important. Roughly speaking, the more disk controllers in play, the more bandwidth you have. Likewise, the faster the disks and perhaps the bigger their caches the faster they can offload the task. (My experience has shown me that larger built-in disk cache size substantially improves performance on email servers.)
3 - From what I have gleaned on forums and lists, there may be a relationship or synergy between the scheduler selected and the type of disk array in use. HYPOTHETICALLY, a RAID1 and a RAID5 in a given situation may each show their best performance under different Xen scheduling.
I don't think there is a right "answer" for you without running tests on your own setup and seeing what works best. From what I have read, the I/O test should be a task that runs longer than 30 seconds. I would test what I think would be the longest possible duration: dd a huge file while hammering the other domUs with simulated "normal" requests (e.g. Apache ab running against websites on the other domUs).
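A minimal sketch of such a test run inside one domU could look like the following. The target path, the tiny default size and the function name are placeholders, and the ab URL in the closing comment is hypothetical; for a real measurement the size would have to be large enough to keep the disk busy well beyond 30 seconds.

```shell
#!/bin/sh
# Sequential write load for one domU; prints the elapsed time in seconds.
# Arguments: target file, size in MiB (both are placeholders here).
write_load() {
    target=$1
    size_mib=$2
    start=$(date +%s)
    # Write zeros in 1 MiB blocks, then flush the page cache to disk.
    dd if=/dev/zero of="$target" bs=1048576 count="$size_mib" 2>/dev/null || return 1
    sync
    end=$(date +%s)
    rm -f "$target"
    echo $(( end - start ))
}

# Tiny default so the sketch is harmless to try out.
elapsed=$(write_load "${TMPDIR:-/tmp}/ddtest.$$" 8)
echo "dd run took ${elapsed}s"

# Meanwhile, from another machine, something like
#   ab -n 10000 -c 10 http://other-domu.example/
# would simulate "normal" web requests against a second domU.
```

Comparing the ab results with and without the dd load running gives a first impression of how much one busy guest slows down the others on a given storage setup.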
YMMV, and you may not know what to do until an actual issue arises after deployment. Comparative scheduler performance can also vary between Xen releases.
I have not fully investigated using cron/atd to change weight and cap on a schedule, either to allocate more resources to certain domUs at certain times so they get things done faster or, conversely, to reduce their impact on other domUs when they are scheduled to be I/O intensive.
~~~~~~~~~~~~~
Usage: xm sched-credit -d <Domain> [-w[=WEIGHT]|-c[=CAP]]

Get/set credit scheduler parameters.
  -d DOMAIN, --domain=DOMAIN    Domain to modify
  -w WEIGHT, --weight=WEIGHT    Weight (int)
  -c CAP, --cap=CAP             Cap (int)
~~~~~~~~~~~~~
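Going by the usage text above, setting weight and cap might look like the sketch below. The domU names "mail01" and "batch01" and the cron file name are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set on a real dom0. Note that the credit scheduler's weight and cap govern CPU time, so any effect on disk contention is indirect and would need to be measured.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# On a real Xen dom0, DRY_RUN=0 would execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Double the weight of the hypothetical domU "mail01" (the default weight is 256).
run xm sched-credit -d mail01 -w 512
# Cap the hypothetical domU "batch01" at 50% of one physical CPU (0 means no cap).
run xm sched-credit -d batch01 -c 50
# With only -d, the current weight and cap of the domain are shown.
run xm sched-credit -d mail01

# A cron-driven variant could live in a dom0 file such as /etc/cron.d/xen-weights
# (hypothetical name), lowering a weight before a nightly job and restoring it later:
#   55 3 * * * root xm sched-credit -d batch01 -w 128
#   30 5 * * * root xm sched-credit -d batch01 -w 256
```

Keeping the schedule in dom0 rather than in the guests means the guests need no knowledge of their hosting environment.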
_______________________________________________
CentOS-virt mailing list
CentOS-virt@centos.org
http://lists.centos.org/mailman/listinfo/centos-virt
On 07/27/2009 05:12 PM, Ben Montanelli wrote:
Suggested read, document date apparently Jul-18-2009: http://cseweb.ucsd.edu/~dgupta/papers/per07-3sched-xen.pdf
Thanks for that. I'll probably move most of my Xen VMs over to KVM as soon as that becomes a viable option, but this should help me get my bearings with regard to the overall scheduling and I/O topics.
Regards, Dennis
On 07/27/2009 02:15 PM, Dennis J. wrote:
Hi, What is the best way to deal with I/O load when running several VMs on a physical machine with local or remote storage?
have you looked at :
http://sourceforge.net/apps/trac/ioband/
On 07/27/2009 04:53 PM, Karanbir Singh wrote:
have you looked at :
http://sourceforge.net/apps/trac/ioband/
Yes, I've taken a look at that, but before I get to tuning on the software side I want to get a feel for the options on the hardware side and their impact.
Regards, Dennis