Hello. We've started a virtualisation project and have got stuck at one point.
Currently we are using the following:
Intel 2312WPQJR servers as compute nodes;
an Intel R2312GL4GS server as storage, with a dual-port Intel InfiniBand controller;
a Mellanox SwitchX IS5023 InfiniBand switch for the fabric.
The nodes run CentOS 6.5 with the distribution's built-in InfiniBand stack
(Linux v0002 2.6.32-431.el6.x86_64); the storage server runs CentOS 6.4,
also with the built-in drivers (Linux stor1.colocat.ru 2.6.32-279.el6.x86_64).
On the storage server an array is assembled; it appears in the system as /storage/s01 and is exported via NFS.
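On the storage side the export and the NFS/RDMA listener are set up roughly as below (the client subnet and export options here are illustrative assumptions; the portlist step follows the kernel's nfs-rdma documentation):

# /etc/exports (subnet and options assumed for illustration)
/storage/s01  192.168.1.0/24(rw,async,no_root_squash)

# register the NFS/RDMA listener on port 20049
modprobe svcrdma
echo rdma 20049 > /proc/fs/nfsd/portlist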
The nodes mount it with:
/bin/mount -t nfs \
    -o rdma,port=20049,rw,hard,timeo=600,retrans=5,async,nfsvers=3,intr \
    192.168.1.1:/storage/s01 /home/storage/sata/01
mount shows:
192.168.1.1:/storage/s01 on /home/storage/sata/01 type nfs
(rw,rdma,port=20049,hard,timeo=600,retrans=5,nfsvers=3,intr,addr=192.168.1.1)
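Before mounting, the clients load the NFS/RDMA transport module (xprtrdma, as shipped in the RHEL 6 kernel):

modprobe xprtrdma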
Then we create a virtual machine with virsh, with a virtio disk bus.
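The disk section of the domain XML looks roughly like this (the image path, driver type and cache mode are assumptions for illustration; only bus='virtio' is from our actual setup):

<disk type='file' device='disk'>
  <!-- driver type and cache mode assumed for illustration -->
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/home/storage/sata/01/win-guest.img'/>
  <target dev='vda' bus='virtio'/>
</disk>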
Everything is OK until we start a Windows guest on KVM. It may run for 2 hours
or 2 days, but under heavy load it hangs the mount (i.e. /sata/02 and /sata/03
remain accessible, but any request to /sata/01 hangs the console completely).
The only cure is a hardware reset of the node.
If we mount without rdma, everything is fine.
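That is, the same command with the RDMA transport options dropped shows no such problem:

/bin/mount -t nfs \
    -o rw,hard,timeo=600,retrans=5,async,nfsvers=3,intr \
    192.168.1.1:/storage/s01 /home/storage/sata/01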
What can we do in this case? If any debug output or other info is needed, please ask.
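For example, once the hang reproduces we can capture any of the following (the sysfs path assumes the Mellanox HCA appears as mlx4_0):

echo w > /proc/sysrq-trigger      # dump blocked-task stacks to dmesg
rpcdebug -m rpc -s trans          # enable RPC transport debugging
rpcdebug -m nfs -s all            # enable NFS client debugging
cat /sys/class/infiniband/mlx4_0/ports/1/state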
Best regards, Nikolay.