Hi, I am trying to set up shared iSCSI storage to serve 6 KVM hypervisors running CentOS 6.2. I export an LVM volume over iSCSI and have configured virt-manager to see the iSCSI space as LVM storage (a single storage pool). I can create space on this LVM storage pool directly from virt-manager, and I am already running a couple of sample VMs that do migrate from one HV to the other.
This configuration has a problem: when I create a new LV on the LVM storage pool to host a new VM, the HV on which I am creating the virtual machine sees the LV with status "available", while the others see it as "NOT available". In some circumstances this can crash libvirtd. To fix this I generally issue:
vgchange -an; sleep 1; vgchange -ay
but sometimes this fails with the error:
device-mapper: create ioctl failed: Device or resource busy
and in any case it's not very convenient to issue this command on every node every time a new LV is created. Can anyone suggest a solution to this problem? Keep in mind that the basic idea behind this approach is to keep things as simple as possible: I don't want to configure a cluster or any other complicated tool just to be able to migrate VMs from one HV to another. Thanks,
Andrea
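A lighter-weight variant of the same workaround (a sketch only, with made-up hostnames and VG/LV names) is to rescan and activate just the newly created LV on the other nodes instead of cycling the whole volume group:

  # sketch: the hostnames and the VG/LV name below are hypothetical
  NEW_LV="vg_iscsi/vm-new-disk0"        # the LV just created from virt-manager
  for h in hv2 hv3 hv4 hv5 hv6; do      # every hypervisor except the one that created it
      ssh root@"$h" "pvscan > /dev/null; lvchange -ay $NEW_LV"
  done

This still has to touch every node, but it avoids vgchange -an, which tries to deactivate LVs that running VMs hold open and is probably what produces the "Device or resource busy" error.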
Please help me understand why you are doing it this way. I'm using Xen with integrated storage, but I've been considering separating my storage from my virtual hosts. Conceptually, we can ignore the Xen/KVM difference for this discussion. I would imagine using LVM on the storage server and then setting the LVs up as iSCSI targets. On the virtual host, I imagine I would just configure the new device and hand it to my VM.
Hi,
I am open to any suggestion. I am not really an expert on iSCSI, so I don't know the best way to implement a solution where a small group of HVs supports live migration with shared storage. This approach looked rather straightforward and is, to some extent, documented in the official Red Hat manuals. The problem is that they make no mention of this LVM issue :( Initially I tried configuring the raw iSCSI device as the storage pool, but virt-manager reported it was 100% occupied even though that was not true (in fact 0% was occupied).
Andrea
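For reference, the LVM-backed pool can also be defined from the command line on each hypervisor. This is only a sketch, assuming the iSCSI LUN shows up as /dev/sdb and the volume group is called vg_iscsi:

  # the pool name, device path and VG name below are assumptions
  virsh pool-define-as vg_iscsi logical --source-dev /dev/sdb --target /dev/vg_iscsi
  virsh pool-start vg_iscsi
  virsh pool-autostart vg_iscsi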
To allow for live migration between hypervisors, I've been using NFS for shared storage of the disk images for each of my virtual machines. Live migration works great, but I'm concerned about performance as I put more and more virtual machines on this infrastructure. The Red Hat docs warn that NFS won't scale in this situation and that iSCSI is preferred.
I'm confused about how to effectively use iSCSI with KVM, however. libvirt can create new disk images all by itself in a storage pool backed by NFS, like the one I'm using, but it cannot create new disk images in a storage pool backed by iSCSI on its own. One must manually create the LUN on the iSCSI storage each time one wants to provision a virtual machine. I like how easy it is to deploy new virtual machines on NFS; I just define the system in Cobbler and kickstart it with koan.
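As an illustration, on a directory- or NFS-backed pool a new image is one command away (the pool and volume names here are invented):

  virsh vol-create-as nfs-pool vm08-disk0.qcow2 20G --format qcow2
  # the equivalent on an iscsi-type pool fails, because libvirt can only
  # enumerate LUNs that already exist on the target; it cannot create them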
I think my solution to the problem of how to scale shared storage may be OpenStack, which promises this as a feature of Swift. Then, perhaps, I'll be able to leave NFS behind.
I'd be happy to hear more stories of how to scale shared storage while continuing to allow for live migration.
Phil
AFAIK you cannot use Swift storage as a Nova volume backend. Also in order to make Swift scale you need at least a couple of nodes.
You might want to take a look at ceph.com. They offer an object store that can be attached as a block device (like iSCSI), but KVM also contains a driver that can talk directly to the storage. Then there is CephFS, which is basically a POSIX filesystem on top of the object store; it has some neat features and would be a closer replacement for NFS.
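As a rough illustration of those two modes (the image name and sizes are invented; the map line assumes the rbd kernel module, and the last line assumes a qemu build with RBD support):

  rbd create vm09-disk0 --size 20480      # 20 GB image in the default "rbd" pool
  rbd map vm09-disk0                      # kernel RBD driver exposes it as /dev/rbd0
  qemu-kvm -m 1024 -drive format=raw,file=rbd:rbd/vm09-disk0    # or let KVM talk to the cluster directly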
Another thing to look at is http://www.osrg.net/sheepdog/. This is very similar to Ceph's object storage approach. Some large-scale benchmarks (1,000 nodes) can be found here: http://sheepdog.taobao.org/people/zituan/sheepdog1k.html
Then there is http://www.gluster.org/. This is probably the most mature solution, but I'm not sure the architecture will be able to compete with the other solutions in the long run.
Regards, Dennis
On 06/21/2012 12:13 PM, Dennis Jacobfeuerborn wrote:
AFAIK you cannot use Swift storage as a Nova volume backend. Also in order to make Swift scale you need at least a couple of nodes.
Is this true? I haven't had a chance to dig into this, but I asked my OpenStack guy about this on IRC the other day:
14:48 pdurbin westmaas: is this true? "AFAIK you cannot use Swift storage as a Nova volume backend" -- [CentOS-virt] Basic shared storage + KVM - http://lists.centos.org/pipermail/centos-virt/2012-June/002943.html
14:51 westmaas pdurbin: hm, I'm not 100% sure on that. let me ask around.
14:52 pdurbin westmaas: thanks. i thought the point of swift was that it would take away all my storage problems. :) that swift would handle all the scaling for me
14:54 westmaas all your object storage
14:54 westmaas not necessarily block storage
14:54 westmaas but at the same time, I can't imagine this not being a goal
14:55 pdurbin well, i thought the vm images were abstracted away into objects or whatever
14:55 pdurbin i need to do some reading, obviously
14:55 westmaas yeah, the projects aren't tied that closely together yet.
14:56 pdurbin bummer
14:56 pdurbin agoddard had a great internal reply to that centos-virt thread. about iSCSI options
14:57 pdurbin i don't see him online but i'll have to ask if he minds if i copy and paste his reply back to the list
-- http://irclog.perlgeek.de/crimsonfu/2012-06-25#i_5756369
It looks like I need to dig into this documentation:
Storage: objects, blocks, and files - OpenStack Install and Deploy Manual - Essex - http://docs.openstack.org/essex/openstack-compute/install/yum/content/termin...
If there's other stuff I should be reading, please send me links!
I'm off to the Red Hat Summit the rest of the week and I'll try to ask the OpenStack guys about this.
Ceph, Sheepdog, and Gluster all sound like good ideas, and I need to spend more time reading about them. Thanks.
The main reason I'm writing is that "agoddard" from above gave me permission to copy and paste his thoughts on iSCSI and libvirt. (He isn't subscribed to this mailing list, but I had forwarded what I wrote.) Here is his reply:
"From my understanding, these are the options for iSCSI.. I'd love to hear about it if anyone has thoughts or alternatives :)
1) iSCSI 1 LUN per volume manually
-- provision a LUN manually for a host on the SAN, attach the LUN to libvirt and rock.
Pros: fast storage, reliable, multipathing, live migration should work
Cons: manually configuring the LUN when you deploy the VM (and timing this right with automated tasks that are expecting a disk), running out of LUNs on the SAN, cleaning up orphaned LUNs, etc etc.
2) iSCSI 1 LUN per volume using API
-- provision a LUN for a host on the SAN, using an API to the SAN to orchestrate LUN creation during VM creation, attach the LUN to libvirt and rock.
Pros: fast storage, reliable, multipathing, live migration should work
Cons: the SAN has to have an API, you gotta write and test a client for it, running out of LUNs on the SAN, API also needs to clean up orphaned LUNs.
3) large iSCSI LUN with LVM
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk
Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM snapshots & disk cloning, familiar tools (no need to write APIs)
Cons: Volume group corruption if multiple hosts modify the group at the same time, or LVM metadata is out of sync between hosts.
4) large iSCSI LUN with CLVM
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk, use CLVM (clustered LVM) to prevent potential issues with VG corruption
Pros: Fast disk creation, easy to delete disk when deleting VM, familiar tools (no need to write APIs), safeguard against corruption.
Cons: No snapshot support
5) large iSCSI LUN with LVM, with LVM operations managed by a single host
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk, hand off all LVM operations to a single host, or ensure only a single host is running them at a time.
Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM snapshots & disk cloning, familiar tools, prevent possible corruption by only running potentially conflicting operations on one host.
Cons: Logic to ensure LVM operations are handed off to one host, or otherwise don't conflict, needs to be written and baked into provisioning (see the sketch after this list).
6) Provision "system" drives via NFS to VMs, iSCSI LUNs for data
-- use NFS for provisioning hosts, attaching iSCSI LUNs only to mount points that require the performance. The LUNs would be mounted within the guest using software iSCSI.
Pros: Provisioning as easy as NFS, with qcow2, snapshotting and everything still there, only puts the fast storage where it's needed, avoids having to modify provisioning.
Cons: LUNs need to be created for data disks, through an API or manually; multiple guests attaching via software iSCSI will have higher overhead than attaching at the host; and it's not possible (I think?) to use hardware HBAs to speed up iSCSI.
Best, Ant"
I hope this is helpful.
Phil