[CentOS-virt] Basic shared storage + KVM

Wed Jun 27 11:23:45 UTC 2012
Philip Durbin <philipdurbin at gmail.com>

On 06/21/2012 12:13 PM, Dennis Jacobfeuerborn wrote:
> AFAIK you cannot use Swift storage as a Nova volume backend. Also in order
> to make Swift scale you need at least a couple of nodes.

Is this true?  I haven't had a chance to dig into this, but I asked my 
OpenStack guy about this on IRC the other day:

14:48 pdurbin  westmaas: is this true? "AFAIK you cannot use Swift 
storage as a Nova volume backend" -- [CentOS-virt] Basic shared storage 
+ KVM - http://lists.centos.org/pipermail/centos-virt/2012-June/002943.html

14:51 westmaas pdurbin: hm, I'm not 100% sure on that. let me ask around.

14:52 pdurbin  westmaas: thanks. i thought the point of swift was that 
it would take away all my storage problems. :) that swift would handle 
all the scaling for me

14:54 westmaas all your object storage

14:54 westmaas not necessarily block storage

14:54 westmaas but at the same time, I can't imagine this not being a goal

14:55 pdurbin  well, i thought the vm images were abstracted away into 
objects or whatever

14:55 pdurbin  i need to do some reading, obviously

14:55 westmaas yeah, the projects aren't tied that closely together yet.

14:56 pdurbin  bummer

14:56 pdurbin  agoddard had a great internal reply to that centos-virt 
thread. about iSCSI options

14:57 pdurbin  i don't see him online but i'll have to ask if he minds 
if i copy and paste his reply back to the list

-- http://irclog.perlgeek.de/crimsonfu/2012-06-25#i_5756369

It looks like I need to dig into this documentation:

Storage: objects, blocks, and files - OpenStack Install and Deploy 
Manual - Essex - 
http://docs.openstack.org/essex/openstack-compute/install/yum/content/terminology-storage.html

If there's other stuff I should be reading, please send me links!

I'm off to the Red Hat Summit for the rest of the week, and I'll try to 
ask the OpenStack guys about this.

> You might want to take a look at ceph.com
> They offer an object store that can be attached as a block device (like
> iSCSI), but KVM also contains a driver that can talk directly to the storage.
> Then there is CephFS, which is basically a POSIX filesystem on top of the
> object store that has some neat features and would be a closer replacement
> for NFS.
>
> Another thing to look at is http://www.osrg.net/sheepdog/
> This is very similar to ceph's object storage approach.
> Some large scale benchmarks (1000 nodes) can be found here:
> http://sheepdog.taobao.org/people/zituan/sheepdog1k.html
>
> Then there is http://www.gluster.org/
> This is probably the most mature solution but I'm not sure if the
> architecture will be able to compete against the other solutions in the
> long run.

These are all good ideas and I need to spend more time reading about 
them.  Thanks.
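
Dennis's point about KVM containing a driver that talks directly to the 
storage is the RBD support in QEMU/libvirt.  As a rough sketch (the 
pool, image, and monitor names here are made up, and I haven't tested 
this), attaching an RBD image to a guest looks something like:

  <!-- rbd-disk.xml: pool/image/monitor names are made up -->
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol='rbd' name='rbd/vm01-disk'>
      <host name='ceph-mon1.example.com' port='6789'/>
    </source>
    <target dev='vdb' bus='virtio'/>
  </disk>

  # attach it to a running guest
  virsh attach-device vm01 rbd-disk.xml --persistent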

The main reason I'm writing is that "agoddard" from above gave me 
permission to copy and paste his thoughts on iSCSI and libvirt.  (He 
isn't subscribed to this mailing list, but I had forwarded him what I 
wrote.)  Here is his reply, with my own rough command-line sketches of 
each option after the quote:

"From my understanding, these are the options for iSCSI.. I'd love to 
hear about it if anyone has thoughts or alternatives :)

1) iSCSI 1 LUN per volume manually

-- provision a LUN manually for a host on the SAN, attach the LUN to 
libvirt and rock.

Pros: fast storage, reliable, multipathing, live migration should work

Cons: manually configuring the LUN when you deploy the VM (and timing 
this right with automated tasks that are expecting a disk), running out 
of LUNs on the SAN, cleaning up orphaned LUNs, etc etc.

2) iSCSI 1 LUN per volume using API

-- provision a LUN for a host on the SAN, using an API to the SAN to 
orchestrate LUN creation during VM creation, attach the LUN to libvirt 
and rock.

Pros: fast storage, reliable, multipathing, live migration should work

Cons: the SAN has to have an API, you gotta write and test a client for 
it, running out of LUNs on the SAN, API also needs to clean up orphaned 
LUNs.

3) large iSCSI LUN with LVM

-- provision a large LUN to the hosts, put LVM on it and create a 
Logical Volume for each VM disk

Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM 
snapshots & disk cloning, familiar tools (no need to write APIs)

Cons: Volume group corruption if multiple hosts modify the group at the 
same time, or if LVM metadata gets out of sync between hosts.

4) large iSCSI LUN with CLVM

-- provision a large LUN to the hosts, put LVM on it and create a 
Logical Volume for each VM disk, use CLVM (clustered LVM) to prevent 
potential issues with VG corruption

Pros: Fast disk creation, easy to delete disk when deleting VM, familiar 
tools (no need to write APIs), safeguard against corruption.

Cons: No snapshot support

5) large iSCSI LUN with LVM, with LVM operations managed by a single host

-- provision a large LUN to the hosts, put LVM on it and create a 
Logical Volume for each VM disk, hand off all LVM operations to a single 
host, or ensure only a single host is running them at a time.

Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM 
snapshots & disk cloning, familiar tools, prevent possible corruption by 
only running potentially conflicting operations on one host.

Cons: Logic for ensuring LVM operations are handed off to one host or 
otherwise not conflicting needs to be written and baked into provisioning.

6) Provision "system" drives via NFS to VMs, iSCSI LUNs for data

Use NFS for provisioning hosts, only attaching iSCSI LUNs to mount 
points that require the performance. LUNs would be mounted within the 
guest, using iSCSI.

Pros: Provisioning as easy as NFS, with qcow2, snapshotting and 
everything still there, only puts the fast storage where it's needed, 
avoids having to modify provisioning.

Cons: LUNs need to be created for data disks, through API or manually; 
multiple guests attaching to iSCSI using software iSCSI will have a 
higher overhead than host attaching; not possible (I think?) to use 
hardware HBAs to speed up iSCSI.

Best,
Ant"

I hope this is helpful.

Phil