On 06/21/2012 12:13 PM, Dennis Jacobfeuerborn wrote:
AFAIK you cannot use Swift storage as a Nova volume backend. Also in order to make Swift scale you need at least a couple of nodes.
Is this true? I haven't had a chance to dig into this, but I asked my OpenStack guy about this on IRC the other day:
14:48 pdurbin westmaas: is this true? "AFAIK you cannot use Swift storage as a Nova volume backend" -- [CentOS-virt] Basic shared storage + KVM - http://lists.centos.org/pipermail/centos-virt/2012-June/002943.html
14:51 westmaas pdurbin: hm, I'm not 100% sure on that. let me ask around.
14:52 pdurbin westmaas: thanks. i thought the point of swift was that it would take away all my storage problems. :) that swift would handle all the scaling for me
14:54 westmaas all your object storage
14:54 westmaas not necessarily block storage
14:54 westmaas but at the same time, I can't imagine this not being a goal
14:55 pdurbin well, i thought the vm images were abstracted away into objects or whatever
14:55 pdurbin i need to do some reading, obviously
14:55 westmaas yeah, the projects aren't tied that closely together yet.
14:56 pdurbin bummer
14:56 pdurbin agoddard had a great internal reply to that centos-virt thread. about iSCSI options
14:57 pdurbin i don't see him online but i'll have to ask if he minds if i copy and paste his reply back to the list
-- http://irclog.perlgeek.de/crimsonfu/2012-06-25#i_5756369
It looks like I need to dig into this documentation:
Storage: objects, blocks, and files - OpenStack Install and Deploy Manual - Essex - http://docs.openstack.org/essex/openstack-compute/install/yum/content/termin...
If there's other stuff I should be reading, please send me links!
I'm off to the Red Hat Summit the rest of the week and I'll try to ask the OpenStack guys about this.
You might want to take a look at ceph.com. They offer an object store that can be attached as a block device (like iSCSI), but KVM also contains a driver that can talk directly to the storage. Then there is CephFS, which is basically a POSIX filesystem on top of the object store; it has some neat features and would be a closer replacement for NFS.
Another thing to look at is http://www.osrg.net/sheepdog/ which takes an approach very similar to Ceph's object storage. Some large-scale benchmarks (1,000 nodes) can be found here: http://sheepdog.taobao.org/people/zituan/sheepdog1k.html
Then there is http://www.gluster.org/ which is probably the most mature solution, but I'm not sure whether its architecture will be able to compete with the other solutions in the long run.
These are all good ideas and I need to spend more time reading about them. Thanks.
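(A note to myself on the Ceph option: attaching an RBD image to a running KVM guest through the libvirt Python bindings looks roughly like the sketch below. The guest, pool, image, and monitor names are made up, libvirt needs to have been built with RBD support, and I'm ignoring cephx authentication.)

import libvirt

# Hypothetical RBD-backed disk; the pool "rbd", image "vm01-disk0", and
# monitor "ceph-mon1.example.com" are all made-up names.
disk_xml = """
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='rbd/vm01-disk0'>
    <host name='ceph-mon1.example.com' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
"""

conn = libvirt.open("qemu:///system")   # local hypervisor
dom = conn.lookupByName("vm01")         # the running guest
# Attach to the live guest and to its persistent definition.
dom.attachDeviceFlags(disk_xml,
                      libvirt.VIR_DOMAIN_AFFECT_LIVE |
                      libvirt.VIR_DOMAIN_AFFECT_CONFIG)
conn.close()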
The main reason I'm writing is that "agoddard" from above gave me permission to copy and paste his thoughts on iSCSI and libvirt. (He isn't subscribed to this mailing list, but I had forwarded what I wrote.) Here is his reply:
"From my understanding, these are the options for iSCSI.. I'd love to hear about it if anyone has thoughts or alternatives :)
1) iSCSI 1 LUN per volume manually
-- provision a LUN manually for a host on the SAN, attach the LUN to libvirt and rock.
Pros: fast storage, reliable, multipathing, live migration should work
Cons: manually configuring the LUN when you deploy the VM (and timing this right with automated tasks that are expecting a disk), running out of LUNs on the SAN, cleaning up orphaned LUNs, etc etc.
2) iSCSI 1 LUN per volume using API
-- provision a LUN for a host on the SAN, using an API to the SAN to orchestrate LUN creation during VM creation, attach the LUN to libvirt and rock.
Pros: fast storage, reliable, multipathing, live migration should work
Cons: the SAN has to have an API, you gotta write and test a client for it, running out of LUNs on the SAN, API also needs to clean up orphaned LUNs.
3) large iSCSI LUN with LVM
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk
Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM snapshots & disk cloning, familiar tools (no need to write APIs)
Cons: Volume group corruption if multiple hosts modify the group at the same time or if LVM metadata gets out of sync between hosts.
4) large iSCSI LUN with CLVM
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk, use CLVM (clustered LVM) to prevent potential issues with VG corruption
Pros: Fast disk creation, easy to delete disk when deleting VM, familiar tools (no need to write APIs), safeguard against corruption.
Cons: No snapshot support.
5) large iSCSI LUN with LVM, with LVM operations managed by a single host
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk, hand off all LVM operations to a single host, or ensure only a single host is running them at a time.
Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM snapshots & disk cloning, familiar tools, prevent possible corruption by only running potentially conflicting operations on one host.
Cons: Logic for ensuring LVM operations are handed off to one host or otherwise not conflicting needs to be written and baked into provisioning.
6) Provision "system" drives via NFS to VMs, iSCSI LUNs for data
-- use NFS for provisioning hosts, only attaching iSCSI LUNs to mount points that require the performance. LUNs would be mounted within the guest, using iSCSI.
Pros: Provisioning as easy as NFS, with qcow2, snapshotting and everything still there, only puts the fast storage where it's needed, avoids having to modify provisioning.
Cons: LUNs need to be created for data disks, through API or manually; multiple guests attaching via software iSCSI will have a higher overhead than attaching at the host; not possible (I think?) to use hardware HBAs to speed up iSCSI.
Best, Ant"
I hope this is helpful.
Phil