On 06/21/2012 12:13 PM, Dennis Jacobfeuerborn wrote:
AFAIK you cannot use Swift storage as a Nova volume backend. Also in order to make Swift scale you need at least a couple of nodes.
Is this true? I haven't had a chance to dig into this, but I asked my OpenStack guy about this on IRC the other day:
14:48 pdurbin westmaas: is this true? "AFAIK you cannot use Swift storage as a Nova volume backend" -- [CentOS-virt] Basic shared storage + KVM - http://lists.centos.org/pipermail/centos-virt/2012-June/002943.html
14:51 westmaas pdurbin: hm, I'm not 100% sure on that. let me ask around.
14:52 pdurbin westmaas: thanks. i thought the point of swift was that it would take away all my storage problems. :) that swift would handle all the scaling for me
14:54 westmaas all your object storage
14:54 westmaas not necessarily block storage
14:54 westmaas but at the same time, I can't imagine this not being a goal
14:55 pdurbin well, i thought the vm images were abstracted away into objects or whatever
14:55 pdurbin i need to do some reading, obviously
14:55 westmaas yeah, the projects aren't tied that closely together yet.
14:56 pdurbin bummer
14:56 pdurbin agoddard had a great internal reply to that centos-virt thread. about iSCSI options
14:57 pdurbin i don't see him online but i'll have to ask if he minds if i copy and paste his reply back to the list
-- http://irclog.perlgeek.de/crimsonfu/2012-06-25#i_5756369
It looks like I need to dig into this documentation:
Storage: objects, blocks, and files - OpenStack Install and Deploy Manual - Essex - http://docs.openstack.org/essex/openstack-compute/install/yum/content/termin...
If there's other stuff I should be reading, please send me links!
I'm off to the Red Hat Summit the rest of the week and I'll try to ask the OpenStack guys about this.
You might want to take a look at ceph.com. They offer an object store that can be attached as a block device (like iSCSI), but KVM also contains a driver that can talk directly to the storage. Then there is CephFS, which is basically a POSIX filesystem on top of the object store; it has some neat features and would be a closer replacement for NFS.
Another thing to look at is http://www.osrg.net/sheepdog/ which takes an approach very similar to Ceph's object storage. Some large-scale benchmarks (1,000 nodes) can be found here: http://sheepdog.taobao.org/people/zituan/sheepdog1k.html
Then there is http://www.gluster.org/ which is probably the most mature solution, but I'm not sure whether its architecture will be able to compete with the other solutions in the long run.
These are all good ideas and I need to spend more time reading about them. Thanks.
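(A note to myself on the Ceph option: attaching an RBD image to a running KVM guest through the libvirt Python bindings looks roughly like the sketch below. The guest, pool, image, and monitor names are made up, libvirt needs to have been built with RBD support, and I'm ignoring cephx authentication.)

import libvirt

# Hypothetical RBD-backed disk; the pool "rbd", image "vm01-disk0", and
# monitor "ceph-mon1.example.com" are all made-up names.
disk_xml = """
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='rbd/vm01-disk0'>
    <host name='ceph-mon1.example.com' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
"""

conn = libvirt.open("qemu:///system")   # local hypervisor
dom = conn.lookupByName("vm01")         # the running guest
# Attach to the live guest and to its persistent definition.
dom.attachDeviceFlags(disk_xml,
                      libvirt.VIR_DOMAIN_AFFECT_LIVE |
                      libvirt.VIR_DOMAIN_AFFECT_CONFIG)
conn.close()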
The main reason I'm writing is that "agoddard" from above gave me permission to copy and paste his thoughts on iSCSI and libvirt. (He isn't subscribed to this mailing list, but I had forwarded what I wrote.) Here is his reply:
"From my understanding, these are the options for iSCSI.. I'd love to hear about it if anyone has thoughts or alternatives :)
1) iSCSI 1 LUN per volume manually
-- provision a LUN manually for a host on the SAN, attach the LUN to libvirt and rock.
Pros: fast storage, reliable, multipathing, live migration should work
Cons: manually configuring the LUN when you deploy the VM (and timing this right with automated tasks that are expecting a disk), running out of LUNs on the SAN, cleaning up orphaned LUNs, etc etc.
2) iSCSI 1 LUN per volume using API
-- provision a LUN for a host on the SAN, using an API to the SAN to orchestrate LUN creation during VM creation, attach the LUN to libvirt and rock.
Pros: fast storage, reliable, multipathing, live migration should work
Cons: the SAN has to have an API, you gotta write and test a client for it, running out of LUNs on the SAN, API also needs to clean up orphaned LUNs.
3) large iSCSI LUN with LVM
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk
Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM snapshots & disk cloning, familiar tools (no need to write APIs)
Cons: Volume group corruption if multiple hosts modify the group at the same time or if LVM metadata gets out of sync between hosts.
4) large iSCSI LUN with CLVM
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk, use CLVM (clustered LVM) to prevent potential issues with VG corruption
Pros: Fast disk creation, easy to delete disk when deleting VM, familiar tools (no need to write APIs), safeguard against corruption.
Cons: No snapshot support.
5) large iSCSI LUN with LVM, with LVM operations managed by a single host
-- provision a large LUN to the hosts, put LVM on it and create a Logical Volume for each VM disk, hand off all LVM operations to a single host, or ensure only a single host is running them at a time.
Pros: Fast disk creation, easy to delete disk when deleting VM, fast LVM snapshots & disk cloning, familiar tools, prevent possible corruption by only running potentially conflicting operations on one host.
Cons: Logic for ensuring LVM operations are handed off to one host or otherwise not conflicting needs to be written and baked into provisioning.
6) Provision "system" drives via NFS to VMs, iSCSI LUNs for data
-- use NFS for provisioning hosts, only attaching iSCSI LUNs to mount points that require the performance. LUNs would be mounted within the guest, using iSCSI.
Pros: Provisioning as easy as NFS, with qcow2, snapshotting and everything still there, only puts the fast storage where it's needed, avoids having to modify provisioning.
Cons: LUNs need to be created for data disks, through API or manually; multiple guests attaching via software iSCSI will have a higher overhead than attaching at the host; not possible (I think?) to use hardware HBAs to speed up iSCSI.
Best, Ant"
I hope this is helpful.
Phil