Nick,
What are you planning on running over the shared connection? Database, eMail, File Shares? How many users? How much data? What is your I/O profile?
I've worked with 'enterprise' storage most of my career, either as a consumer, adviser or provider. I can't comment on AoE, other than to suggest you look at what the business & technical goals are, how the products solve them, and what your risk profile is against the business need of the system. (I'm currently working as an adviser.)
When you get a chance to chase into why "enterprise support" of new disks lags, it usually comes down to insufficient time for testing, or known problems (heat, torque, variability in samples, failure rate & edge-case incompatibility with previously certified products are normal).
You have hinted with the concern over TCP offload that you may have higher-end performance needs and that this system carries a high business value and needs a lower risk solution.
Remember risk is a cost.
regards Dave www.hornfordassociates.com
Nick Bryant wrote (Subject: [CentOS] ATA-over-Ethernet v's iSCSI):
Just wondering if anyone out there has any opinions, or better still experience, with Coraid's AoE products in a CentOS environment?
We're looking at developing a very basic active/standby 2 node cluster which will need a shared storage component. It would be a huge bonus if other servers could also use the storage device. Originally I was looking at the Dell/EMC iSCSI solution as it's a cheaper solution than fibre channel. However, the performance issues without using a TCP offload HBA are a bit of a concern.
Then I found the Coraid (www.coraid.com) products based on the open standard AoE protocol. It's got a number of benefits including: price, less protocol overhead for the server, and the ability to use any disks, whereas "enterprise" approved products from the likes of Dell/Sun etc only support 250GB SATA disks at the moment.
I guess my concern is that it's a new technology that's not been widely adopted so far and all the issues that go along with that.
Any options or feedback would be really helpful.
Cheers,
Nick
Nick Bryant wrote:
Just wondering if anyone out there has any opinions or better still experience with Coraid's AoE products in a centos environment?
I have been bombarded with CORAID marketing in various Linux groups over the last few months. This is because CORAID smartly hit the Linux tradeshow circuit, which is definitely the best way to get free "word of mouth" advertising.
I have seen their products. I have read their technical specifications. And I have come to one conclusion.
AoE is vendor marketing. It is NOT a SAN of any sort. It relies 100% on host-side (e.g., GFS) coherency. Or you merely use it as dedicated storage (even if you slice it up for different servers).
- The "efficiency" argument
Sure, it has less overhead than iSCSI. That means if you put in a cheap, $50 GbE card or use the on-motherboard GbE NIC, you're going to get better performance.
But anyone who is serious about a SAN puts in a $500 GbE iSCSI HBA, which is very affordable now.
- The "feature" reality
AoE has virtually _no_ features. It's "dumb storage" and that's that. iSCSI is intelligent. How intelligent your target acts is up to you (and the cost). It's multi-vendor and there are complete standards, from host to target -- especially intelligent targets. AoE is just a simplistic block interface that relies on 100% software coherency.
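For illustration, on a Linux host that "dumb storage" really is just another block device. A minimal sketch, assuming the aoe block driver (in the mainline kernel since roughly 2.6.11, or Coraid's own module on older kernels) plus the aoetools package; the shelf/slot numbers and mount point are made up:

    modprobe aoe                  # load the AoE block driver
    aoe-discover                  # probe the local Ethernet segment for shelves
    aoe-stat                      # list what was found, e.g. e0.0 = shelf 0, slot 0
    mkfs.ext3 /dev/etherd/e0.0    # from here on it behaves like ordinary local disk
    mount /dev/etherd/e0.0 /mnt/aoe

Any coherency between hosts has to happen above that layer (GFS or similar) -- the shelf itself adds none.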
We're looking at developing a very basic active/standby 2 node cluster which will need a shared storage component.
AoE is _not_ it then. AoE does _not_ allow shared storage. You must slice the array for each system so they are _independent_. It is _not_ a SAN. It does not have multi-targetting features, only segmented target capability.
They talk GFS. It's GFS in the worst absolute setup -- 100% software, 0% intelligent target. ;->
It would be a huge bonus if other servers could also use the storage device.
Now it can do that. You just slice your array for whatever servers.
Originally I was looking at the Dell/EMC iSCSI solution as it's a cheaper solution than fibre channel.
Have you considered Serial Attached SCSI (SAS)? Most people haven't heard of it, so be sure to read my blog from 3 months back: http://thebs413.blogspot.com/2005/08/serial-storage-is-future.html
12-24 Gbps (1.2-2.4 GBps after 8b/10b encoding -- rough arithmetic below) using a 4 or 8 channel trunk, which is what external target solutions are using. SAS is SCSI-2 over serial (physically basically the same as SATA, with twisted pair, which SATA-IO also requires). It's very scalable and flexible and _damn_fast_. In a nutshell, SAS can use ...
- Internal SATA drives
- Internal SAS drives
- External SAS drives
- External SAS enclosures (with SAS drives)
- External SAS subsystems (with SATA or SAS drives)
- External SAS hubs (intelligent multi-targetting)
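For the skeptical, the back-of-envelope behind those numbers (assuming first-generation 3 Gbps SAS lanes):

    3 Gbps/lane x 4 lanes = 12 Gbps on the wire
    8b/10b encoding carries 8 data bits per 10 line bits: 12 Gbps x 8/10 = 9.6 Gbps ~= 1.2 GBps
    an 8-lane wide port doubles that to roughly 2.4 GBps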
So what's the catch of SAS? Same as SCSI:
1. Few people go for the multi-target options
2. Shorter distance (8m ~ 25')
#2 is not an issue if you're providing storage for the closet. That was always the bonus with multi-target, intelligent target, SCSI, before higher-speed FC-AL was available.
#1 is where SAS is still "getting off the ground." It leverages existing SCSI-2, so the multi-targetting is there. But the vendor products are still coming out.
So the verdict is still out on SAS as a SAN solution merely because of the limited products available right now. But it's definitely far more affordable than FC-AL, leverages everything learned with multi-target SCSI-2 and is a heck of a lot better for the closet than iSCSI (where distance isn't a factor).
However, the performance issues without using a TCP offload HBA are a bit of a concern.
If you go iSCSI, you _must_ go with an HBA. Heck, I would argue that you would probably want an HBA even for layer-2 Ethernet (AoE), although the layer-3/4 (TCP/IP) traffic is far worse.
But HBAs start at $500 these days. That's chump change. There are some outstanding iSCSI HBAs under $1,000, so that shouldn't deter you.
An intelligent, multi-targetted subsystem is where the cost is going to be. And that's where multi-target SAS devices, once they become more commonplace, should be significantly cheaper than iSCSI (before figuring disk cost).
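To be fair, the no-HBA software route is trivial to set up -- the catch is simply that all the TCP/IP and iSCSI processing lands on the host CPUs, which is exactly the overhead Nick is worried about. A rough sketch from memory of the software initiator that shipped around CentOS/RHEL 4 (the target address is made up):

    yum install iscsi-initiator-utils
    echo "DiscoveryAddress=192.168.1.20" >> /etc/iscsi.conf   # IP of the iSCSI target
    service iscsi start
    chkconfig iscsi on
    fdisk -l    # the exported LUNs then show up as ordinary /dev/sd* block devices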
Then I found the Coraid (www.coraid.com) products based on the open standard AoE protocol.
It's _their_ standard. And it's _empty_. It has _no_ SAN-level code. There is _no_ multi-targeting logic. You have to slice the array -- only 1 connection per end-user volume.
It's got a number of benefits including: price,
So does SAS, and it uses the proven SCSI-2 protocol for multi-targetting.
less protocol overhead for the server
SCSI-2 is better than layer-2 Ethernet, let alone designed for storage. ;->
and the ability to use any disks whereas "enterprise" approved products from the likes of Dell/Sun etc only support 250GB SATA disks at the moment.
Not true! There are 400 and 500GB near-line 24x7 disks from Seagate and Hitachi, respectively.
[ Sounds like someone fed you marketing. ;-]
In fact, one of Hitachi's big partners is Copan Systems, and they have the 500GB drives in their VTL solution.
I guess my concern is that it's a new technology that's not been widely adopted so far and all the issues that go along with that.
That's the _least_ of your concerns. AoE has _nothing_ in it from a SAN perspective.
I sure wish CORAID was more forthcoming on that.
At the same time, whenever I mention multi-targetting the same volume, it finally shuts up the marketeers. It's all hype, 0 SAN substance.
Dave Hornford OSD@HornfordAssociates.com wrote:
What are you planning on running over the shared connection? Database, eMail, File Shares? How many users? How much data? What is your I/O profile?
Agreed. If you can afford the latency and lower DTR, iSCSI will do. If you need maximum performance, really investigate SAS.
I've worked with 'enterprise' storage most of my career either as a consumer, adviser or provider - can't comment on AoE other than to suggest you look at what are the business & technical goals, how they solve it and what is your risk profile against the business need of the system. (I'm currently working as an adviser)
AoE is _not_ a SAN solution. That one statement alone removes it from consideration.
At the same time, I haven't deployed a multi-targetted SAS solution yet. The boards are out, the drives are out, the enclosures are out, and some subsystem products exist. Just because it leverages existing SCSI-2 multi-targetting doesn't mean someone has a well-designed, well-respected, intelligent SAS multi-targettable/sharable solution yet.
When you get a chance to chase into why "enterprise support" of new disks lags, it usually comes down to insufficient time for testing, or known problems (heat, torque, variability in samples, failure rate & edge-case incompatibility with previously certified products are normal).
AoE is designed for 1 thing -- centralized, segmented storage. It does that well. However, it is _not_ a SAN standard. It does _not_ support multi-targetting of the same volume.
So that means it's no different than if you had local storage using 100% software (e.g., via GFS) to synchronize. It is "dumb".
You have hinted with the concern over TCP offload that you may have higher-end performance needs and that this system carries a high business value and needs a lower risk solution.
Agreed. If you only need storage in the same closet (same 25'), see what intelligent, multi-target SAS solutions are available.
Remember risk is a cost.
Hmmm, I don't know why I haven't formulated that statement before. I talk at length with clients about "risk analysis," yet that simple statement says it all.
Nick Bryant wrote:
Just wondering if anyone out there has any opinions or better still experience with Coraid's AoE products in a centos environment?
I have been bombarded with CORAID marketing in various Linux groups over the last few months. This is because CORAID smartly hit the Linux tradeshow circuit, which is definitely the best way to get free "word of mouth" advertising.
Indeed - that's how I found out about them.
I have seen their products. I have read their technical specifications. And I have come to one conclusion.
AoE is vendor marketing. It is NOT SAN of any sort. It relies 100% on server-wide (e.g., GFS) coherency. Or you merely use it as dedicated storage (even if you slice for different servers).
- The "efficiency" argument
Sure, it has less overhead than iSCSI. That means if you put in a cheap, $50 GbE card or use the on-motherboard GbE NIC, you're going to get better performance.
But anyone who is serious about a SAN puts in a $500 GbE iSCSI HBA, which is very affordable now.
It's not *just* the cost of the HBA though - the storage device itself is quite a bit more expensive.
- The "feature" reality
AoE has virtually _no_ features. It's "dumb storage" and that's that. iSCSI is intelligent. How intelligent your target acts is up to you (and the cost). It's multi-vendor and there are complete standards, from host to target -- especially intelligent targets. AoE is just a simplistic block interface that relies on 100% software coherency.
We're looking at developing a very basic active/standby 2 node cluster which will need a shared storage component.
AoE is _not_ it then. AoE does _not_ allow shared storage. You must slice the array for each system so they are _independent_. It is _not_ a SAN. It does not have multi-targetting features, only segmented target capability.
Ok, forgive me, my knowledge of SANs isn't great, but I thought that if you were using a SAN that represents itself as a block device (a real SAN), only one machine could have true read/write access to the "slice"? That was unless you used a file system like GFS (I wasn't intending to). When I said shared storage I didn't mean it had to be accessed at the same time from all hosts. The RHEL cluster suite in an active/standby setup actually mounts the partitions as a host changes from standby to active, after it's sure the old active host no longer has access, via a "lights out" OoB setup.
Well that was my understanding of how it worked anyhow?
They talk GFS. It's GFS in the worst absolute setup -- 100% software, 0% intelligent target. ;->
It would be a huge bonus if other servers could also use the storage device.
Now it can do that. You just slice your array for whatever servers.
Originally I was looking at the Dell/EMC iSCSI solution as it's a cheaper solution than fibre channel.
Have you considered Serial Attached SCSI (SAS)? Most people haven't heard of it, so be sure to read my blog from 3 months back: http://thebs413.blogspot.com/2005/08/serial-storage-is-future.html
I have not but I'll be sure to check it out now... my worry is that out of the vendors I'm talking to (Acer, Dell, HP and Sun) no one offered it up.
12-24 Gbps (1.2-2.4 GBps after 8b/10b encoding) using a 4 or 8 channel trunk, which is what external target solutions are using. SAS is SCSI-2 over serial (physically basically the same as SATA, with twisted pair, which SATA-IO also requires). It's very scalable and flexible and _damn_fast_. In a nutshell, SAS can use ...
- Internal SATA drives
- Internal SAS drives
- External SAS drives
- External SAS enclosures (with SAS drives)
- External SAS subsystems (with SATA or SAS drives)
- External SAS hubs (intelligent multi-targetting)
So what's the catch of SAS? Same as SCSI:
- Few people go for the multi-target options
- Shorter distance (8m ~ 25')
The distance does limit the flexibility but can be worked around.
#2 is not an issue if you're providing storage for the closet. That was always the bonus with multi-target, intelligent target, SCSI, before higher-speed FC-AL was available.
#1 is where SAS is still "getting off the ground." It leverages existing SCSI-2, so the multi-targetting is there. But the vendor products are still coming out.
So the verdict is still out on SAS as a SAN solution merely because of the limited products available right now. But it's definitely far more affordable than FC-AL, leverages everything learned with multi-target SCSI-2 and is a heck of a lot better for the closet than iSCSI (where distance isn't a factor).
Definitely one to look at in the future then.
However, the performance issues without using a TCP offload HBA are a bit of a concern.
If you go iSCSI, you _must_ go with a HBA. Heck, I would argue that you would probably want a HBA for layer-2 Ethernet too, although the layer-3/4 traffic is far worse.
Good to know.
But HBAs start at $500 these days. That's chump change. There are some outstanding iSCSI HBAs under $1,000, so that shouldn't deter you.
That's the problem, this "chump" is paying for it himself... sadly I no longer have bank/telco budgets to play with :( But still, 500USD isn't really bad.
An intelligent, multi-targetted subsystem is where the cost is going to be. And that's where multi-target SAS devices, once they become more commonplace, should be significantly cheaper than iSCSI (before figuring disk cost).
Then I found the Coraid (www.coraid.com) products based on the open standard AoE protocol.
It's _their_ standard. And it's _empty_. It has _no_ SAN-level code. There is _no_ multi-targeting logic. You have to slice the array -- only 1 connection per end-user volume.
I know they made the standard but it was my understanding it was open now? Whether or not other vendors will use it remains to be seen though.
It's got a number of benefits including: price,
So does SAS, and it uses the proven SCSI-2 protocol for multi-targetting.
less protocol overhead for the server
SCSI-2 is better than layer-2 Ethernet, let alone designed for storage. ;->
and the ability to use any disks whereas "enterprise" approved products from the likes of Dell/Sun etc only support 250GB SATA disks at the moment.
Not true! There are 400 and 500GB near-line 24x7 disks from Seagate and Hitachi, respectively.
[ Sounds like someone fed you marketing. ;-]
I'm aware they exist, but go and try to buy a product from Dell with anything larger than a 250GB SATA disk in it. Good luck ;) If you ask, you'll be told that the larger disks haven't yet been approved in the enterprise-type systems, but I imagine part of it is that they don't want to cannibalise part of their SCSI market by offering products with a *much* lower cost per GB, well not yet anyhow.
In fact, one of Hitachi's big partners is Copan Systems, and they have the 500GB drives in their VTL solution.
I guess my concern is that it's a new technology that's not been widely adopted so far and all the issues that go along with that.
That's the _least_ of your concerns. AoE has _nothing_ in it from a SAN perspective.
I sure wish CORAID was more forthcoming on that.
At the same time, whenever I mention multi-targetting the same volume, it finally shuts up the marketeers. It's all hype, 0 SAN substance.
Good to know - thanks.
Dave Hornford OSD@HornfordAssociates.com wrote:
What are you planning on running over the shared connection? Database, eMail, File Shares? How many users? How much data? What is your I/O profile?
Mainly file shares with some imaging and D2D server backup. We have about 70,000 users. Up to 2TB right now... but the ability to grow it in the future would be great. The I/O profile is low in terms of throughput and the data isn't massively urgent. However, the CPUs on the servers are more important and I don't want to load them too much in the process (hence my fear of doing iSCSI without an HBA).
Agreed. If you can afford the latency and lower DTR, iSCSI will do. If you need maximum performance, really investigate SAS.
I've worked with 'enterprise' storage most of my career either as a consumer, adviser or provider - can't comment on AoE other than to suggest you look at what are the business & technical goals, how they solve it and what is your risk profile against the business need of the system. (I'm currently working as an adviser)
AoE is _not_ a SAN solution. That one statement alone removes it from consideration.
I'm hearing that :) glad I asked.
At the same time, I haven't deployed a multi-targetted SAS solution yet. The boards are out, the drives are out, the enclosures are out, and some subsystem products exist. Just because it leverages existing SCSI-2 multi-targetting doesn't mean someone has a well-designed, well-respected, intelligent SAS multi-targettable/sharable solution yet.
When you get a chance to chase into why "enterprise support" of new disks lags, it usually comes down to insufficient time for testing, or known problems (heat, torque, variability in samples, failure rate & edge-case incompatibility with previously certified products are normal).
AoE is designed for 1 thing -- centralized, segmented storage. It does that well. However, it is _not_ a SAN standard. It does _not_ support multi-targetting of the same volume.
Again, excuse my ignorance: does multi-targeting mean that two systems can share a volume (r/w) without the use of GFS? According to the Red Hat cluster papers this isn't possible.
So that means it's no different than if you had local storage using 100% software (e.g., via GFS) to synchronize. It is "dumb".
You have hinted with the concern over TCP offload that you may have higher-end performance needs and that this system carries a high business value and needs a lower risk solution.
Indeed the system has a high business value. The servers that the system runs on have a high business value (hence the cluster). However, performance (in terms of I/O) isn't a massive issue. I was more concerned about what the performance impact on the servers would be.
Agreed. If you only need storage in the same closet (same 25'), see what intelligent, multi-target SAS solutions are available.
Remember risk is a cost.
Yes, it's just a very easy one not to spend :) Especially when it's your own dollars.
Many thanks for the feedback.
Nick
On Tue, 2005-11-08 at 15:16 +1100, Nick Bryant wrote:
Indeed - that's how I found out about them.
Indeed - that's how everyone is. ;->
It's not *just* the cost of the HBA though - the storage device itself is quite a bit more expensive.
Exactly! The HBA is the _least_ of your concerns.
It's the multi-target intelligence that allows you to address the same space from 2 different hosts. CORAID doesn't allow that. Some cheaper SCSI, SAS and iSCSI products don't either. You get what you pay for, and if you want decent performance thanks to hardware multi-targetting of the same storage from multiple hosts, you need such a device.
Ok forgive me for my knowledge of SANs isn't great, but I thought if you were using a SAN that represents itself as a block device (a real SAN) that only one machine could have true read/write access to the "slice"?
Yes and no. Yes, you can have a SAN that works that way. But no, you can also have SANs that handle targetting of the same storage area. That requires _both_ intelligence on the target SCSI, SAS, iSCSI or FC/FC-AL device, as well as software on the hosts that are aware of it.
That was unless you used a file system like GFS (I wasn't intending too).
GFS in 100% software-host controlled mode can synchronize non-unified storage. It's slow and pitiful. I've only tried GFS in this mode -- although I think it can use shared/unified space between two hosts (never tried it myself).
You do not need GFS to use shared/unified space between two hosts. You merely need a way for those hosts to access and share the unified space in a way that is coherent -- i.e., one mounts read/write, while the other mounts read-only and the changes made by the first are synchronized. Red Hat has been doing this since well before GFS -- sharing out NFS and Samba in a fail-over configuration between two hosts to the same space. That was the work Red Hat gained from their Mission Critical Linux acquisition.
That's what I've used in the past for my clustered file services.
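In that model the takeover, stripped of the cluster manager's ceremony, boils down to something like the following (node names, device and IP address are made up for illustration):

    # on the surviving node, after the failed node has been fenced
    # (power switch / OoB card) so it can no longer write:
    mount /dev/sdb1 /export/data                             # the shared ext3 volume
    exportfs -a                                              # re-export the NFS shares
    ifconfig eth0:0 192.168.1.50 netmask 255.255.255.0 up    # grab the floating service IP
    service smb restart                                      # and Samba, if shared that way too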
When I said shared storage I didn't mean it had to be accessed at the same time from all hosts. The RHEL cluster suite in an active/standby setup actually mounts the partitions as a host changes from standby to active after its sure the active host hasn't got access anymore with a "lights out" OoB setup. Well that was my understanding of how it worked anyhow?
Yes, but you're missing a key point. The system designated for failover is still mounting the volume -- even if by standby. Think of it as a "read-only" mount that tracks changes done by the other system with the "read/write" (I know this is a mega-oversimplification). And when it does fail-over, it still has to be allowed to mount it when the other system may be in a state that the target device believes is still accessing it.
CORAID will _refuse_ to allow anything to access the volume after one system mounts it. It is not multi-targettable. SCSI-2, iSCSI and FC/FC-AL are. AoE is not.
I have not but I'll be sure to check it out now... my worry is that out of the vendors I'm talking to (Acer, Dell, HP and Sun) no one offered it up.
Ask where their SAS capabilities are at. In a few cases, some FC-AL product lines are also adding SAS. But since SAS is still young in the multi-targetted area (even though it leverages existing SCSI-2), it wouldn't surprise me if few products are out -- let alone companies are not offering them up.
The distance does limit the flexibility but can be worked around.
The idea behind multi-targeted SAS is "storage in the same closet." If that limitation is an issue, then you don't want SAS.
Definitely one to look at in the future then.
Yep, SAS is one of the most unknown technologies right now. People either think it's some SATA technology that's not useful for the enterprise, or they think SCSI is dead. Quite the opposite, it's superior to SATA, while being compatible with it, and offers the full SCSI-2 command set. SAS drives roll off the same lines as U320 SCSI drives at most HD fabs.
That's the problem, this "chump" is paying for it himself... sadly I know longer have bank/telco budgets to play with :( But still 500USD isn't really bad.
Well, you're probably better off, price-wise, with internal storage and using 100% GFS.
I know they made the standard but it was my understanding it was open now?
It's always been open. It's their standard. And it's rather limited.
Whether or not other vendors will use it remains to be seen though.
Doesn't matter. I look at the features of the protocol. It doesn't support multi-targeting.
I'm aware they exist but go and try and product from dell with anything larger than a 250gb sata disk in it. Good luck ;)
Hitachi and Seagate started selling their 500GB and 400GB, respectively, 24x7 rated disks just a few months ago. As far as Dell, they are often working with Maxtor, who doesn't sell many 24x7 rated disks. In fact, I wouldn't be surprised if Dell is their only major partner.
If you ask you'll be told that the larger disks haven't yet been approved in the enterprise type systems yet,
Again, Dell is using Maxtor. Maxtor's focus has always been on commodity price. They don't have a heavy interest in 24x7 versions for a premium.
Hitachi and Seagate do.
but I imagine part of it will be they don't want to cannibalise part of their SCSI market by offering products with a *much* lower cost per GB, well not yet anyhow.
Has nothing to do with it. Even 24x7 commodity drives are still not as reliable as enterprise drives.
In case you didn't know, enterprise drives with low vibration and high tolerances come in capacities of 18, 36, 73 and 146GB. Commodity drives, which have vibration 3-10x worse and far lower tolerances, come in capacities up to 500GB.
The 24x7 commodity drives are either those drives that test to higher tolerances, or are manufactured with improved components, but still not the same as enterprise capacities/reliability.
For more on Enterprise v. Commodity v. 24x7 Commodity drives, see: http://www.samag.com/documents/s=9841/sam0509a/0509a_s1.htm http://www.samag.com/documents/s=9841/sam0509a/0509a_t2.htm
On Nov 8, 2005, at 5:19 AM, Bryan J. Smith wrote:
On Tue, 2005-11-08 at 15:16 +1100, Nick Bryant wrote:
When I said shared storage I didn't mean it had to be accessed at the same time from all hosts. The RHEL cluster suite in an active/standby setup actually mounts the partitions as a host changes from standby to active after its sure the active host hasn't got access anymore with a "lights out" OoB setup. Well that was my understanding of how it worked anyhow?
Yes, but you're missing a key point. The system designated for failover is still mounting the volume -- even if by standby. Think of it as a "read-only" mount that tracks changes done by the other system with the "read/write" (I know this is a mega-oversimplification). And when it does fail-over, it still has to be allowed to mount it when the other system may be in a state that the target device believes is still accessing it.
Just one additional point. RH cluster suite requires (at least it did in RHEL3 clustering) two shared raw partitions to store state information and (I believe) some heartbeating.
This blows AoE out for anything with the cluster suite.
RedHat Clustering (again, in 3) does not mount the volume on both nodes simultaneously for normal services, however. When a node fails over, it is forcibly unmounted on one system and remounted on the other. This means you can use ext3 with the cluster suite. But the point above still leads you to need a block-level device.
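For reference, on RHEL3 those raw partitions were typically bound up something like this (a sketch from memory; the partitions are small, on the order of 10MB each, and the device names are made up):

    # /etc/sysconfig/rawdevices
    /dev/raw/raw1  /dev/sdb1
    /dev/raw/raw2  /dev/sdb2

    service rawdevices restart    # then point the cluster configuration at raw1/raw2

Both nodes have to reach those partitions at the block level, which is why a slice visible to only one host at a time doesn't cut it.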
Tarun
Tarun Reddy treddy@rallydev.com wrote:
This blows AoE out for anything with the cluster suite.
All AoE implementations I've seen are relying on GFS and a lot of inter-host negotiation. But there are still fail-over concerns.
No one is running Oracle, DFS and other applications on it directly. They are always using GFS between them for AoE. It's not the most ideal setup.
There's a massive difference between using GFS and not. ;->
On Nov 8, 2005, at 5:19 AM, Bryan J. Smith wrote:
On Tue, 2005-11-08 at 15:16 +1100, Nick Bryant wrote:
When I said shared storage I didn't mean it had to be accessed at the same time from all hosts. The RHEL cluster suite in an active/standby setup actually mounts the partitions as a host changes from standby to active after its sure the active host hasn't got access anymore with a "lights out" OoB setup. Well that was my understanding of how it worked anyhow?
Yes, but you're missing a key point. The system designated for failover is still mounting the volume -- even if by standby. Think of it as a "read-only" mount that tracks changes done by the other system with the "read/write" (I know this is a mega-oversimplification). And when it does fail-over, it still has to be allowed to mount it when the other system may be in a state that the target device believes is still accessing it.
Just one additional point. RH cluster suite requires (at least it did in RHEL3 clustering) two shared raw partitions to store state information and (I believe) some heartbeating.
This blows AoE out for anything with the cluster suite.
RedHat Clustering (again, in 3) does not mount the volume on both nodes simultaneously for normal services, however. When a node fails over, it is forcibly unmounted on one system and remounted on the other. This means you can use ext3 with the cluster suite. But the point above still leads you to need a block-level device.
Of course, the quorum partition. Oh well that answers that one....
It looks like a low cost angle isn't going to happen for now. Bal*s.
Thanks for all the feedback.
Quoting Nick Bryant list@everywhereinternet.com:
AoE is _not_ it then. AoE does _not_ allow shared storage. You must slice the array for each system so they are _independent_. It is _not_ a SAN. It does not have multi-targetting features, only segmented target capability.
Ok forgive me for my knowledge of SANs isn't great, but I thought if you were using a SAN that represents itself as a block device (a real SAN) that only one machine could have true read/write access to the "slice"? That was unless you used a file system like GFS (I wasn't intending too). When I said shared storage I didn't mean it had to be accessed at the same time from all hosts. The RHEL cluster suite in an active/standby setup actually mounts the partitions as a host changes from standby to active after its sure the active host hasn't got access anymore with a "lights out" OoB setup.
Well that was my understanding of how it worked anyhow?
Exactly, that is what got me confused too. A SAN doesn't provide "safe" concurrent access to the device by itself; you need to have a cluster-aware file system running on top of it. With a SAN, one would always configure zones (on the switch) and/or LUN masking (on the storage device) to prevent clients fighting over the storage and corrupting data.
NAS offers safe concurrent access (generally; there might be some NAS devices out there that do not). A NAS device will manage the file system internally and export it over NFS or SMB protocols to the clients. It's going to be slower and less efficient than a SAN device though (because of the upper protocol overhead), and the set of features offered by the file system might not be what would be available if the file system was managed by the client's operating system itself.
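That concurrent-access point is easy to see with plain NFS -- the server owns the file system and arbitrates, so many clients can mount it read/write at once. A trivial sketch (path, hostname and subnet are made up):

    # on the NAS / filer (or any Linux box standing in for one):
    echo "/export/data 192.168.1.0/24(rw,sync)" >> /etc/exports
    exportfs -ra
    service nfs start

    # on each client:
    mount -t nfs filer:/export/data /mnt/data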
Aleksandar Milivojevic alex@milivojevic.org wrote:
Exactly, that is what got me confused too. A SAN doesn't provide "safe" concurrent access to the device by itself;
Agreed. But it does provide the ability for multiple hosts to target the same space, and to handle _some_ of that coherency on the storage end. That doesn't replace what needs to go on the host end, but it can work in conjunction with it.
According to those I spoke to at CORAID, you could not have 2 systems accessing the same space. If you try to access space while it believes another is accessing it (such as a failed node), it won't work.
If this has now changed, please let me know. But the last time I discussed this, they did not implement certain features that you will find in multi-targettable SCSI, iSCSI and FC solutions.
They expect 100% host-side resolution of everything. E.g., there is a reason why SCSI-2 (SCSI/SAS), TCP (iSCSI), etc... are "safer" than Ethernet -- there are acknowledgements. With CORAID's solution, the hosts have to do extra checking to confirm buffers have been written, etc..., and it's not exactly fool-proof -- unlike SCSI-2, TCP, etc...
AoE does not address many of these issues from what I read just a few months ago. Things that SCSI-2 and TCP do!
you need to have cluster-aware file system running on top of it.
Of course. I never argued otherwise. I merely stated that the more you can address at the target, the less the host and the more efficient, higher-performance and "safer" the clustering can be.
From all the _lack_ of features in AoE, it doesn't leave me with a warm'n'fuzzy feeling. Every single rep I spoke to basically said to consider AoE little better than Oracle's FireWire hack. They recommended I _never_ have 2 systems use the same area, not even in a cluster setup, if I wanted SCSI/iSCSI/FC-like switchover.
With SAN, one would always configure zones (on the switch) and/or LUN masking (on storage device) to prevent clients fighting for the storage and corrupting data.
And the CORAID does that too.
But at the same time, most multi-targettable SCSI/SAN solutions define various functions to ensure acknowledgement of buffer commits to disk, watchdog services to check if a node is no longer accessing the area (freeing up the lock so the failover system can mount read/write), etc... Others offer multiple read mounts to the same area from multiple systems, etc...
There is just a _dearth_ of features in CORAID's protocol versus SCSI-2, TCP, etc... IMHO. Those gaps drastically affect the ability to do "well designed clustering/fail-over" IMHO. If you press the CORAID people on them, they'll admit areas where they are deficient as a storage solution for a fail-over cluster.
As I said before, it's almost as bad as using Oracle's FireWire hack. It isn't anything like a typical SAN designed for fail-over as a target from multiple-hosts.
NAS offers safe concurrent access (generally; there might be some NAS devices out there that do not). A NAS device will manage the file system internally and export it over NFS or SMB protocols to the clients.
Such NASes are combined host+storage, aka a "filer." They have many advantages over a SAN -- especially in their fail-over and/or load-balancing capabilities.
It's going to be slower and less efficient than SAN device though (because of the upper protocol overhead),
Oh, it all depends on the design of the NAS. NetApp does a pretty damn fine job with their designs (long story).
and the set of features offered by file system might not be what would be available if file system was managed by client's operating system itself.
But there are many other benefits. That, however, is a larger discussion.
All I wanted people to know is that AoE doesn't have a lot of features you'll find in SCSI-2, TCP, etc... when it comes to using it as a fail-over storage solution. I would highly recommend you not use it as such.
On Tue, 2005-11-08 at 09:49, Aleksandar Milivojevic wrote:
NAS offers safe concurrent access (generally; there might be some NAS devices out there that do not). A NAS device will manage the file system internally and export it over NFS or SMB protocols to the clients. It's going to be slower and less efficient than a SAN device though (because of the upper protocol overhead), and the set of features offered by the file system might not be what would be available if the file system was managed by the client's operating system itself.
Or, in the case of a smart NAS and a dumb client you might have better features like frozen snapshots and remote mirroring - and without worrying about client software issues corrupting the filesystem.