Hello everyone,
This is not directly related to CentOS, but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the 8-15 TB range. Any recommendations as far as hardware?
Thanks.
Boris.
On Wed, Jan 6, 2010 at 4:38 PM, Joseph L. Casale jcasale@activenetwerx.com wrote:
recommendations as far as hardware?
Given we have no clue what it is used for, no. :) Seriously, it makes all the difference what this is backing: VMs exported over NFS/iSCSI, Samba, etc...
Joseph, this is a good point, thanks.
Primarily we are talking NFS/SSH/SFTP. Also possibly Samba, CIFS.
Like I said, a storage platform, nothing fancy.
Boris.
Joseph, this is a good point, thanks.
Primarily we are talking NFS/SSH/SFTP. Also possibly Samba, CIFS.
Like I said, a storage platform, nothing fancy.
While I have built many, I can tell you that buying a turnkey solution is *always* worth it if it's mission critical. When you piece something together, you are always in for potential surprises and other caveats.
Like the rest have stated, I can only say that you haven't mentioned how busy it will be. That might dictate, for example, whether you go with SATA or SAS drives.
Whatever you do, get a good controller with NVRAM. I have a lot of LSI and while they are rock solid, the GUI is a piece of sh!t and full of bugs. I always use the CLI unless that's not possible.
Frankly, I always hated Adaptec, but I have seen mention of them making some SAS controllers that performed better than LSI? Take that with a grain of salt, but the next controller I buy, I am shopping around...
jlc
On Wed, 6 Jan 2010, Joseph L. Casale wrote:
While I have built many, I can tell you that buying a turnkey solution is *always* worth it if its mission critical. When you piece something together, you are always in for potential surprises and other caveats.
Like the rest have stated, I can only say that you haven't mentioned how busy it will be. That might dictate which type of sata drives or sas for example.
Whatever you do, get a good controller with nvram. I have a lot of LSI and while they are rock solid, the gui is a piece of sh!t and full of bugs. I always use the cli unless not possible.
Frankly, I always hated Adaptec, but I have seen mention of them making some sas controllers that performed better that lsi? Take that w/ a grain of salt, but next controller I buy, I am shopping around...
We use Dell 2950s with an MD3000 or MD1000, depending on OS. For our Solaris NAS node we use the MD1000; for CentOS we use the MD3000 because of the hardware RAID controller. That gives us 15TB of raw disk space with 1TB drives, plus room for 4-5 additional drives (based on configuration) in the chassis.
On Wed, Jan 6, 2010 at 10:50 PM, James A. Peltier jpeltier@fas.sfu.ca wrote:
We use Dell 2950s with an MD3000 or MD1000 depending on OS. For our Solaris NAS node we use the MD1000 for CentOS we use the MD3000 because of the hardware RAID controller. Gives us 15TB of RAW disk space with 1TB drives plus room for 4-5 additional drives (based on configuration) in the chassis.
Seconded. Also you can chain a couple of MD1000s at the back of the MD3000 to get even more storage over SAS. We have a number of those with the upstream OS installed but usually a single MD3000 is enough for what we use them for (mainly Oracle DB server or VMWare hosts). We tend to split the storage between two nodes and then do OCFS2.
On Wed, 6 Jan 2010, Hakan Koseoglu wrote:
Seconded. Also you can chain a couple of MD1000s at the back of the MD3000 to get even more storage over SAS. We have a number of those with the upstream OS installed but usually a single MD3000 is enough for what we use them for (mainly Oracle DB server or VMWare hosts). We tend to split the storage between two nodes and then do OCFS2.
For the cost/performance they're not too bad a unit. We grow by about 45TB per year of medical imaging data. For each 15TB we buy a new head node; we're up to three now, so performance to our cluster just gets better as we go. These are all NFS/CIFS servers on a jumbo-frame Ethernet network. I originally had difficulty with the MD3000 talking multipath to the units, and the only way I could get it to work reliably in an active/active configuration was to use the provided mptsas driver, which was a cinch to install.
This was not the case with the Solaris hosts, as they didn't talk RAID, and ZFS with Solaris multipathing had built-in support for the devices.
You can have a look at this; I don't know what your budget is like:
http://www.drobo.com/Products/drobopro/index.php
I have a Drobo and it worked right off the bat with a few Linux distros.
On Wed, Jan 6, 2010 at 7:15 PM, James A. Peltier jpeltier@fas.sfu.ca wrote:
On Wed, 6 Jan 2010, Hakan Koseoglu wrote:
Seconded. Also you can chain a couple of MD1000s at the back of the MD3000 to get even more storage over SAS. We have a number of those with the upstream OS installed but usually a single MD3000 is enough for what we use them for (mainly Oracle DB server or VMWare hosts). We tend to split the storage between two nodes and then do OCFS2.
For the cost/performance they're not too bad a unit. We grow by about 45TB per year of Medical Imaging Data. For each 15TB we buy a new head node, we're up to three now, so performance to our cluster just gets better as we go. These are all NFS/CIFS servers on a Jumbo Frame ethernet network. I originally had difficulty with the MD3000 talking multipath to the units and the only way I could get it to work reliably in an active/active configuration was to use the provided mptsas driver which was a cinch to install.
This was not the case with the Solaris hosts as they didn't talk RAID and ZFS with Solaris Multi-Pathing had built in support for the devices.
-- James A. Peltier Systems Analyst (FASNet), VIVARIUM Technical Director HPC Coordinator Simon Fraser University - Burnaby Campus Phone : 778-782-6573 Fax : 778-782-3045 E-Mail : jpeltier@sfu.ca Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca http://blogs.sfu.ca/people/jpeltier MSN : subatomic_spam@hotmail.com
Treat your password like your toothbrush. Don't let anybody else use it, and get a new one every six months.
- Clifford Stoll
On 01/07/2010 03:28 AM, earl ramirez wrote:
You can have a look at this, I don't know what your budget is like
http://www.drobo.com/Products/drobopro/index.php
I have a drobo and it worked off the bat with a few linux distros
I've had 2 Drobos at work - and I can assure you that it is essentially a wasted device. If you want to go down that route, get a proper machine. E.g., the Drobo never managed to get over 60% of capacity performance, and has massive latency issues when you cross 50k files on the storage.
- KB
2010/1/7 Karanbir Singh mail-lists@karan.org:
I've had 2 drobo's at work - and i can assure you that it is essentially a wasted device.
I agree with this. We had a Drobo on loan for a while; I found it sluggish and detested the way it over-reports its free space.
Couldn't wait to hand it back.
Ben
Joseph L. Casale wrote:
recommendations as far as hardware?
Giving we have no clue what it is used for no:) Seriously, it makes all the difference what this is backing, vm's exported over nfs/iSCSI, samba, etc...
Very good point there.
If you're looking for something like an all-in-one kind of thing: I built a server here with a 3Ware drive cage in it. We use it for our backup server.
However, I've not kept up to date with what 3Ware offers now, so I'm not sure about scaling up to 15TB. Check them out though.
Regards, Max
On Wed, Jan 06, 2010 at 04:42:39PM -0500, Max Hetrick wrote:
If you're looking for something like an all-in-one kind of thing. I built a server here with a 3Ware drive cage in it. We use it for our backup server.
However, I've not kept up to date with what 3Ware offers now, so I'm not sure about space upwards to 15TB. Check them out though.
You could get a 16-port 3ware controller and hang a bunch of 2TB drives off of it. Even with fairly decent redundancy you should be able to get at least 16TB of usable storage out of it. It might not have the greatest performance though (which gets back to this box's intended purpose).
A colleague recently bought an 8-port box similar to the above for just about $5k US, with just under 12TB usable.
--keith
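As a rough sanity check on those capacity figures, here is a small Python sketch (real usable space also depends on RAID level, hot spares, and filesystem overhead, none of which are specified above):

  # RAID usable-capacity estimates for the 3ware configurations mentioned above.
  # Drive sizes are the marketing (decimal) TB figures; no spares assumed.

  def raid6_usable_tb(drives, drive_tb):
      # RAID-6 stores (n - 2) drives' worth of data
      return (drives - 2) * drive_tb

  def raid10_usable_tb(drives, drive_tb):
      # RAID-10 stores half the raw capacity
      return drives / 2 * drive_tb

  print(raid6_usable_tb(16, 2.0))   # 16 x 2TB in RAID-6  -> 28.0 TB
  print(raid10_usable_tb(16, 2.0))  # 16 x 2TB in RAID-10 -> 16.0 TB (the "at least 16TB")
  print(raid6_usable_tb(8, 2.0))    #  8 x 2TB in RAID-6  -> 12.0 TB (the ~12TB 8-port box)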
Boris Epstein wrote:
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Why not just get a SAN appliance and then attach it to your CentOS server with iSCSI? My company is getting ready to do the same. We have the hardware in place, just haven't had time to hook it all up and spin the thing up. We purchased an IBM SAN, and then we'll attach it to an older xSeries 235 server running CentOS.
Or are you looking for some cheaper solutions?
Regards, Max
On Wed, Jan 6, 2010 at 4:40 PM, Max Hetrick maxhetrick@verizon.net wrote:
Boris Epstein wrote:
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Why not just get a SAN appliance, and then attach it to your CentOS server with iSCSI. My company is getting ready to do the same. We have the hardware in place, just haven't had time to hook it all up and spin the thing up. We purchased an IBM SAN, and then we'll attach it to an older xSeries 235 server running CentOS.
Or are you looking for some cheaper solutions?
Regards, Max
Max,
Roughly how much space does the appliance provide? And how much did it cost?
Thanks.
Boris.
Boris Epstein wrote:
Roughly how much space does the appliance provide? And how much did it cost?
This one was only configured with 2TB. It and the drives were like 4 grand or something. Of course IBM stuff is expensive, and you can get all sorts of size configurations.
I'm currently starting to research a cheaper SAN to attach to my backup server because I'm running out of room, but haven't gotten too far in the research. For this purpose, I won't need the speeds of a high-performance one.
Regards, Max
Boris Epstein wrote:
Hello everyone,
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends what you want to do... I am satisfied with a VIA VB8001 with a Nano CPU at 1.6 GHz (though I still have problems with power scaling) and an Areca 1220... so I have 2 HDDs in software RAID 1 (motherboard) + 8 HDDs in RAID 6 from the Areca... the drives are kept in 2x 5-HDD Supermicro racks... this is a home Samba and HTTP server... Adrian
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
JD
On Thu, Jan 7, 2010 at 11:34 AM, John Doe jdmls@yahoo.com wrote:
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
JD
Supermicro also has good 1U or 2U servers with plenty of hard drive space, and they're very affordable.
John Doe wrote:
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
Somebody said something about Sun servers being pricey and that quality was going downhill...something about cheap controllers...any comments on this?
BTW, the Sun X4540 can only be bought with all disks loaded. So it is not up to 48 but must be 48 in 4U.
Quoting Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk:
John Doe wrote:
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
Somebody said something about Sun servers being pricey and that quality was going downhill...something about cheap controllers...any comments on this?
BTW, the Sun X4540 can only be bought with all disks loaded. So it is not up to 48 but must be 48 in 4U.
At least the old Sun Fire servers are using cheap ATA controllers...
My own solution is based on a Supermicro server with an Areca RAID controller.
Case: ATX Supermicro SC836TQ-R800V (16x SAS) with a 16-port Areca SATA RAID controller (Areca ARC-1261ML).
You can easily buy 16 cheap SATA disks of 1 to 2TB each.
-- Eero, RHCE
On Thu, Jan 7, 2010 at 8:08 AM, Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk wrote:
John Doe wrote:
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
Somebody said something about Sun servers being pricey and that quality was going downhill...something about cheap controllers...any comments on this?
We have a bunch of X4540 Netbackup media servers running Solaris 10 + ZFS. While I can't comment on all of the controllers Sun uses, the SATA chipset / controllers in the 4540 seem to be pretty solid so far (our backup servers process 20TB+ of data each day).
- Ryan -- http://prefetch.net
On Thu, Jan 7, 2010 at 11:09 AM, Matty matty91@gmail.com wrote:
On Thu, Jan 7, 2010 at 8:08 AM, Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk wrote:
John Doe wrote:
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set
up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
Somebody said something about Sun servers being pricey and that quality was going downhill...something about cheap controllers...any comments on this?
We have a bunch of X4540 Netbackup media servers running Solaris 10 + ZFS. While I can't comment on all of the controllers Sun uses, the SATA chipset / controllers in the 4540 seem to be pretty solid so far (our backup servers process 20TB+ of data each day).
- Ryan
-- http://prefetch.net
Thanks Ryan!
What price range (roughly) are we talking here?
Boris.
Yes, the Sun Fire Xs are costly... Here, 35k euros for 48 x 1TB for example, or 22k for 48 x 500GB... Our 12TB HP is around 6k, so 12k for almost the same as the 22k. But if you use 1TB disks on the Sun, you end up using half the Us (and save some power) in your bay, which might be nice if you are tight on physical space... It all depends on your needs. Note that I think the Sun Fire X uses software RAID (from ZFS)...
JD
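Purely for comparison, here is the euro-per-raw-TB arithmetic behind those figures (a sketch; the prices are the rough numbers quoted above, and this is raw capacity only, before RAID):

  # EUR per raw TB for the configurations JD mentions above.
  configs = {
      "Sun Fire, 48 x 1TB":   (35000, 48 * 1.0),
      "Sun Fire, 48 x 500GB": (22000, 48 * 0.5),
      "HP DL180, 12 x 1TB":   (6000, 12.0),
  }
  for name, (eur, raw_tb) in configs.items():
      print(f"{name}: {eur / raw_tb:.0f} EUR per raw TB")
  # Two of the HP boxes (~12k EUR, 24TB raw) roughly match the 48 x 500GB Sun
  # in capacity at about half the price, which is JD's point.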
John Doe wrote:
Yes, the Sun Fire Xs are costly... Here, 35k euros for 48 x 1TB by example, or 22k for 48 x 500GB... Our 12TB HP is around 6k. So 12k for almost the same as the 22k But if you use 1TB disks on the Sun, you end up using half the Us (and save some power) in your bay; which might be nice if you are tight on physical space... It all depends on your needs. Note that I think the Sun Fire X uses software RAID (from zfs)...
Yes, the Sun Fire X4540 uses software RAID, but not necessarily ZFS... if you install another operating system that is not Solaris or OpenSolaris, it won't be ZFS.
Christopher Chan wrote:
Yes, the Sun Fire X4540 uses software raid but not necessarily zfs...if you install another operating system that is not Solaris or OpenSolaris, it won't be zfs.
The thing to note on the Thumper (X4540): each of those 48 SATA drives has its own channel to the system bus. I believe it uses six 8-port SATA controllers, each attached to the Opteron's HyperTransport via PCI-Express x4. This means you can hit some really high aggregate IO speeds...
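A quick back-of-the-envelope on that layout (a sketch in Python; the per-lane and per-disk rates are assumptions, not figures from this thread):

  # Aggregate bandwidth estimate for the X4540 layout described above:
  # six 8-port SATA controllers, each on a PCI-Express x4 link.
  controllers = 6
  lanes_per_controller = 4
  pcie1_lane_mb_s = 250        # assumed ~250 MB/s per PCIe 1.x lane
  drives = 48
  per_drive_mb_s = 90          # assumed sustained streaming rate per 7200rpm SATA disk

  bus_ceiling = controllers * lanes_per_controller * pcie1_lane_mb_s
  disk_ceiling = drives * per_drive_mb_s
  print(f"bus ceiling ~{bus_ceiling} MB/s, disk ceiling ~{disk_ceiling} MB/s")
  # Either way the box can move several GB/s internally; the network is
  # usually the real bottleneck.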
On 01/08/2010 12:53 AM, John R Pierce wrote:
Christopher Chan wrote:
Yes, the Sun Fire X4540 uses software raid but not necessarily zfs...if you install another operating system that is not Solaris or OpenSolaris, it won't be zfs.
the thing to note on the Thumper (X4540), each of those 48 SATA drives has its own channel to the system bus. I believe it uses 6 8-port SATA controllers, each attached to the Opteron's Hypertransport via PCI-Express x4. this means you can hit some really high aggregate IO speeds...
Have you actually tried it?
Because when I did, the X45xx's/ZFS were between 18 and 20% slower on disk I/O alone compared with a Supermicro box with dual Areca 1220/XFS.
The Thumpers make for decent backup or VTL-type roles, not so much for online high-density storage.
- KB
Karanbir Singh wrote:
On 01/08/2010 12:53 AM, John R Pierce wrote:
Christopher Chan wrote:
Yes, the Sun Fire X4540 uses software raid but not necessarily zfs...if you install another operating system that is not Solaris or OpenSolaris, it won't be zfs.
the thing to note on the Thumper (X4540), each of those 48 SATA drives has its own channel to the system bus. I believe it uses 6 8-port SATA controllers, each attached to the Opteron's Hypertransport via PCI-Express x4. this means you can hit some really high aggregate IO speeds...
have you actually tried it ?
cause when I did - the x45xx's/zfs were between 18 to 20% slower on disk i/o alone compared with a supermicro box with dual areca 1220/xfs.
I'd imagine that an X4540 set up as md nested RAID 1+0 should give the dual Areca a run for its money (okay... not quite in real money terms, but you get the idea), and I wonder what ZFS would be like on a dual Areca 1220 box.
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
I wonder how much that would change with a BBU NVRAM card for an external journal for ext4 and the disks on md. Unless one cannot add a BBU NVRAM card...
On 01/08/2010 01:58 AM, Christopher Chan wrote:
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
I wonder how much that would change with a bbu NVRAM card for an external journal for ext4 and the disks on md. Unless one cannot add a bbu NVRAM card...
Good question; they are, after all (the Sun 45xx's), just Opteron boxes with a mostly standard build. Finding a CentOS-compatible one (drivers pre-included, and not crap like cciss) would not be too hard.
Who wants to offer up a machine to test on :)
Karanbir Singh wrote:
On 01/08/2010 01:58 AM, Christopher Chan wrote:
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
I wonder how much that would change with a bbu NVRAM card for an external journal for ext4 and the disks on md. Unless one cannot add a bbu NVRAM card...
Good question, they are after all ( the Sun 45xx's ) just opteron box's with a mostly standard build. Finding a CentOS compatible ( drivers pre-included, and not crap like cciss ) would not be too hard.
With ZFS, the whole machine is the RAID controller (basically). NVRAM in ZFS would be used for L2ARC. Of course, this asks for a sane system design (which the Thumpers do have, as mentioned - AFAIK, there are virtually no off-the-shelf motherboards that can offer the Thumper's distribution of SATA channels over HT links).
CentOS wouldn't run badly on such a motherboard, either (and RHEL is supported). ;-)
cheers, Rainer
On Fri, Jan 08, 2010 at 12:33:39PM +0100, Rainer Duffner wrote:
Karanbir Singh wrote:
On 01/08/2010 01:58 AM, Christopher Chan wrote:
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
I wonder how much that would change with a bbu NVRAM card for an external journal for ext4 and the disks on md. Unless one cannot add a bbu NVRAM card...
Good question, they are after all ( the Sun 45xx's ) just opteron box's with a mostly standard build. Finding a CentOS compatible ( drivers pre-included, and not crap like cciss ) would not be too hard.
With ZFS, the whole machine is the RAID-controller (basically). NVRAM in ZFS would be used for L2ARC. Of course, this ask for a sane system-desgin (which the thumpers do have, as mentioned - AFAIK, there are virtually no off-the-shelf motherboads that can offer the thumper's distribution of SATA-channels over HT-links.
CentOS wouldn't run bad on such a motherboard, either (and RHEL is supported). ;-)
Last time I checked, only RHEL4 was supported... RHEL5 lacks a (properly working) SATA driver for the controller used in the Thumper.
Is RHEL5 supported/working nowadays?
-- Pasi
On Mon, Jan 11, 2010 at 03:00:41PM +0200, Pasi Kärkkäinen wrote:
On Fri, Jan 08, 2010 at 12:33:39PM +0100, Rainer Duffner wrote:
Karanbir Singh wrote:
On 01/08/2010 01:58 AM, Christopher Chan wrote:
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
I wonder how much that would change with a bbu NVRAM card for an external journal for ext4 and the disks on md. Unless one cannot add a bbu NVRAM card...
Good question, they are after all ( the Sun 45xx's ) just opteron box's with a mostly standard build. Finding a CentOS compatible ( drivers pre-included, and not crap like cciss ) would not be too hard.
With ZFS, the whole machine is the RAID-controller (basically). NVRAM in ZFS would be used for L2ARC. Of course, this ask for a sane system-desgin (which the thumpers do have, as mentioned - AFAIK, there are virtually no off-the-shelf motherboads that can offer the thumper's distribution of SATA-channels over HT-links.
CentOS wouldn't run bad on such a motherboard, either (and RHEL is supported). ;-)
Last time I checked only RHEL4 was support.. RHEL5 lacks (properly working) SATA driver for the controller used in the thumper.
Is RHEL5 supported/working nowadays?
It seems the X4500 (not available anymore) had Marvell SATA controllers, which are not supported with RHEL5.
The X4540 uses LSI SATA controllers, which are supported.
-- Pasi
On 11.01.2010 15:26, Pasi Kärkkäinen wrote:
It seems X4500 (not available anymore) had Marvell SATA controllers, that are not supported with RHEL5.
X4540 uses LSI SATA controllers, that are supported.
Indeed:
http://www.sun.com/servers/x64/x4540/os.jsp
5.3+ is needed.
Of course, for a true Solaris admin, this would be a big waste. ;-) But if you have an application that runs on Linux (but not Solaris) or runs much more stably on Linux, this is a viable option.
Regards, Rainer
On 01/11/2010 09:42 AM, Rainer Duffner wrote:
On 11.01.2010 15:26, Pasi Kärkkäinen wrote:
X4540 uses LSI SATA controllers, that are supported.
Indeed:
http://www.sun.com/servers/x64/x4540/os.jsp
5.3+ is needed.
Of course, for a true Solaris-admin, this would be a big waste. ;-) But if you have an application that runs on Linux (but not Solaris) or runs much more stable on Linux, this is a viable option.
CentOS 5.4 x86_64 works fine on the X4540s; I've installed it myself and didn't have to do anything special to see and use all of the disks.
In my testing, the IO was faster and the storage easier to administer when using Solaris and ZFS rather than CentOS and software RAID. That kind of box is just made for ZFS.
Tom
On Mon, 2010-01-11 at 16:16 -0500, Tom Georgoulias wrote:
CentOS 5.4 x86_64 works fine on the x4540s, I've installed it myself and didn't have to do anything special to see and use all of the disks.
In my testing, the IO was faster and the storage easier to administer with when using Solaris and ZFS rather than with CentOS and software raid. That kind of box is just made for ZFS.
Tom
Interesting - was the CentOS box tuned in any way?
John
On 01/12/2010 12:20 PM, JohnS wrote:
On Mon, 2010-01-11 at 16:16 -0500, Tom Georgoulias wrote:
CentOS 5.4 x86_64 works fine on the x4540s, I've installed it myself and didn't have to do anything special to see and use all of the disks.
In my testing, the IO was faster and the storage easier to administer with when using Solaris and ZFS rather than with CentOS and software raid. That kind of box is just made for ZFS.
Interesting, was the CentOS Box Tuned in any way?
Not really, everything was just set up using the defaults. I had an X4540 with 48 1TB drives and tried to recreate the raidz2 setup that I had tested with Solaris/ZFS. I created six 6-disk RAID6 md devices, then added them into a single volume group and created a huge (5TB) logical volume with an XFS filesystem. I tried to use one disk from each controller in each md device (as indicated by the hdtool that Sun provides). I ran a variety of tests using tools like dd, iozone, tiobench, and sysbench and just watched how the server behaved. From what I could tell, the IO wasn't evenly spread across the disks in the md devices, just a subset.
I probably could've tweaked it some more, but I didn't have too much time to spend on it at the time. The Sun Storage 7000 series servers were a better fit for that project anyway.
Tom
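For anyone trying to picture that layout, here is a small sketch of the disk-to-md mapping Tom describes (device names are purely illustrative, not the real controller/slot names):

  # Six 6-disk RAID6 md devices, each taking one disk from each of the six
  # controllers in the X4540, so no md set depends on a single controller.
  controllers = 6
  disks_per_controller = 8     # 6 x 8 = 48 drives total

  # hypothetical naming: disk[c][d] = d-th disk on controller c
  disk = [[f"c{c}d{d}" for d in range(disks_per_controller)]
          for c in range(controllers)]

  for i in range(6):           # six md devices, using 36 of the 48 disks
      members = [disk[c][i] for c in range(controllers)]
      print(f"md{i}: RAID6 over {members} -> 4 disks' worth usable")
  # The six md devices then go into one LVM volume group, with a large XFS
  # logical volume carved out of it.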
Pasi Kärkkäinen wrote:
It seems X4500 (not available anymore) had Marvell SATA controllers, that are not supported with RHEL5.
And those Marvell controllers caused major grief for Sun, especially when Solaris added support for NCQ somewhere in there. Under heavy IO workloads, the controllers would just hang. Some nasty bugs. Driver software workarounds caused a big performance hit.
On 1/11/2010 11:38 AM, John R Pierce wrote:
Pasi Kärkkäinen wrote:
It seems X4500 (not available anymore) had Marvell SATA controllers, that are not supported with RHEL5.
And those marvell controllers caused major grief for Sun, especially when Solaris added support for NCQ somewhere in there. under heavy IO workloads, the controllers would just hang. Some nasty bugs. Driver software workarounds caused a big performance hit.
Is that a different chipset than http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009 uses? I replaced a Paradise and an Adaptec card with one of these (or maybe the PCI-E version) and CentOS recognized it and it worked better than with the two different cards.
On 1/11/2010 1:33 PM, Les Mikesell wrote:
On 1/11/2010 11:38 AM, John R Pierce wrote:
Pasi Kärkkäinen wrote:
It seems X4500 (not available anymore) had Marvell SATA controllers, that are not supported with RHEL5.
And those marvell controllers caused major grief for Sun, especially when Solaris added support for NCQ somewhere in there. under heavy IO workloads, the controllers would just hang. Some nasty bugs. Driver software workarounds caused a big performance hit.
Is that a different chipset than http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009 uses? I replaced a Paradise and Adaptec card with one of these (or maybe the PCI-E version) and Centos recognized it and worked better than with the two different cards.
http://linuxmafia.com/faq/Hardware/sata.html#marvell
It's fakeraid. I don't know if it's a different one than what was in there previously though... :)
Karanbir Singh wrote:
snip
Good question, they are after all ( the Sun 45xx's ) just opteron box's with a mostly standard build. Finding a CentOS compatible ( drivers pre-included, and not crap like cciss ) would not be too hard.
Who wants to offer up a machine to test on :)
-- Karanbir Singh
Karanbir,
what is wrong or what problems are you referring to with cciss please ?
tia
- rh
On 01/08/2010 05:28 PM, R-Elists wrote:
what is wrong or what problems are you referring to with cciss please ?
Problems mostly centered around management and performance issues. The world is littered with stories of cciss fail.
On Monday 11 January 2010, Karanbir Singh wrote:
On 01/08/2010 05:28 PM, R-Elists wrote:
what is wrong or what problems are you referring to with cciss please ?
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
I would certainly not go as far as saying that I like cciss, but they are IMHO not much worse than other products. We currently have ~500T on p800 and it behaves quite well.
As for the specifics:
- Management: hpacucli is certainly odd, but then again neither tw_cli (3ware) nor cli64 (areca) shines.
- Performance: certainly not a strong point, but a p800 can sustain quite a bit more than the 1G ethernet link I need.
/Peter
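To put a number on that last point (a sketch; the array figure is an assumption, only the GbE ceiling is arithmetic):

  # A single gigabit link tops out well below what a 12-disk RAID set can stream.
  gige_mb_s = 1000 / 8 * 0.94      # ~94% wire efficiency assumed after framing/protocol overhead
  assumed_array_mb_s = 400         # assumed sequential rate for a 12 x 1TB RAID6 set
  print(f"GigE ceiling ~{gige_mb_s:.0f} MB/s vs array ~{assumed_array_mb_s} MB/s")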
Karanbir Singh wrote:
On 01/08/2010 05:28 PM, R-Elists wrote:
what is wrong or what problems are you referring to with cciss please ?
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and am running CentOS 5.4 on it...
On 12/01/10 00:02, Christopher Chan wrote:
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and running Centos 5.4 on it...
I've got a couple of DL380s at one setup and another 12 DL360s at another place. We have had enough problems with interfaces that all the machines are now running off remote storage. Our storage incident rate has gone from an average of 1/day to under 2/month since then.
all of these machines are G4 and G5's running CentOS-5/x86_64
Karanbir Singh wrote:
On 12/01/10 00:02, Christopher Chan wrote:
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and running Centos 5.4 on it...
I've got a couple of DL380's at one setup and another 12 DL360's at another place. We have had enough problems with interfaces that all the machines are now running off remote-storage. Our storage incident rate has gone from 1/day average to under 2/month since then.
all of these machines are G4 and G5's running CentOS-5/x86_64
Just curious, which storage controllers are in those DL380/360 servers? Each of those numbers describes like 6 generations of x86 servers.
if I want a lots-of-2.5" SAS dual e5500 kinda server, where should I go if HP's storage is so broken?
On Tuesday 12 January 2010, John R Pierce wrote:
Karanbir Singh wrote:
On 12/01/10 00:02, Christopher Chan wrote:
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and running Centos 5.4 on it...
I've got a couple of DL380's at one setup and another 12 DL360's at another place. We have had enough problems with interfaces that all the machines are now running off remote-storage. Our storage incident rate has gone from 1/day average to under 2/month since then.
all of these machines are G4 and G5's running CentOS-5/x86_64
just curious, which storage controllers in those DL380/360 servers? each of those numbes describes like 6 generations of x86 servers.
if I want a lots-of-2.5" SAS dual e5500 kinda server, where should I go if HP's storage is so broken?
As I replied to another post, I think it's unfair to say that HP storage is broken. We have roughly:
- 30 p400 (mostly 2x raid1)
- 50 p800 (mostly 12x1T raid6)
- <10 other cciss
and the only problems we really suffer are:
* performance is not great
* /dev/cciss is not a scsi dev (which is a minor annoyance in linux at times)
* 1T Seagate drives fail at many times the rate of Hitachi
/Peter
From: Karanbir Singh mail-lists@karan.org
On 12/01/10 00:02, Christopher Chan wrote:
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and running Centos 5.4 on it...
I've got a couple of DL380's at one setup and another 12 DL360's at another place. We have had enough problems with interfaces that all the machines are now running off remote-storage. Our storage incident rate has gone from 1/day average to under 2/month since then.
all of these machines are G4 and G5's running CentOS-5/x86_64
On the other hand, here, we have around 30 HP servers. Some DL360/380/180 G5/G6 with CentOS 4/5 and, in 2 years, only 3 drives failed... That's it; no other problems...
JD
On 01/12/2010 10:43 AM, John Doe wrote:
On the other hand, here, we have around 30 HP servers. Some DL360/380/180 G5/G6 with CentOS 4/5 and, in 2 years, only 3 drives failed... That's it; no other problems...
Drives are hardly the issue - most of them are going to be Seagate anyway.
My main issue with that kit is that the Linux drivers are very basic, lack most management capabilities, and fail often with obscure issues. And, as Peter pointed out already, they are not really exposing a proper SCSI interface, but are modeled around a really old ATA stack.
- KB
On Thu, Jan 14, 2010 at 08:07:43PM +0000, Karanbir Singh wrote:
On 01/12/2010 10:43 AM, John Doe wrote:
On the other hand, here, we have around 30 HP servers. Some DL360/380/180 G5/G6 with CentOS 4/5 and, in 2 years, only 3 drives failed... That's it; no other problems...
Drives is hardly the issue - most of them are going to be seagate anyway.
My main issue with that kit is that the linux drivers are very basic, lack most management capabilities and fail often with obscure issues. And, as Peter pointed out already, they are not really exposing a proper scsi interface, but modeled around a really old ata stack.
HP has a new 'hpsa' driver that is based on the SCSI stack.
-- Pasi
Karanbir Singh wrote:
My main issue with that kit is that the linux drivers are very basic, lack most management capabilities and fail often with obscure issues. And, as Peter pointed out already, they are not really exposing a proper scsi interface, but modeled around a really old ata stack.
One thing HP has attempted to do with all their SmartArray cards is maintain RAID volume set compatibility, so you can move a RAID set from a failed server to another RAID controller that has the same interface, even if it's a different controller. Most of the older cards at least were based on various MegaRAID hardware, but with custom HP firmware. On many of their systems, they actually put the battery-backed write cache with the batteries on a little paddle card, and if a server fails, you can move the drives with that BBWC to a new server, and poof, it's all good: the write cache is flushed to the drives and everyone is happy.
They definitely have some annoying habits. I've got one server that has 4 bays on a split SCSI bus (bays 0/1 are one SCSI bus, bays 2/3 are another). I created a RAID1 with drives 0,1 and thought, gee, I should use bays 0,2 instead. Well, I never figured out how to do it. I tried adding 2 as a hot spare, then manually failing the drive in bay 1 (e.g., yanking it out); the RAID rebuilt with the spare, but its status was 'failed w/ spare', and when I replaced the disk in bay 1, it restriped back to 0,1.
On Thursday 14 January 2010, John R Pierce wrote:
Karanbir Singh wrote:
My main issue with that kit is that the linux drivers are very basic, lack most management capabilities and fail often with obscure issues.
We certainly don't see a high frequency of obscure cciss issues. But since no place and/or workload is the same, I guess this difference isn't too unexpected.
And, as Peter pointed out already, they are not really exposing a proper scsi interface, but modeled around a really old ata stack.
One thing HP has attempted to do with all their smartarray cards is maintain RAID volume set compatability, so you can move a raid set from a failed server to another raid controller that has the same interface even if its a different controller..
This is far from always true; for example, RAID6 sets are incompatible between quite similar controllers, according to HP documentation.
/Peter
Karanbir Singh wrote:
On 12/01/10 00:02, Christopher Chan wrote:
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and running Centos 5.4 on it...
I've got a couple of DL380's at one setup and another 12 DL360's at another place. We have had enough problems with interfaces that all the machines are now running off remote-storage. Our storage incident rate has gone from 1/day average to under 2/month since then.
all of these machines are G4 and G5's running CentOS-5/x86_64
Eeek! That thing will be hosting the school's VLE. Looks like I'd better memorize the after-hours password for HP support.
What problems did you have? Do they occur mostly when the boxes are under high I/O load?
This is really new to me, as I had no disk problems with a DL360 G3 box that ran Windows 2000 and Exchange 2000 in my previous job.
2010/1/12 Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk:
Eeek! That thing will be hosting the school's vle. Looks like I better memorize the after hours password for HP support.
I have had lots[1] of problems lately with DIMMs becoming defective in six month old G5 HPs. Could just be bad luck or maybe just put together by someone wearing a shell suit.
Ben
[1] Okay, three or four defective modules all in the space of a month.
Benjamin Donnachie wrote:
2010/1/12 Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk:
Eeek! That thing will be hosting the school's vle. Looks like I better memorize the after hours password for HP support.
I have had lots[1] of problems lately with DIMMs becoming defective in six month old G5 HPs. Could just be bad luck or maybe just put together by someone wearing a shell suit.
Boy, a Tyan or Supermicro solution is looking better by the minute for the new server I plan to get the school for its library server and other uses. If only Supermicro had a local distributor...I have not had a good look at their solutions yet because of that but their 45 disk case has got my attention.
2010/1/12 Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk:
Boy, a Tyan or Supermicro solution is looking better by the minute for the new server I plan to get the school for its library server and other uses. If only Supermicro had a local distributor...I have not had a good look at their solutions yet because of that but their 45 disk case has got my attention.
To HP's credit, I had replacements within hours but we are on their 24x7x4 support contract.
Ben
On Tue, Jan 12, 2010 at 09:41:19AM +0000, Karanbir Singh wrote:
On 12/01/10 00:02, Christopher Chan wrote:
problems mostly centered around management and performance issues. the world is littered with stores of cciss fail
Really? Man, I have been given this spanking new HP DL370 G6 and running Centos 5.4 on it...
I've got a couple of DL380's at one setup and another 12 DL360's at another place. We have had enough problems with interfaces that all the machines are now running off remote-storage. Our storage incident rate has gone from 1/day average to under 2/month since then.
all of these machines are G4 and G5's running CentOS-5/x86_64
And I've been running DL380 and DL360 G3/G4 servers for years without problems.. with CentOS and Xen.. using cciss local storage. :)
-- Pasi
Pasi Kärkkäinen wrote:
And I've been running DL380 and DL360 G3/G4 servers for years without problems.. with CentOS and Xen.. using cciss local storage. :)
I've used HP/cciss on a couple hundred systems over the past 7 years and can only recall 2 issues, both around a failing drive that the controller didn't force offline; there was no way to force it offline using the command line tool, so I had to go on site and physically yank the drive. With the failing drive still in the array, performance tanked.
I've never used SATA on cciss; that could be a cause of issues.
I've used several hundred 3ware controllers too, with only a few issues.
My own approach is to use proper storage arrays for storage, and only use server-based controllers for the OS and basic stuff. My VMware systems are all boot-from-SAN. I will be trying boot-from-SAN on CentOS 5.4 again soon; it didn't work properly on 5.2.
I feel much more comfortable with separation of storage from the server for my important data.
nate
On 01/12/2010 03:51 PM, nate wrote:
I've used HP/cciss on a couple hundred systems over the past 7 years, can only recall 2 issues, both around a drive failing the controller didn't force the drive off line, and there was no way to force it off line using the command line tool, so had to go on site and physically yank the drive. With the failing drive still in the array performance tanked.
I've had 2 issues today on a single machine - the controller going read-only on the assumption that a disk had failed; looking at the CLI stats, it actually showed all 6 disks in the DL380 G4 to be fine.
I accept these are not new machines and have a few years under their belt - but we've got similarly aged kit from other places which doesn't have nearly as many issues.
I've never used SATA on cciss, that could be a cause of issues.
We're all SCSI here.
Maybe it's just bad luck here :)
- KB
On Thu, Jan 14, 2010 at 08:14:52PM +0000, Karanbir Singh wrote:
On 01/12/2010 03:51 PM, nate wrote:
I've used HP/cciss on a couple hundred systems over the past 7 years, can only recall 2 issues, both around a drive failing the controller didn't force the drive off line, and there was no way to force it off line using the command line tool, so had to go on site and physically yank the drive. With the failing drive still in the array performance tanked.
I've had 2 issues today on a single machine - controller going readonly with the assumption that a disk had failed, looking at the cli stats and it actually showed all 6 disks in the dl380 G4 to be fine.
I accept these are not new machines, and have a few years under their belt - but we've got similar aged kit from other places which doesnt have nearly as many issues.
I've never used SATA on cciss, that could be a cause of issues.
Were all scsi here.
Maybe its just bad luck here :)
I remember a story about two similar HP ProLiants... same model number, ordered the same day, same hardware configuration, etc.
One of them had problems with a lot of things, while the other one was working perfectly well...
Looking at the serial numbers in more detail revealed the non-working one was assembled in Malaysia, while the working one was assembled in Ireland.. (iirc). :)
-- Pasi
On Thursday 14 January 2010, Pasi Kärkkäinen wrote:
On Thu, Jan 14, 2010 at 08:14:52PM +0000, Karanbir Singh wrote:
...
Maybe its just bad luck here :)
I remember a story about two similar HP proliants.. same model number, ordered the same day, same hardware configuration etc..
The other one had problems with a lot of things, while the other one was working perfectly well..
Looking at the serial numbers in more detail revealed the non-working one was assembled in Malaysia, while the working one was assembled in Ireland.. (iirc). :)
IMO the most likely reason for one server working and not another one would be HP shipping (or bounce-your-servers-around-the-globe as I like to call it)...
/Peter
2010/1/15 Peter Kjellstrom cap@nsc.liu.se:
IMO the most likely reason for one server working and not another one would be HP shipping (or bounce-your-servers-around-the-globe as I like to call it)...
Sadly that problem does not seem unique to HP.
Ben
On Fri, Jan 15, 2010 at 11:02:51AM +0100, Peter Kjellstrom wrote:
On Thursday 14 January 2010, Pasi Kärkkäinen wrote:
On Thu, Jan 14, 2010 at 08:14:52PM +0000, Karanbir Singh wrote:
...
Maybe its just bad luck here :)
I remember a story about two similar HP proliants.. same model number, ordered the same day, same hardware configuration etc..
The other one had problems with a lot of things, while the other one was working perfectly well..
Looking at the serial numbers in more detail revealed the non-working one was assembled in Malaysia, while the working one was assembled in Ireland.. (iirc). :)
IMO the most likely reason for one server working and not another one would be HP shipping (or bounce-your-servers-around-the-globe as I like to call it)...
Uhm, I should have been clearer about that: I didn't mean shipping/transportation problems; there was some feature that didn't work in the other one because the motherboard was different, IIRC.
I can't remember the details anymore.
-- Pasi
cause when I did - the x45xx's/zfs were between 18 to 20% slower on disk i/o alone compared with a supermicro box with dual areca 1220/xfs.
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
Speaking of thumpers and Supermicro, it looks like Supermicro has a very close answer to the thumper.
http://www.supermicro.com/storage/
The SC847 has an existing configuration that goes up to 36 disks and another that will go up to 45 in 4U of space, although the 45-disk case does not seem to have a page just yet.
Christopher Chan schrieb:
cause when I did - the x45xx's/zfs were between 18 to 20% slower on disk i/o alone compared with a supermicro box with dual areca 1220/xfs.
the thumpers make for decent backup or vtl type roles, not so much for online high density storage.
Speaking of thumpers and Supermicro, it looks like Supermicro has a very close answer to the thumper.
http://www.supermicro.com/storage/
The SC847 has an existing configuration that goes up to 36disks and another that will go up to 45 in 4U of space although the 45 disk case does not seem to have a page just yet.
" Maximum 3.5" hot-swap drives density 36x (24 front + 12 rear) HDD bays"
http://www.supermicro.com/products/chassis/4U/847/SC847A-R1400.cfm
Did anybody else think "WTF?" when you saw that picture?
I have seen crazy stuff, but that one is pretty high-up on the list....
Doesn't that make cooling problematic?
Rainer
Quoting Rainer Duffner rainer@ultra-secure.de:
" Maximum 3.5" hot-swap drives density 36x (24 front + 12 rear) HDD bays"
http://www.supermicro.com/products/chassis/4U/847/SC847A-R1400.cfm
Did anybody else think "WTF?" when you saw that picture?
I have seen crazy stuff, but that one is pretty high-up on the list....
Doesn't that make cooling problematic?
Well, no - not with a lot of cooling fans?
-- Eero
" Maximum 3.5" hot-swap drives density 36x (24 front + 12 rear) HDD bays"
http://www.supermicro.com/products/chassis/4U/847/SC847A-R1400.cfm
Did anybody else think "WTF?" when you saw that picture?
I have seen crazy stuff, but that one is pretty high-up on the list....
Doesn't that make cooling problematic?
And what do you think of the arrangement of the 48 disks in a thumper?
Anyway, you can have cooling problems even with 2U cases and just six disks if you have faulty fans so I do not really see a problem with the Supermicro case. You just need working fans.
A bit OT, but did you ever see http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-chea...
On Thu, Jan 7, 2010 at 11:25 AM, Boris Epstein borepstein@gmail.com wrote:
On Thu, Jan 7, 2010 at 11:09 AM, Matty matty91@gmail.com wrote:
On Thu, Jan 7, 2010 at 8:08 AM, Chan Chung Hang Christopher christopher.chan@bradbury.edu.hk wrote:
John Doe wrote:
From: Boris Epstein borepstein@gmail.com
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Depends on your budget. Here, we use HP DL180 servers (12 x 1TB disks in 2U)... You can also check Sun Fire X**** servers; up to 48 x 1TB in 4U...
Somebody said something about Sun servers being pricey and that quality was going downhill...something about cheap controllers...any comments on this?
We have a bunch of X4540 Netbackup media servers running Solaris 10 + ZFS. While I can't comment on all of the controllers Sun uses, the SATA chipset / controllers in the 4540 seem to be pretty solid so far (our backup servers process 20TB+ of data each day).
- Ryan
-- http://prefetch.net
Thanks Ryan!
What price range (roughly) are we talking here?
I'm not sure what we paid for our servers (I recommend hardware purchases, and another group inside the company negotiates pricing and places the order), but they are really well engineered machines. There are probably cheaper solutions available, but you get what you pay for.
- Ryan -- http://prefetch.net
On 01/06/2010 09:35 PM, Boris Epstein wrote:
Hello everyone,
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
I would recommend: don't homebrew. Get a vendor to build you a 3/4U box, get a couple of quad-core CPUs in, and enough RAM to cover your in-use buffers. Also, don't go over 1 TiB in storage per spindle if you want to get even relatively reasonable performance (even when in use as a filer box).
Not long back, I had the chance to do some performance metrics on a dual Areca-16xx hosted 24x1TiB disk setup - we tested it for various loads, running CentOS-5.4/x86_64, and it consistently outperformed the Sun Thumper boxes, coming in at about 1/4 the price.
The other thing to keep in mind is to estimate and prove the CPU processing capability and network capability you are going to need out of this machine, and don't skimp on that. Don't get overly focused on just the HBA and disk metrics.
- KB
On Thu, Jan 7, 2010 at 9:27 AM, Karanbir Singh mail-lists@karan.org wrote:
On 01/06/2010 09:35 PM, Boris Epstein wrote:
Hello everyone,
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
I would recommend dont-homebrew. Get a vendor to build you a 3/4U box, get a couple of quad core cpu's in, and enough ram to do you in-use buffers. Also dont go over 1 TiB in storage per spindle if you want to get even relatively reasonable performance ( even when in use as a filer box ).
Not long back, I had the chance to do some performance metrics on a dual Areca-16xx hosted 24x1TiB disk setup - and we tested it for various loads, running CentOS-5.4/x86_64 and it consistently outperformed the sun thumper box's, coming in about 1/4th the price.
The other thing to keep in mind is to estimate and prove the cpu processing capability and network capability you are going to need out of this machine and dont skim on that. Dont just get overly focused on just the hba and disk metrics.
- KB
KB, thanks. When you say "dont go over 1 TiB in storage per spindle" what are you referring to as spindle?
Boris.
On 1/7/2010 9:30 AM, Boris Epstein wrote:
On Thu, Jan 7, 2010 at 9:27 AM, Karanbir Singhmail-lists@karan.org wrote:
On 01/06/2010 09:35 PM, Boris Epstein wrote:
Hello everyone,
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
I would recommend dont-homebrew. Get a vendor to build you a 3/4U box, get a couple of quad core cpu's in, and enough ram to do you in-use buffers. Also dont go over 1 TiB in storage per spindle if you want to get even relatively reasonable performance ( even when in use as a filer box ).
Not long back, I had the chance to do some performance metrics on a dual Areca-16xx hosted 24x1TiB disk setup - and we tested it for various loads, running CentOS-5.4/x86_64 and it consistently outperformed the sun thumper box's, coming in about 1/4th the price.
The other thing to keep in mind is to estimate and prove the cpu processing capability and network capability you are going to need out of this machine and dont skim on that. Dont just get overly focused on just the hba and disk metrics.
- KB
KB, thanks. When you say "dont go over 1 TiB in storage per spindle" what are you referring to as spindle?
Boris.
per HDD
On 01/07/2010 02:30 PM, Boris Epstein wrote:
KB, thanks. When you say "dont go over 1 TiB in storage per spindle" what are you referring to as spindle?
Disk. It boils down to how much data you want to put under one read/write stream.
The other thing is that these days 1.5TB disks are the best bang for the buck in terms of storage/cost. So maybe that's something to consider: limit disk usage initially and expand later as you need.
Even better if your HBA can support that; if not, then mdadm (you have lots of CPU, right?), and make sure you understand recarving/reshaping before you do the final design. Refactoring filers with large quantities of data is no fun if you can't reshape and grow.
- KB
From: Karanbir Singh mail-lists@karan.org
On 01/07/2010 02:30 PM, Boris Epstein wrote:
KB, thanks. When you say "dont go over 1 TiB in storage per spindle" what are you referring to as spindle?
disk. it boils down to how much data do you want to put under one read/write stream.
the other thing is that these days 1.5TB disks are the best bang-for-the-buck in terms of storage/cost. So maybe thats something to consider, and limit disk usage down initially - expand later as you need.
Even better if your hba can support that, if not then mdadm ( have lots of cpu right ? ), and make sure you understand recarving / reshaping before you do the final design. Refactoring filers with large quantities of data is no fun if you cant reshape and grow.
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
JD
On 1/7/2010 10:54 AM, John Doe wrote:
From: Karanbir Singhmail-lists@karan.org
On 01/07/2010 02:30 PM, Boris Epstein wrote:
KB, thanks. When you say "dont go over 1 TiB in storage per spindle" what are you referring to as spindle?
disk. it boils down to how much data do you want to put under one read/write stream.
the other thing is that these days 1.5TB disks are the best bang-for-the-buck in terms of storage/cost. So maybe thats something to consider, and limit disk usage down initially - expand later as you need.
Even better if your hba can support that, if not then mdadm ( have lots of cpu right ? ), and make sure you understand recarving / reshaping before you do the final design. Refactoring filers with large quantities of data is no fun if you cant reshape and grow.
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
I remember rumors about the early 2TB Seagates.
Personally, I won't RAID SATA drives over 500GB unless they're enterprise-level ones with limits on how long the drive takes before it reports a problem back to the host when it has a read error.
Which should also take care of the reliability issue to a large degree.
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
I remember rumors about the early 2TB Seagates.
Personally, I won't RAID SATA drives over 500GB unless they're enterprise-level ones with a limit on how long the drive keeps retrying before it reports a read error back to the host.
Which should also take care of the reliability issue to a large degree.
An often overlooked issue is the rebuild time with Linux software RAID and all hardware RAID controllers I have seen. On large drives the times are so long, simply because of the sheer size, that if the array is degraded you are exposed for the entire rebuild. ZFS's resilver addresses this about as well as you can by only copying actual data.
With this in mind, it's wise to consider how you develop the redundancy into the solution...
On Thu, Jan 07, 2010 at 05:28:34PM +0000, Joseph L. Casale wrote:
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
I remember rumors about the early 2TB Seagates.
Personally, I won't RAID SATA drives over 500GB unless they're enterprise-level ones with a limit on how long the drive keeps retrying before it reports a read error back to the host.
Which should also take care of the reliability issue to a large degree.
An often overlooked issue is the rebuild time with Linux software RAID and all hardware RAID controllers I have seen. On large drives the times are so long, simply because of the sheer size, that if the array is degraded you are exposed for the entire rebuild. ZFS's resilver addresses this about as well as you can by only copying actual data.
With this in mind, it's wise to consider how you develop the redundancy into the solution...
Very true... especially with 1TB+ drives you definitely are crazy to run anything other than RAID-6.
Lately we've been buying 24 bay systems from Silicon Mechanics, installing Solaris 10 and running RAID-Z2 + SSD for L2ARC and ZIL. Makes for great NFS storage...
The next release of Solaris 10 should have RAID-Z3 which might be better for the >1TB drives out there.
(You can of course do this with OpenSolaris as well and something similar on CentOS albeit not with ZFS).
When we need a little higher level of HA and "Enterprise-ness" we go NetApp. Just. Works. :)
Ray
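A minimal sketch of the pool layout Ray describes, with hypothetical Solaris device names: a RAID-Z2 data vdev plus SSDs for L2ARC and a dedicated ZIL, exported over NFS directly from ZFS.

    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
    zpool add tank cache c2t0d0          # SSD read cache (L2ARC)
    zpool add tank log c2t1d0            # SSD intent log (ZIL)
    zfs create tank/export
    zfs set sharenfs=on tank/export      # serve it over NFS straight from ZFS
    zpool status tank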
On 01/07/2010 05:28 PM, Joseph L. Casale wrote:
An often overlooked issue is the rebuild time with Linux software RAID and all hardware RAID controllers I have seen. On large drives the times are so long, simply because of the sheer size, that if the array is degraded you are exposed for the entire rebuild.
As a point of reference: on a SIL3512 SATA interface with mdraid RAID-1, 2 x 1 TB disks take just under 6 hrs to resync under moderate load (this machine is a mailserver off a 5mbps link to the internet, and rarely hits > 0.3 load with normal usage).
The same machine with zero load and the mdadm resync rate bumped to 800000K/sec takes just over 3 hrs 42 min, but don't expect to be able to do anything with the machine during this time.
( I've just done these two syncs today! )
Disks are all SAMSUNG HD103UJ, and are at 83% of total capacity used.
- KB
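For reference, the resync ceiling KB mentions is normally raised through the md sysctls (values are in KiB/s; the numbers here just mirror his example):

    sysctl dev.raid.speed_limit_max            # show the current ceiling
    sysctl -w dev.raid.speed_limit_min=50000
    sysctl -w dev.raid.speed_limit_max=800000
    cat /proc/mdstat                           # reports the live resync speed and ETA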
On 1/7/2010 12:28 PM, Joseph L. Casale wrote:
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
I remember rumors about the early 2TB Seagates.
Personally, I won't RAID SATA drives over 500GB unless they're enterprise-level ones with a limit on how long the drive keeps retrying before it reports a read error back to the host.
Which should also take care of the reliability issue to a large degree.
An often overlooked issue is the rebuild time with Linux software RAID and all hardware RAID controllers I have seen. On large drives the times are so long, simply because of the sheer size, that if the array is degraded you are exposed for the entire rebuild. ZFS's resilver addresses this about as well as you can by only copying actual data.
With this in mind, it's wise to consider how you develop the redundancy into the solution...
Yah, RAID-5 is a bad idea anymore with the large drive sizes. RAID-6 or RAID-10 is a far better choice.
I prefer RAID-10 because the rebuild time is based on the size of a drive pair, not the entire array.
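A hedged mdadm sketch of the two layouts being compared, with hypothetical devices; a degraded RAID-6 rebuild touches every member, while a RAID-10 rebuild only re-copies the failed drive's mirror partner:

    mdadm --create /dev/md0 --level=6  --raid-devices=6 /dev/sd[b-g]
    mdadm --create /dev/md1 --level=10 --raid-devices=6 /dev/sd[h-m]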
On Fri, Jan 08, 2010 at 11:25:08AM -0500, Thomas Harold wrote:
On 1/7/2010 12:28 PM, Joseph L. Casale wrote:
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
I remember rumors about the early 2TB Seagates.
Personally, I won't RAID SATA drives over 500GB unless they're enterprise-level ones with a limit on how long the drive keeps retrying before it reports a read error back to the host.
Which should also take care of the reliability issue to a large degree.
An often overlooked issue is the rebuild time with Linux software RAID and all hardware RAID controllers I have seen. On large drives the times are so long, simply because of the sheer size, that if the array is degraded you are exposed for the entire rebuild. ZFS's resilver addresses this about as well as you can by only copying actual data.
With this in mind, it's wise to consider how you develop the redundancy into the solution...
Yah, RAID-5 is a bad idea anymore with the large drive sizes. RAID-6 or RAID-10 is a far better choice.
I prefer RAID-10 because the rebuild time is based on the size of a drive pair, not the entire array.
I have mixed feelings on RAID10... I like the extra speed it gives (especially more IOPS with ZFS), but at the same time, if you lose one drive, you're then _one_ drive failure away from losing your entire array. Of course, you'd have to be unlucky and have that drive failure occur on the other member of the mirror set that already suffered a failure....
Maybe doing three drives per RAID1 set would make me feel better (but waste a lot of space) :)
This is a good read:
http://queue.acm.org/detail.cfm?id=1670144
Ray
From: Thomas Harold thomas-lists@nybeta.com
Yah, RAID-5 is a bad idea anymore with the large drive sizes. RAID-6 or RAID-10 is a far better choice.
I prefer RAID-10 because the rebuild time is based on the size of a drive pair, not the entire array.
Anyone tested RAID 50 and/or 60 arrays?
"RAID-50 – Striping of Distributed Parity Arrays: RAID 50 is a combination of RAID 5 and RAID 0. A RAID 5 set must have at least three disks. RAID 50 strips data across each RAID 5 subset. RAID 50 provides a higher degree of fault tolerance since 1 drive per RAID 5 set may fail without data being lost. A performance increase over RAID 5 may be realized depending on the configuration due to fewer disks reads per parity calculation." "RAID-60 – Striping of Dual Parity Arrays: RAID 60 is striping over more than one span of physical disks that are configured as a RAID 6. RAID 60 strips data across each RAID 6 subset. RAID 60 provides a higher degree of fault tolerance since 2 drives per RAID 6 set may fail without data being lost. A performance increase over RAID 6 may be realized depending on the configuration due to fewer disks reads per parity calculation."
Supposed to be safer in big arrays where the big number of disks means higher possibility of failure...
JD
John Doe wrote:
From: Thomas Harold thomas-lists@nybeta.com
Yah, RAID-5 is a bad idea anymore with the large drive sizes. RAID-6 or RAID-10 is a far better choice.
I prefer RAID-10 because the rebuild time is based on the size of a drive pair, not the entire array.
Anyone tested RAID 50 and/or 60 arrays?
I've been running RAID 50 for years on my arrays, the SE from the vendor likes to call it RAID 500 though because of the additional layer of striping that they do. Likewise with RAID 10 he prefers to call it RAID 100.
http://www.techopsguys.com/2009/11/24/81000-raid-arrays/
Haven't tried RAID 60 yet, but will be able to in a couple weeks. I'm not planning on using it, since RAID 6 is lower performing than RAID 5, and due to the architecture I don't need to be concerned about a double disk failure during a rebuild (a major earthquake is probably just as likely).
http://www.techopsguys.com/2009/11/20/enterprise-sata-disk-reliability/
nate
On Thu, Jan 7, 2010 at 12:01 PM, Thomas Harold thomas-lists@nybeta.com wrote:
On 1/7/2010 10:54 AM, John Doe wrote:
From: Karanbir Singh mail-lists@karan.org
On 01/07/2010 02:30 PM, Boris Epstein wrote:
KB, thanks. When you say "don't go over 1 TiB in storage per spindle," what are you referring to as a spindle?
Disk. It boils down to how much data you want to put under one read/write stream.
The other thing is that these days 1.5TB disks are the best bang-for-the-buck in terms of storage/cost. So maybe that's something to consider: limit disk usage initially and expand later as you need.
Even better if your HBA can support that; if not, then mdadm (you have lots of CPU, right?). Make sure you understand re-carving / reshaping before you do the final design; refactoring filers with large quantities of data is no fun if you can't reshape and grow.
I also heard that disks above 1TB might have reliability issues. Maybe it changed since then...
I remember rumors about the early 2TB Seagates.
Personally, I won't RAID SATA drives over 500GB unless they're enterprise-level ones with a limit on how long the drive keeps retrying before it reports a read error back to the host.
I'm with you on that one. We currently use RAIDZ2 to allow us to lose 2 drives in our storage pools, and will definitely move to RAIDZ3 at some point down the road. Luckily for us ZFS re-silvers just the blocks that contain data / parity when a failure occurs, so a disk failure is usually remedied in an hour or two (we devote two disks as spares).
- Ryan -- http://prefetch.net
At Thu, 7 Jan 2010 09:30:17 -0500 CentOS mailing list centos@centos.org wrote:
On Thu, Jan 7, 2010 at 9:27 AM, Karanbir Singh mail-lists@karan.org wrote:
On 01/06/2010 09:35 PM, Boris Epstein wrote:
Hello everyone,
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
I would recommend: don't homebrew. Get a vendor to build you a 3/4U box, get a couple of quad-core CPUs in, and enough RAM to do your in-use buffers. Also don't go over 1 TiB in storage per spindle if you want to get even relatively reasonable performance (even when in use as a filer box).
Not long back, I had the chance to do some performance metrics on a dual Areca-16xx hosted 24x1TiB disk setup. We tested it under various loads running CentOS-5.4/x86_64, and it consistently outperformed the Sun Thumper boxes while coming in at about 1/4 of the price.
The other thing to keep in mind is to estimate and prove the CPU processing capability and network capability you are going to need out of this machine, and don't skimp on that. Don't get overly focused on just the HBA and disk metrics.
- KB
KB, thanks. When you say "don't go over 1 TiB in storage per spindle," what are you referring to as a spindle?
Generally 'spindle' == 'physical disk drive'.
Boris.
On 1/6/2010 2:35 PM, Boris Epstein wrote:
we are trying to set up some storage servers to run under Linux
You should also consider FreeBSD 8.0, which has the newest version of ZFS up and running stably on it. I use Linux for most server tasks, but for big storage, Linux just doesn't have anything like this yet.
Yeah, yeah, btrfs someday, I know. But today, ZFS is where it's at. It's the easiest way of managing large pools of storage I know of short of a Drobo, and Drobos have big problems of their own.
I'm not recommending OpenSolaris on purpose. For the last few years, it was the only stable production-quality implementation of ZFS, but with FreeBSD 8.0, it just lost that advantage. I think, as a Linux fan, you will be happier with FreeBSD than OpenSolaris.
storage volume would be in the range specified: 8-15 TB.
That puts you right on the edge of workability with 32-bit hardware. ext3's limit on 32-bit is 8 TB, and you can push it to 16 TB by switching to XFS or JFS. Best to use 64-bit hardware if you can.
Warren Young wrote:
On 1/6/2010 2:35 PM, Boris Epstein wrote:
we are trying to set up some storage servers to run under Linux
You should also consider FreeBSD 8.0, which has the newest version of ZFS up and running stably on it. I use Linux for most server tasks, but for big storage, Linux just doesn't have anything like this yet.
http://security.freebsd.org/advisories/FreeBSD-SA-10:03.zfs.asc
Nothing really big but it does kinda leave doubts...interesting that FreeBSD has absorbed pf and zfs and now claims to be twice as fast as Linux for mysql/postgresql workloads. Certainly sounds very different from the FreeBSD 4.4 that I knew.
FreeBSD may now have ZFS support but it does not look quite the same as it does on Solaris/OpenSolaris.
I'm not recommending OpenSolaris on purpose. For the last few years, it was the only stable production-quality implementation of ZFS, but with FreeBSD 8.0, it just lost that advantage. I think, as a Linux fan, you will be happier with FreeBSD than OpenSolaris.
Serious system administrators are not Linux fans I don't think. I tend to want to use the right tool for the job, like OpenBSD for firewalling, for example. I don't know about you, but I find pkg on OpenSolaris to be more akin to yum or apt than ports, and then there is always Nexenta if I really want a complete GNU userland and apt/dpkg. I could not find out much about ZFS on FreeBSD. Its man page is just a copy of the Solaris one. Does it support direct sharing/exporting as nfs/cifs/iscsi like it does on Solaris/OpenSolaris? Does it support using ZFS for booting and boot environments and a related upgrade system?
Nice that FreeBSD has improved its ZFS support. I remember one person dissing ZFS and pointing to vinum as an alternative, but then maybe he did not know what he was talking about. However, there certainly is a lot more on vinum than there is on ZFS in the FreeBSD manual.
storage volume would be in the range specified: 8-15 TB.
That puts you right on the edge of workability with 32-bit hardware. ext3's limit on 32-bit is 8 TB, and you can push it to 16 TB by switching to XFS or JFS. Best to use 64-bit hardware if you can.
Probably XFS if you want data guarantees on anything that is not a hardware RAID card with BBU cache, since JFS does not support barriers yet.
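A small sketch, assuming /dev/md0 is the array and no battery-backed write cache is present, so barriers are wanted (they are the XFS default on recent kernels, but it does no harm to be explicit):

    mkfs.xfs /dev/md0
    mount -o barrier /dev/md0 /srv/storage
    xfs_info /srv/storage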
On 01/08/2010 01:01 AM, Christopher Chan wrote:
That puts you right on the edge of workability with 32-bit hardware. ext3's limit on 32-bit is 8 TB, and you can push it to 16 TB by switching to XFS or JFS. Best to use 64-bit hardware if you can.
Probably XFS if you want data guarantees on anything that is not a hardware RAID card with BBU cache, since JFS does not support barriers yet.
And let's not forget the presence of ext4 in C5.
- KB
On 1/7/2010 6:01 PM, Christopher Chan wrote:
I'm not recommending OpenSolaris on purpose.
Serious system administrators are not Linux fans I don't think.
I think I must have been sent back in time, say to 1997 or so, because I can't possibly be reading this in 2010. I base this on the fact that your statement logically means there are no serious Linux sysadmins, which is of course so much hooey that no one believes this any more in the time I come from. Therefore, I must have been sent far enough back in time that such statements were still uttered with complete seriousness.
I guess the other possibility is that someone's gatewayed a Usenet advocacy group to this list.
I find pkg on OpenSolaris to be more akin to yum or apt than ports
In some ways, sure. Ports is definitely a different way of doing things, though, I think, not a bad one.
There are several areas where OpenSolaris' package system falls down:
1. No free updates. Even if you just want security fixes, you have to buy a support contract. (If you think this is reasonable, why are you here on the CentOS list, a place for discussing a solution to a different but similar problem?)
2. There is no upgrade path from release to release other than "reinstall", and releases are spaced just 6 months apart. Between this and the previous problem, it means I have to reinstall my server every 6 months to keep up to date if I don't want to buy a support contract. Those serious sysadmins where you come from like this sort of thing? In my world, we prefer OSes with long term support so we can stay current on a release for years at a time.
3. The main package repo is pretty sparse. If you want anything even a little bit nonstandard you end up downloading sources from the Internet and compiling by hand, which may not even succeed since Solaris is down in the third tier or so of popularity these days. At least with FreeBSD's ports, you're pretty much guaranteed that it will build and install with "sudo make install clean", even chasing down dependencies for you automatically.
4. At least back in the 2008.05 and 2008.11 days when I last tried to really use OpenSolaris, I found IPS to be quite immature. I managed, twice, to render a machine unbootable simply by removing packages I thought I didn't need, using the GUI package manager. No warnings, just boot...bang. Now maybe I'm being unrealistic, but I would think one of the basic requirements for a package manager is that it know enough about dependencies to refuse to let me uninstall core system components.
After discovering all that, I'm afraid I rather lost interest in trying to make serious use of OpenSolaris. I keep a VM of it around merely to test compatibility with a free software project I maintain. I won't install it on anything critical now, not without taking the time to do a complete reeval of it, anyway. It's been a year...maybe it's time.
and then there is always nexenta if I really want a complete GNU userland and apt/dpkg.
How many different machines have you tried it on? Perhaps you have been lucky, and have found that it installs on everything you want it to run on.
In my experience, both NCP and NexentaStor made me jump through quite a few hoops to find a hardware configuration they were happy with. Even after I got them working, neither seemed valuable enough to bother sticking with them, compared to OSes I already know and trust to just run.
Does it support direct sharing/exporting as nfs/cifs/iscsi
NFS, yes, that's how I'm using it.
CIFS, no, as there is no CIFS support in FreeBSD's kernel. Of course, you can always just use Samba.
iSCSI, no, because there isn't yet any iSCSI serving support in FreeBSD of any kind. Since I didn't want my ZFS pools to be directly attached to another machine, but rather shared among multiple machines in traditional file-server manner, this didn't cause a problem for me.
Let me bounce this ball back in your court: how about AFS, for the Macs in your organization? ZFS has no direct support for it on either platform, but at least on FreeBSD and most Linuxes, it's a supported package, available on demand, already preconfigured for that system. All you have to do is do local customizations to the configuration, set it to start automatically, and you're done. With OpenSolaris, it's a fully manual process.
Does it support using ZFS for booting
Not as part of the OS installer, but it can be done:
http://lulf.geeknest.org/blog/freebsd/Setting_up_a_zfs-only_system/
This doesn't interest me because it shares the same limitation as on Solaris, which is that it will only work with a mirror. I don't want to dedicate two disks just to the OS if I want a RAID-Z pool for actual data.
My solution for high root FS reliability was to put it on a CF card. In addition to being solid state, it has a few side benefits:
- It lets me use an otherwise unused ATA connection.
- It's small enough that I can mount it in otherwise dead space in the chassis, instead of taking up a precious disk bay.
Once I got the system installed, I moved some top-level trees into dedicated ZFS pools, so my root filesystem is now quite small and rarely touched.
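A hedged FreeBSD sketch of that kind of split, with hypothetical pool, dataset, and disk names; the data lives in its own RAID-Z vdev while the small root stays on the CF card:

    zpool create tank raidz da1 da2 da3 da4
    zfs create -o mountpoint=/mnt/home.new tank/home
    (cd /usr/home && tar cf - .) | (cd /mnt/home.new && tar xpf -)   # copy the existing tree
    zfs set mountpoint=/usr/home tank/home                           # then swap the mountpoint over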
lot more on vinum than there is on zfs in the FreeBSD manual.
I did most of my FreeBSD ZFS setup using the Solaris ZFS Admin Guide PDF. Everything it asked me to do worked fine on FreeBSD.
Yes, I'm sure you can point to places where a thing will work on Solaris and not on FreeBSD, but I haven't found anything that actually *matters* to me yet.
Warren Young wrote:
On 1/7/2010 6:01 PM, Christopher Chan wrote:
I'm not recommending OpenSolaris on purpose.
Serious system administrators are not Linux fans I don't think.
I think I must have been sent back in time, say to 1997 or so, because I can't possibly be reading this in 2010. I base this on the fact that your statement logically means there are no serious Linux sysadmins,
Huh? How did YOU get from what I wrote to 'no serious Linux sysadmins'? I used to use Linux for everything too. Grow up.
which is of course so much hooey that no one believes this any more in the time I come from. Therefore, I must have been sent far enough back in time that such statements were still uttered with complete seriousness.
Given that there are others besides me on this list that have pointed elsewhere other than Linux for stuff like firewalling and I suspect probably throughout the history of the list, I think you have a serious case of tinted corneas.
I guess the other possibility is that someone's gatewayed a Usenet advocacy group to this list.
I find pkg on OpenSolaris to be more akin to yum or apt than ports
In some ways, sure. Ports is definitely a different way of doing things, though, I think, not a bad one.
Right.
There are several areas where OpenSolaris' package system falls down:
- No free updates. Even if you just want security fixes, you have to
buy a support contract. (If you think this is reasonable, why are you here on the CentOS list, a place for discussing a solution to a different but similar problem?)
I do not know what you are talking about. No free updates? OpenSolaris happens to be open source FYI. Maybe you should first learn about something before you start dissing it. Fanboys like you are what give Linux a bad name.
Even Solaris has free updates, and I have been able to go and download patches (last I did that, anyway). But recently I brought this up about Solaris on the OpenSolaris IRC channel, and there are tools that automate all this; again, you do not need a support contract. But good luck trying to file bug reports or get support.
- There is no upgrade path from release to release other than
"reinstall", and releases are spaced just 6 months apart. Between this and the previous problem, it means I have to reinstall my server every 6 months to keep up to date if I don't want to buy a support contract. Those serious sysadmins where you come from like this sort of thing? In my world, we prefer OSes with long term support so we can stay current on a release for years at a time.
??? What on earth are you on about? Are you talking about Ubuntu? Fedora? ???
I have never "reinstalled" my two OpenSolaris servers and I have upgraded from one release to another smoothly.
- The main package repo is pretty sparse. If you want anything even a
little bit nonstandard you end up downloading sources from the Internet and compiling by hand, which may not even succeed since Solaris is down in the third tier or so of popularity these days. At least with FreeBSD's ports, you're pretty much guaranteed that it will build and install with "sudo make install clean", even chasing down dependencies for you automatically.
Hang on. I thought we were talking about OpenSolaris? I don't use Solaris because I don't like its 'dual' package management system.
- At least back in the 2008.05 and 2008.11 days when I last tried to
really use OpenSolaris, I found IPS to be quite immature. I managed, twice, to render a machine unbootable simply by removing packages I thought I didn't need, using the GUI package manager. No warnings, just boot...bang. Now maybe I'm being unrealistic, but I would think one of the basic requirements for a package manager is that it know enough about dependencies to refuse to let me uninstall core system components.
I like the 'I thought I didn't need' part. Especially since OpenSolaris gives you the ability to clone boot environments to experiment on so that if you do mess up, you can just boot back into a working boot environment. Sorry, no sympathy from me on this score if that is what you are going to base your reason not to use OpenSolaris at all.
and then there is always nexenta if I really want a complete GNU userland and apt/dpkg.
How many different machines have you tried it on? Perhaps you have been lucky, and have found that it installs on everything you want it to run on.
Nexenta? One. A Dell. alpha5 too. Had to drop it since the Nexenta guys were short handed and did not have an update to a release that had an iscsi fix that was necessary for exporting to Windows. I make a point to build hardware that works with the matrix of possible operating systems I will use.
In my experience, both NCP and NexentaStor made me jump through quite a few hoops to find a hardware configuration they were happy with. Even after I got them working, neither seemed valuable enough to bother sticking with them, compared to OSes I already know and trust to just run.
No comment. I have not had a reason to want to commit time and energy try Nexenta again after alpha5.
Does it support direct sharing/exporting as nfs/cifs/iscsi
NFS, yes, that's how I'm using it.
CIFS, no, as there is no CIFS support in FreeBSD's kernel. Of course, you can always just use Samba.
iSCSI, no, because there isn't yet any iSCSI serving support in FreeBSD of any kind. Since I didn't want my ZFS pools to be directly attached to another machine, but rather shared among multiple machines in traditional file-server manner, this didn't cause a problem for me.
Just pointing out that FreeBSD cannot completely take on all roles that a person may use OpenSolaris for.
Let me bounce this ball back in your court: how about AFS, for the Macs in your organization? ZFS has no direct support for it on either platform, but at least on FreeBSD and most Linuxes, it's a supported package, available on demand, already preconfigured for that system.
AFS? Hmm, all our Macs are standalone. Too bad. Anyway, I probably would have gone NFS if we were to roll out Macs everywhere instead of the Music room. Or given GlusterFS a shot. I have not had time to look at AFS.
All you have to do is do local customizations to the configuration, set it to start automatically, and you're done. With OpenSolaris, it's a fully manual process.
Does it support using ZFS for booting
Not as part of the OS installer, but it can be done:
http://lulf.geeknest.org/blog/freebsd/Setting_up_a_zfs-only_system/
This doesn't interest me because it shares the same limitation as on Solaris, which is that it will only work with a mirror. I don't want to dedicate two disks just to the OS if I want a RAID-Z pool for actual data.
My solution for high root FS reliability was to put it on a CF card. In addition to being solid state, it has a few side benefits:
It lets me use an otherwise unused ATA connection.
It's small enough that I can mount it in otherwise dead space in the
chassis, instead of taking up a precious disk bay.
Once I got the system installed, I moved some top-level trees into dedicated ZFS pools, so my root filesystem is now quite small and rarely touched.
But still no mirror. Not that it matters I guess in your case.
lot more on vinum than there is on zfs in the FreeBSD manual.
I did most of my FreeBSD ZFS setup using the Solaris ZFS Admin Guide PDF. Everything it asked me to do worked fine on FreeBSD.
Yes, I'm sure you can point to places where a thing will work on Solaris and not on FreeBSD, but I haven't found anything that actually *matters* to me yet.
Just because it does not matter to you does not mean you can say that FreeBSD can replace or be used in lieu of OpenSolaris when you do not know what is wanted.
On Fri, Jan 08, 2010 at 12:49:30PM +0800, Christopher Chan wrote:
Warren Young wrote:
On 1/7/2010 6:01 PM, Christopher Chan wrote:
...
zfs on *solaris *bsd is getting off topic, if you need to fight, please take that somewhere else.
Thanks,
Tru
Warren Young wrote:
On 1/6/2010 2:35 PM, Boris Epstein wrote:
we are trying to set up some storage servers to run under Linux
<snip>
Serious system administrators are not Linux fans I don't think. I tend
<snip> Dunno why you say that. Lessee, both google and maybe amazon run Linux; meanwhile, AT&T, where I worked for a couple of years, Trustwave, a root CA that I worked for earlier this year, and here at the US NIH, we run Linux.
And I do think of myself, my co-worker, and our manager as "serious systems administrators".
mark
m.roth@5-cent.us wrote:
Dunno why you say that. Lessee, both google and maybe amazon run Linux; meanwhile, AT&T, where I worked for a couple of years, Trustwave, a root CA that I worked for earlier this year, and here at the US NIH, we run Linux.
Since this is a storage thread... back in 2004 I was told by an EMC rep that the Symmetrix ran Linux (at least at the time, probably still does), up to 64 controllers or something. While at least at the time their lower end Clariion arrays ran Windows.
My own 3PAR array, which manages hundreds of terabytes, runs Debian, and their low end boxes (I had a 9TB system at my last company) ran Debian as well.
A lot of network equipment (SAN+LAN+WAN) these days runs Linux as well.
In most cases, though, Linux is used as a control interface; most of these products don't route data through the OS (less efficient).
Lastly, if you're thinking about ZFS, check this post out; I found it pretty interesting:
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg18898.html
nate
On 1/8/2010 10:09 AM, nate wrote:
m.roth@5-cent.us wrote:
Dunno why you say that. Lessee, both google and maybe amazon run Linux; meanwhile, AT&T, where I worked for a couple of years, Trustwave, a root CA that I worked for earlier this year, and here at the US NIH, we run Linux.
Since this is a storage thread... back in 2004 I was told by an EMC rep that the Symmetrix ran Linux (at least at the time, probably still does), up to 64 controllers or something. While at least at the time their lower end Clariion arrays ran Windows.
My own 3PAR array, which manages hundreds of terabytes, runs Debian, and their low end boxes (I had a 9TB system at my last company) ran Debian as well.
A lot of network equipment (SAN+LAN+WAN) these days runs Linux as well.
In most cases, though, Linux is used as a control interface; most of these products don't route data through the OS (less efficient).
Lastly, if you're thinking about ZFS, check this post out; I found it pretty interesting:
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg18898.html
These things are a little out of my league, but people in some other parts of the company seem to like them: http://www-03.ibm.com/systems/storage/disk/xiv/index.html I think the idea is that you get a rack full of drives to start and pay more as you use the space.
On Fri, Jan 08, 2010 at 11:06:10AM -0600, Les Mikesell wrote:
On 1/8/2010 10:09 AM, nate wrote:
m.roth@5-cent.us wrote:
Dunno why you say that. Lessee, both google and maybe amazon run Linux; meanwhile, AT&T, where I worked for a couple of years, Trustwave, a root CA that I worked for earlier this year, and here at the US NIH, we run Linux.
Since this is a storage thread... back in 2004 I was told by an EMC rep that the Symmetrix ran Linux (at least at the time, probably still does), up to 64 controllers or something. While at least at the time their lower end Clariion arrays ran Windows.
My own 3PAR array, which manages hundreds of terabytes, runs Debian, and their low end boxes (I had a 9TB system at my last company) ran Debian as well.
A lot of network equipment (SAN+LAN+WAN) these days runs Linux as well.
In most cases, though, Linux is used as a control interface; most of these products don't route data through the OS (less efficient).
Lastly, if you're thinking about ZFS, check this post out; I found it pretty interesting:
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg18898.html
These things are a little out of my league, but people in some other parts of the company seem to like them: http://www-03.ibm.com/systems/storage/disk/xiv/index.html I think the idea is that you get a rack full of drives to start and pay more as you use the space.
Out of curiosity, any idea what a full cabinet of one of these runs?
The product page doesn't seem to really describe the pricing strategy you mention though... sounds intriguing and I know Red Hat began using XIV storage (mentioned at Summit).
Ray
On 1/8/2010 11:41 AM, nate wrote:
Ray Van Dolson wrote:
Out of curiosity, any idea what a full cabinet of one of these runs?
Over $1M pretty easily, probably close/more than $2M.
I think you are confusing it with something else. Somewhere I saw that these list around $400k for 80TB - but the way IBM pricing usually works is that you get volume discounts or you give them someone else's quote to get the price down.
Les Mikesell wrote:
On 1/8/2010 11:41 AM, nate wrote:
Ray Van Dolson wrote:
Out of curiosity, any idea what a full cabinet of one of these runs?
Over $1M pretty easily, probably close/more than $2M.
I think you are confusing it with something else. Somewhere I saw that these list around $400k for 80TB - but the way IBM pricing usually works is that you get volume discounts or you give them someone else's quote to get the price down.
Depends on the config
# Up to 21 CPUs
# Up to 72 Gbps internal switching capacity
# Up to 120 GB cache memory
# Up to 240 Gbps cache to disk bandwidth
# Up to 180 VHDSR (Very High Density Slower Rotational), 1 TB, 7200 rpm disks
# Up to 24 Fiber Channel ports offering 4 Gbps, 2 Gbps or 1 Gbps multi-mode and single-mode support
# Up to 6 iSCSI ports offering iSCSI over Gigabit Ethernet connectivity
The poster asked for a fully loaded config, so I suspect the $1-2M price is reasonable with 21 CPUs and 120GB of cache.
For a system with so many fancy features I find it surprising that it only seems to go to 180 disks; it would be quite a bit more useful if it went to, say, 1000 disks (incrementally of course). Maybe that's down the road.
nate
On Fri, 2010-01-08 at 14:36 -0800, nate wrote:
Les Mikesell wrote:
On 1/8/2010 11:41 AM, nate wrote:
Ray Van Dolson wrote:
Out of curiosity, any idea what a full cabinet of one of these runs?
Over $1M pretty easily, probably close/more than $2M.
I think you are confusing it with something else. Somewhere I saw that these list around $400k for 80TB - but the way IBM pricing usually works is that you get volume discounts or you give them someone else's quote to get the price down.
Depends on the config
# Up to 21 CPUs
# Up to 72 Gbps internal switching capacity
# Up to 120 GB cache memory
# Up to 240 Gbps cache to disk bandwidth
# Up to 180 VHDSR (Very High Density Slower Rotational), 1 TB, 7200 rpm disks
# Up to 24 Fiber Channel ports offering 4 Gbps, 2 Gbps or 1 Gbps multi-mode and single-mode support
# Up to 6 iSCSI ports offering iSCSI over Gigabit Ethernet connectivity
---
Nate,
Just asking: are the fiber ports bidirectional or unidirectional? Can they support a bond that is bidirectional at 4 GB/s, or can they be trunked into 16 GB/s bidirectional? I need about 24 GB/s bandwidth sustained, yes, per second. Also, what kind of sparse-file I/O do you get? I see you stated multimode; some don't classify that as true bidirectional bonding.
John
JohnS wrote:
Just asking: are the fiber ports bidirectional or unidirectional? Can they support a bond that is bidirectional at 4 GB/s, or can they be trunked into 16 GB/s bidirectional? I need about 24 GB/s bandwidth sustained, yes, per second. Also, what kind of sparse-file I/O do you get? I see you stated multimode; some don't classify that as true bidirectional bonding.
24 gigabytes per second? Look to Fusion IO
http://www.fusionio.com/products/
Or TMS http://www.ramsan.com/products/products.htm
Depending on how much disk space you need and the nature of your application.
nate
On Fri, 2010-01-08 at 15:23 -0800, nate wrote:
JohnS wrote:
Just asking: are the fiber ports bidirectional or unidirectional? Can they support a bond that is bidirectional at 4 GB/s, or can they be trunked into 16 GB/s bidirectional? I need about 24 GB/s bandwidth sustained, yes, per second. Also, what kind of sparse-file I/O do you get? I see you stated multimode; some don't classify that as true bidirectional bonding.
24 gigabytes per second? Look to Fusion IO
http://www.fusionio.com/products/
Or TMS http://www.ramsan.com/products/products.htm
Depending on how much disk space you need and the nature of your application.
nate
--- Currently using the older model of this one [1] @ 4GB/s on the fiber. That's with bidirectional, both links at 4 GB/s. We're looking for something to scale to 24 if not 30. It is also in constant I/O wait of about 0.30. It has what I call an ROI (return on investment) of 5 mins and longer, which needs to be cut down greatly, to 30 secs to a min. The application supports 128 CPUs; it's a PACS application that runs in almost real time.
The bad thing is that just throwing money at storage is not going to work; we have to have a 30-90 day POC in house, period.
[1] http://store.shopfujitsu.com/fpc/Ecommerce/buildseriesbean.do?series=ETDX440...
John
JohnS wrote:
Currently using the older model of this one [1] @ 4GB/s on the fiber.
You sound pretty confused, there's no way in hell a Fujitsu DX440 is going to sustain 4 gigabytes/second, maybe 4 Gigabits/second (~500MB/s)
That's with bidirectional, both links at 4 GB/s. We're looking for something to scale to 24 if not 30. It is also in constant I/O wait of about 0.30. It has what I call an ROI (return on investment) of 5 mins and longer, which needs to be cut down greatly, to 30 secs to a min. The application supports 128 CPUs; it's a PACS application that runs in almost real time.
Your ROI of 5 minutes doesn't make any sense to me.
The bad thing is that just throwing money at storage is not going to work; we have to have a 30-90 day POC in house, period.
The only way you'll get that is if you can clearly define your requirements and commit to buying a system if it meets those requirements. If you find out at the last minute that the requirements you came up with were wrong, you're SOL, so be careful.
There are plenty of arrays on the market that can go 6-8x faster than the DX440, none of them will come close to even 10GBytes/second though.
While it is an IOPS benchmark, not a throughput benchmark, it still has some value; you can look at the performance of a decked-out DX440 here: http://www.storageperformance.org/results/a00010_Fujitsu_SPC1_executive_summ...
And compare it to other systems on the site. My own 3PAR T400 is rated to be about 6.5 times faster than the DX440, at least on the SPC-1 test, when fully loaded with 640 15k RPM drives (I use SATA drives exclusively).
nate
On Fri, 2010-01-08 at 16:08 -0800, nate wrote:
JohnS wrote:
Currently using the older model of this one [1] @ 4GB/s on the fiber.
You sound pretty confused, there's no way in hell a Fujitsu DX440 is going to sustain 4 gigabytes/second, maybe 4 Gigabits/second (~500MB/s)
G Bits per second
That's with bidirectional, both links at 4 GB/s. We're looking for something to scale to 24 if not 30. It is also in constant I/O wait of about 0.30. It has what I call an ROI (return on investment) of 5 mins and longer, which needs to be cut down greatly, to 30 secs to a min. The application supports 128 CPUs; it's a PACS application that runs in almost real time.
Your ROI of 5 minutes doesn't make any sense to me.
Ok, Job submission and completion is what I am getting at.
The bad thing is that just throwing money at storage is not going to work; we have to have a 30-90 day POC in house, period.
The only way you'll get that is if you can clearly define your requirements and commit to buying a system if it meets those requirements. If you find out at the last minute that the requirements you came up with were wrong, you're SOL, so be careful.
There are plenty of arrays on the market that can go 6-8x faster than the DX440, none of them will come close to even 10GBytes/second though.
While it is an IOPS benchmark, not a throughput benchmark, it still has some value; you can look at the performance of a decked-out DX440 here: http://www.storageperformance.org/results/a00010_Fujitsu_SPC1_executive_summ...
And compare it to other systems on the site. My own 3PAR T400 is rated to be about 6.5 times faster than the DX440, at least on the SPC-1 test, when fully loaded with 640 15k RPM drives (I use SATA drives exclusively).
nate
JohnS wrote:
On Fri, 2010-01-08 at 16:08 -0800, nate wrote:
JohnS wrote:
Currently using the older model of this one [1] @ 4GB/s on the fiber.
You sound pretty confused, there's no way in hell a Fujitsu DX440 is going to sustain 4 gigabytes/second, maybe 4 Gigabits/second (~500MB/s)
G Bits per second
Using 15K RPM drives, I can tell you that a 3PAR T400 (I'm very well versed in their products; fast, easy to use) can do 25.6 Gbits/second (3.2 gigabytes/second) sustained throughput: 640 drives, 48GB data cache.
If you were starting out at such a level I'm certain they would suggest you go with a T800 chassis, which would allow you to double that performance. The T400 and T800 are identical with the exception of the backplane (~10GBytes/sec vs ~46GBytes/sec), the number of controllers (max of 4 vs max of 8), and the number of disks (640 vs 1280).
Took a while to find a publicly available version of this, but see page 8 of this PDF: http://www.dsw.cz/files/DSW09_Polcom_Prague%2024092009v2.pdf
nate
On Fri, 2010-01-08 at 17:09 -0800, nate wrote:
Using 15K RPM drives, I can tell you that a 3PAR T400 (I'm very well versed in their products; fast, easy to use) can do 25.6 Gbits/second (3.2 gigabytes/second) sustained throughput: 640 drives, 48GB data cache.
If you were starting out at such a level I'm certain they would suggest you go with a T800 chassis, which would allow you to double that performance. The T400 and T800 are identical with the exception of the backplane (~10GBytes/sec vs ~46GBytes/sec), the number of controllers (max of 4 vs max of 8), and the number of disks (640 vs 1280).
Took a while to find a publicly available version of this, but see page 8 of this PDF: http://www.dsw.cz/files/DSW09_Polcom_Prague%2024092009v2.pdf
nate
--- Interesting link for info there. I found [1], and at the bottom of the page there are tidbits of info in PDFs on the different models. Any idea where I could get more info than that, like data sheets and case studies?
My main problem is bandwidth and workload. Another problem is that I am being told there is such a thing as triple disk mirroring; then I am being told that there is not, and that it is only RAID 1 with replication to another volume set, sorta like an LVM snapshot so to speak. Last thing: I am told GE is just rebranded Brocade. Thanks for the info & link.
[1] http://www.3par.com/library.html
John
JohnS wrote:
Interesting link for info there. I found [1], and at the bottom of the page there are tidbits of info in PDFs on the different models. Any idea where I could get more info than that, like data sheets and case studies?
Not online, at least. Note the "Confidential" stuff at the bottom of every page; I believe that data isn't supposed to be online. I have copies myself but don't post them directly.
They are more than happy to give this information to you directly, though. Not sure why they sort of try to protect it from public view, but it's a common practice; I can't recall ever seeing a data sheet on any storage array that showed its performance numbers, unlike network gear, which often is filled with performance-related information.
If you're referring to the E and S series arrays, they are roughly half the performance of the F and T series. I don't believe the E and S are being sold anymore as new systems; when the T came out, for example, it was only about 10-15% more than the S, and you get a ton more in performance, addressable capacity, and features than with the S (or E), making it fairly pointless to go with an S or an E. I also suggest against the F200 as well: their systems are at their best when you have more than two controllers. Any system can run on only two, so the extra online scalability comes at a small premium and gives good peace of mind, even if you never add the extra controllers.
My main problem is bandwidth and workload. Another problem is that I am being told there is such a thing as triple disk mirroring; then I am being told that there is not, and that it is only RAID 1 with replication to another volume set, sorta like an LVM snapshot so to speak. Last thing: I am told GE is just rebranded Brocade. Thanks for the info & link.
There absolutely is triple mirroring, and probably quadruple as well, but most systems don't support it. Synchronous (and/or asynchronous) replication is of course available as well. Triple mirroring really is a waste of resources when you're performing RAID at a sub-disk level like they (and some others: IBM XIV, Compellent, Xiotech) do. True data availability would come from synchronous replication (up to 130 miles away is the limit, I think; latency is the reason, ~1.3ms is the max) to another system.
I'm not sure what "GE" (makes me think of General Electric) is, but there are quite a few rebranded Brocade switches out there, and probably Qlogic as well.
Their latest code release, 2.3.1, just went GA yesterday and has been shipping on new boxes for a few weeks. It's the largest software update in the company's 10-year history, with some pretty impressive new things, some of which I can't talk about just yet, but they're more than happy to tell you about in person.
I've been waiting for over a year for it myself so am pretty excited, upgrading next weekend.
Getting kind of OT though; if you want more info, email me off-list. I'm always happy to talk about this kind of stuff; I love fancy technology.
nate
On Sat, 2010-01-09 at 07:14 -0800, nate wrote:
JohnS wrote:
Interesting link for info there. I found [1], and at the bottom of the page there are tidbits of info in PDFs on the different models. Any idea where I could get more info than that, like data sheets and case studies?
Not online, at least. Note the "Confidential" stuff at the bottom of every page; I believe that data isn't supposed to be online. I have copies myself but don't post them directly.
They are more than happy to give this information to you directly, though. Not sure why they sort of try to protect it from public view, but it's a common practice; I can't recall ever seeing a data sheet on any storage array that showed its performance numbers, unlike network gear, which often is filled with performance-related information.
If you're referring to the E and S series arrays, they are roughly half the performance of the F and T series. I don't believe the E and S are being sold anymore as new systems; when the T came out, for example, it was only about 10-15% more than the S, and you get a ton more in performance, addressable capacity, and features than with the S (or E), making it fairly pointless to go with an S or an E. I also suggest against the F200 as well: their systems are at their best when you have more than two controllers. Any system can run on only two, so the extra online scalability comes at a small premium and gives good peace of mind, even if you never add the extra controllers.
My main problem is bandwidth and workload. Another problem is that I am being told there is such a thing as triple disk mirroring; then I am being told that there is not, and that it is only RAID 1 with replication to another volume set, sorta like an LVM snapshot so to speak. Last thing: I am told GE is just rebranded Brocade. Thanks for the info & link.
There absolutely is triple mirroring, and probably quadruple as well, but most systems don't support it. Synchronous (and/or asynchronous) replication is of course available as well. Triple mirroring really is a waste of resources when you're performing RAID at a sub-disk level like they (and some others: IBM XIV, Compellent, Xiotech) do. True data availability would come from synchronous replication (up to 130 miles away is the limit, I think; latency is the reason, ~1.3ms is the max) to another system.
I'm not sure what "GE" (makes me think of General Electric) is, but there are quite a few rebranded Brocade switches out there, and probably Qlogic as well.
Their latest code release, 2.3.1, just went GA yesterday and has been shipping on new boxes for a few weeks. It's the largest software update in the company's 10-year history, with some pretty impressive new things, some of which I can't talk about just yet, but they're more than happy to tell you about in person.
I've been waiting for over a year for it myself so am pretty excited, upgrading next weekend.
Getting kind of OT though; if you want more info, email me off-list. I'm always happy to talk about this kind of stuff; I love fancy technology.
nate
--- Sure, will later on in the day.
Thanks,
John
Your ROI of 5 minutes doesn't make any sense to me.
Ok, Job submission and completion is what I am getting at.
ROI generally refers to the time an expense takes to pay off. Like, if buying $X worth of capital equipment will generate savings or additional income of $x over Y months, then Y is the ROI
Your job cycle time, well, that's highly dependent on what your jobs /are/, something that's well outside the scope of this list.
On Fri, 2010-01-08 at 17:53 -0800, John R Pierce wrote:
Your ROI of 5 minutes doesn't make any sense to me.
Ok, Job submission and completion is what I am getting at.
ROI generally refers to the time an expense takes to pay off. Like, if buying $X worth of capital equipment will generate savings or additional income of $x over Y months, then Y is the ROI
Funny, lol, hehe... ROI where I work comes in at a secondary level for certain things. We get ROI back in other ways and forms besides it being paid off in dollars. We are a not-for-profit organization.
Your job cycle time, well, that's highly dependent on what your jobs /are/, something that's well outside the scope of this list.
Well, not too far, you know. There seem to be a few people on this list with a little know-how about what I'm seeking info on. It really never occurred to me until this thread came to light.
Exactly what I am getting at is: you plunk down $1.5 million on a SAN box, it dam* well better do what it is capable of. Essentially, when an entity spends that kind of money, the return does not have to be in dollars; it has to come back as quicker job submission. Transaction time has to be cut in half or better.
John
JohnS wrote:
Just asking: are the fiber ports bidirectional or unidirectional? Can they support a bond that is bidirectional at 4 GB/s, or can they be trunked into 16 GB/s bidirectional? I need about 24 GB/s bandwidth sustained, yes, per second. Also, what kind of sparse-file I/O do you get? I see you stated multimode; some don't classify that as true bidirectional bonding.
Fiber is 4gbps (gigaBIT/sec) (or 1, 2, and now 8), and each FC link has two fibers, either of which can be used to transmit OR receive at a given time (i.e., each fiber is half duplex). MOST implementations use one fiber to read and the other to write. A 4gbps fiber can typically sustain 400MByte/sec read or write, and potentially 400MByte/sec read *and* write.
To hit 24GByte/sec, yeouch. The IO busses on most servers don't have that kind of bandwidth. A PCI-Express 2.0 x16 (16 lane) card has 8GB/sec peak burst rates. The QPI bus on a Xeon 5500 server is around 6GT/s peak for all IO including CPU->memory; if all transfers are 8 bytes (64 bits), that's 48GB/sec.
BTW, in fiber, singlemode vs multimode refers to the optical modulation on the fiber and has nothing directly to do with duplex or bonding. Single mode is more expensive and can transmit longer distances (dozens of kilometers), while multimode is cheaper but only suitable for relatively short distances (100s of meters). Most all fiber channel devices use replaceable SFP transceivers, so you can use either type of transceiver with the appropriate fiber type.
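Rough arithmetic for the 24 GByte/sec target, using the ~400 MByte/sec usable figure per 4 Gbit link quoted above (all numbers are ballpark assumptions only):

    echo $(( 24 * 1024 / 400 ))   # => 61, roughly the number of 4 Gbit FC links needed
    echo $(( 24 / 8 ))            # => 3, fully saturated PCIe 2.0 x16 slots at ~8 GByte/sec each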
On Fri, 2010-01-08 at 15:43 -0800, John R Pierce wrote:
JohnS wrote:
Just asking: are the fiber ports bidirectional or unidirectional? Can they support a bond that is bidirectional at 4 GB/s, or can they be trunked into 16 GB/s bidirectional? I need about 24 GB/s bandwidth sustained, yes, per second. Also, what kind of sparse-file I/O do you get? I see you stated multimode; some don't classify that as true bidirectional bonding.
Fiber is 4gbps (gigaBIT/sec) (or 1, 2, and now 8), and each FC link has two fibers, either of which can be used to transmit OR receive at a given time (i.e., each fiber is half duplex). MOST implementations use one fiber to read and the other to write. A 4gbps fiber can typically sustain 400MByte/sec read or write, and potentially 400MByte/sec read *and* write.
To hit 24GByte/sec, yeouch. The IO busses on most servers don't have that kind of bandwidth. A PCI-Express 2.0 x16 (16 lane) card has 8GB/sec peak burst rates. The QPI bus on a Xeon 5500 server is around 6GT/s peak for all IO including CPU->memory; if all transfers are 8 bytes (64 bits), that's 48GB/sec.
OK, what about the Dell R7** series on an i7? It's capable of at least maybe 16 Gbits per second? That 16 may be wrong though.
BTW, in fiber, singlemode vs multimode refers to the optical modulation on the fiber and has nothing directly to do with the duplex or bonding. single mode is more expensive, can transmit longer distances (dozens of kilometers), while multimode is cheaper but only suitable for relatively short distances (100s of meters). Most all fiber channel devices use replaceable SFP transceivers, so you can use either type of transceiver with the appropriate fiber type.
--- All that sounds correct.
I suggest you get a second-hand Sun X4500 if you're feeling cheap, http://www.sun.com/servers/x64/x4500/specs.xml. 48x 500G will do you nicely with some MD RAID.
Or you can go for the newer X4540 if you're feeling flush.
Regards, Iolaire
On 06/01/2010 22:35, Boris Epstein wrote:
Hello everyone,
This is not directly related to CentOS but still: we are trying to set up some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
Thanks.
Boris.
On Wed, Jan 6, 2010 at 10:35 PM, Boris Epstein borepstein@gmail.com wrote:
some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
I'm kind of partial to Areca RAID controllers; you can get up to 24 ports, so that can be as much as 20 TB (that's real-world terabytes, not hardware manufacturer's) in a RAID 6 with hot spare using 1000 GB drives.
BR Bent
Bent Terp wrote:
On Wed, Jan 6, 2010 at 10:35 PM, Boris Epstein borepstein@gmail.com wrote:
some storage servers to run under Linux - most likely CentOS. The storage volume would be in the range specified: 8-15 TB. Any recommendations as far as hardware?
I'm kind of partial to Areca RAID controllers; you can get up to 24 ports, so that can be as much as 20 TB (that's real-world terabytes, not hardware manufacturer's) in a RAID 6 with hot spare using 1000 GB drives.
I see that the Areca driver has finally made it into the mainline Linux kernel. But I wonder how things have improved from this particular case.
http://notemagnet.blogspot.com/2008/08/linux-disk-failures-areca-is-not-so.h...
Any comments? With 3ware lately not looking so good performance-wise, judging from comments I have heard on the list over the past few years, I wonder how Adaptec and Areca look now?
On Tue, Jan 12, 2010 at 08:07:17AM +0800, Christopher Chan wrote:
I see that the Areca driver has finally made it into the mainline Linux kernel. But I wonder how things have improved from this particular case.
http://notemagnet.blogspot.com/2008/08/linux-disk-failures-areca-is-not-so.h...
I can't speak to this, except to point out that it is almost 18 months old, which is quite a long time in kernel development space.
With the right incantation, one can call smartctl directly on a drive connected to a 3ware controller, no matter what kind of array it is in. (I believe you can even call it on a drive assigned as a hot spare.)
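For the record, the incantation looks something like this on a 9xxx-series card; the controller device and port numbers are examples, and older 6xxx/7xxx/8xxx cards use /dev/twe0 instead:

    smartctl -a -d 3ware,0 /dev/twa0         # full SMART output for the drive on port 0
    smartctl -t short -d 3ware,2 /dev/twa0   # run a short self-test on the drive on port 2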
Any comments? With 3ware lately not looking so good performance-wise, judging from comments I have heard on the list over the past few years, I wonder how Adaptec and Areca look now?
I've run an exclusively 3ware shop since I ditched my last aacraid machines a few years back. But with all their issues, I am definitely considering trying Areca on my next server that's not planned to be immediately mission-critical. (I wouldn't switch back to Adaptec unless I knew their interface tools, and especially their cli, had improved dramatically; the aaccli/afacli interfaces were simply atrocious.)
--keith
Keith Keller wrote:
On Tue, Jan 12, 2010 at 08:07:17AM +0800, Christopher Chan wrote:
I see that the Areca driver has finally made it into the mainline Linux kernel. But I wonder how things have improved from this particular case.
http://notemagnet.blogspot.com/2008/08/linux-disk-failures-areca-is-not-so.h...
I can't speak to this, except to point out that it is almost 18 months old, which is quite a long time in kernel development space.
Which is why I am asking.
With the right incantation, one can call smartctl directly on a drive connected to a 3ware controller, no matter what kind of array it is in. (I believe you can even call it on a drive assigned as a hot spare.)
Which is why I specifically said 'performance wise' as respects 3ware. I don't remember anything bad about 3ware stability wise or monitoring wise.
Any comments? With 3ware lately not looking so good from comments I have heard on the list over the past few years performance wise, I wonder how Adaptec and Areca look now?
I've run an exclusively 3ware shop since I ditched my last aacraid machines a few years back. But with all their issues, I am definitely considering trying Areca on my next server that's not planned to be immediately mission-critical. (I wouldn't switch back to Adaptec unless I knew their interface tools, and especially their cli, had improved dramatically; the aaccli/afacli interfaces were simply atrocious.)
What issues are you having with 3ware?
On Tuesday 12 January 2010, Christopher Chan wrote:
Keith Keller wrote:
On Tue, Jan 12, 2010 at 08:07:17AM +0800, Christopher Chan wrote:
I see that the Areca driver has finally made it into the mainline Linux kernel. But I wonder how things have improved from this particular case.
http://notemagnet.blogspot.com/2008/08/linux-disk-failures-areca-is-not-so.html
I can't speak to this, except to point out that it is almost 18 months old, which is quite a long time in kernel development space.
Which is why I am asking.
With the right incantation, one can call smartctl directly on a drive connected to a 3ware controller, no matter what kind of array it is in. (I believe you can even call it on a drive assigned as a hot spare.)
Which is why I specifically said 'performance wise' as respects 3ware. I don't remember anything bad about 3ware stability wise or monitoring wise.
Is that supposed to be a joke? 3ware has certainly had their fair share of stability problems (drive time-outs, bbu-problems, inconsistent behaviour, ...) and monitoring wise they suck (imho). Do you like tw_cli? Enjoying the fact that "show diag" gives you a cyclic text buffer without references? etc.
...that said, it's not much worse than the competition, storage simply sucks ;-(
/Peter
Any comments? With 3ware lately not looking so good from comments I have heard on the list over the past few years performance wise, I wonder how Adaptec and Areca look now?
I've run an exclusively 3ware shop since I ditched my last aacraid machines a few years back. But with all their issues, I am definitely considering trying Areca on my next server that's not planned to be immediately mission-critical. (I wouldn't switch back to Adaptec unless I knew their interface tools, and especially their cli, had improved dramatically; the aaccli/afacli interfaces were simply atrocious.)
What issues are you having with 3ware?
Which is why I specifically said 'performance wise' as respects 3ware. I don't remember anything bad about 3ware stability wise or monitoring wise.
Is that supposed to be a joke? 3ware has certainly had their fair share of stability problems (drive time-outs, bbu-problems, inconsistent behaviour, ...) and monitoring wise they suck (imho). Do you like tw_cli? Enjoying the fact that "show diag" gives you a cyclic text buffer without references? etc.
Oh, I did not hear of those and my last experience with 3ware was up to the 95xx series. I did hear of horror stories of Mylex but I myself never got to see one of those where the raid configuration would completely disappear. Most of my experience with 3ware is with the 75xx and 85xx cards which are only good for raid1+0 unless you can afford the major performance hit with raid5.
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
On Tuesday 12 January 2010, Chan Chung Hang Christopher wrote:
Which is why I specifically said 'performance wise' as respects 3ware. I don't remember anything bad about 3ware stability wise or monitoring wise.
Is that supposed to be a joke? 3ware has certainly had their fair share of stability problems (drive time-outs, bbu-problems, inconsistent behaviour, ...) and monitoring wise they suck (imho). Do you like tw_cli? Enjoying the fact that "show diag" gives you a cyclic text buffer without references? etc.
Oh, I did not hear of those and my last experience with 3ware was up to the 95xx series. I did hear of horror stories of Mylex but I myself never got to see one of those where the raid configuration would completely disappear. Most of my experience with 3ware is with the 75xx and 85xx cards which are only good for raid1+0 unless you can afford the major performance hit with raid5.
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
/Peter
On 1/12/2010 10:39 AM, Peter Kjellstrom wrote:
Which is why I specifically said 'performance wise' as respects 3ware. I don't remember anything bad about 3ware stability wise or monitoring wise.
Is that supposed to be a joke? 3ware has certainly had their fair share of stability problems (drive time-outs, bbu-problems, inconsistent behaviour, ...) and monitoring wise they suck (imho). Do you like tw_cli? Enjoying the fact that "show diag" gives you a cyclic text buffer without references? etc.
Oh, I did not hear of those and my last experience with 3ware was up to the 95xx series. I did hear of horror stories of Mylex but I myself never got to see one of those where the raid configuration would completely disappear. Most of my experience with 3ware is with the 75xx and 85xx cards which are only good for raid1+0 unless you can afford the major performance hit with raid5.
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1. That at least has the advantage of being able to recover the data from any single drive that might still work after a problem.
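A minimal sketch of that kind of split, assuming two spare partitions and an ext3 filesystem (the device names and mount point are placeholders):

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mkfs -t ext3 /dev/md0
mkdir -p /srv/data1
mount /dev/md0 /srv/data1

Repeat per 2TB chunk and mount each one wherever it spreads the load best.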
On Tuesday 12 January 2010, Les Mikesell wrote:
On 1/12/2010 10:39 AM, Peter Kjellstrom wrote:
...
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1.
Funny you should mention software RAID1... I've seen two instances of that getting silently out-of-sync and royally screwing things up beyond all repair.
Maybe this thread has gone on long enough now?
/Peter
That at least has the advantage of being able to recover the data from any single drive that might still work after a problem.
On 1/12/2010 6:05 PM, Peter Kjellstrom wrote:
...
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1.
Funny you should mention software RAID1... I've seen two instances of that getting silently out-of-sync and royally screwing things up beyond all repair.
Silently as in it didn't yell at you or silently as in 'cat /proc/mdstat' showed them in sync but they weren't? I've only ever seen one of the latter and it was in a machine with bad RAM - don't think any disk controller could have helped with that.
On Wed, Jan 13, 2010 at 01:05:39AM +0100, Peter Kjellstrom wrote:
On Tuesday 12 January 2010, Les Mikesell wrote:
On 1/12/2010 10:39 AM, Peter Kjellstrom wrote:
...
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1.
Funny you should mention software RAID1... I've seen two instances of that getting silently out-of-sync and royally screwing things up beyond all repair.
Maybe this thread has gone on long enough now?
Not yet :)
Please tell more about your hardware and software. What distro? What kernel? What disk controller? What disks?
I'm interested in this because I have never seen Linux software MD RAID1 failures like this, but some people keep telling me they happen frequently..
I'm just wondering why I'm not seeing these failures, or if I've just been lucky so far..
-- Pasi
Pasi Kärkkäinen wrote:
On Wed, Jan 13, 2010 at 01:05:39AM +0100, Peter Kjellstrom wrote:
On Tuesday 12 January 2010, Les Mikesell wrote:
On 1/12/2010 10:39 AM, Peter Kjellstrom wrote:
...
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1.
Funny you should mention software RAID1... I've seen two instances of that getting silently out-of-sync and royally screwing things up beyond all repair.
Maybe this thread has gone on long enough now?
Not yet :)
Please tell more about your hardware and software. What distro? What kernel? What disk controller? What disks?
I'm interested in this because I have never seen Linux software MD RAID1 failures like this, but some people keep telling me they happen frequently..
It could be like Les said - bad RAM. I certainly have not encountered this sort of error on a md raid1 array.
I'm just wondering why I'm not seeing these failures, or if I've just been lucky so far..
Yeah, lucky you've not got bad RAM that passed POSTing and at the same time did not bring your system down on you right from the start or rendered it unstable.
Christopher Chan wrote:
Funny you should mention software RAID1... I've seen two instances of that
getting silently out-of-sync and royally screwing things up beyond all repair.
Maybe this thread has gone on long enough now?
Not yet :)
Please tell more about your hardware and software. What distro? What kernel? What disk controller? What disks?
I'm interested in this because I have never seen Linux software MD RAID1 failures like this, but some people keep telling me they happen frequently..
It could be like Les said - bad RAM. I certainly have not encountered this sort of error on a md raid1 array.
I'm just wondering why I'm not seeing these failures, or if I've just been lucky so far..
Yeah, lucky you've not got bad RAM that passed POSTing and at the same time did not bring your system down on you right from the start or rendered it unstable.
On the machine where I had the problem I had to run memtest86 more than a day to finally catch it. Then after replacing the RAM and fsck'ing the volume, I still had mysterious problems about once a month until I realized that the disks are accessed alternately and the fsck pass didn't catch everything. I forget the commands to compare and fix the mirroring, but they worked - and I think the CentOS 5.4 update does that periodically as a cron job now. The other worry is that when one drive dies, you might have unreadable spots in normally unused areas of the surviving mirror, since those will keep a rebuild from working - but the cron job should detect those too, if you notice the results.
I am going to take a good look at the cron jobs on the moodle box then. Need to check whether the ubuntu box does the same. Man, if only I had a Centos cd when the previous gateway died...
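For reference, the machinery behind that cron job (something like /etc/cron.weekly/99-raid-check on 5.4, if memory serves) is the md sync_action interface, and you can drive it by hand; a sketch assuming the array is md0:

echo check > /sys/block/md0/md/sync_action     # compare the mirror halves
cat /proc/mdstat                               # watch progress
cat /sys/block/md0/md/mismatch_cnt             # non-zero means differences were found
echo repair > /sys/block/md0/md/sync_action    # rewrite the mismatched blocks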
On Wednesday 13 January 2010, Pasi Kärkkäinen wrote:
On Wed, Jan 13, 2010 at 01:05:39AM +0100, Peter Kjellstrom wrote:
On Tuesday 12 January 2010, Les Mikesell wrote:
On 1/12/2010 10:39 AM, Peter Kjellstrom wrote:
...
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1.
Funny you should mention software RAID1... I've seen two instances of that getting silently out-of-sync and royally screwing things up beyond all repair.
Maybe this thread has gone on long enough now?
Not yet :)
Please tell more about your hardware and software. What distro? What kernel? What disk controller? What disks?
Both of my data-points are several years old so most of the details are lost in the fog-of-lost-memories...
Both were on desktop class hardware with onboard IDE or SATA. If I remember correctly one was on CentOS (4?) and one was on either an old Ubuntu or a classic Debian (at least we're talking 2.6 kernels).
My main point was that, nope, linux-md is not the holy grail either.
The only storage products that I've not had fail me tend to be either:
1) Those that are too new (give them time)
2) Those that I haven't tried (in scale) yet (which always gives a strong "the grass is greener on the other side" feeling)
/Peter
I'm interested in this because I have never seen Linux software MD RAID1 failures like this, but some people keep telling me they happen frequently..
I'm just wondering why I'm not seeing these failures, or if I've just been lucky so far..
-- Pasi
On Wed, Jan 13, 2010 at 11:43:35AM +0100, Peter Kjellstrom wrote:
On Wednesday 13 January 2010, Pasi Kärkkäinen wrote:
On Wed, Jan 13, 2010 at 01:05:39AM +0100, Peter Kjellstrom wrote:
On Tuesday 12 January 2010, Les Mikesell wrote:
On 1/12/2010 10:39 AM, Peter Kjellstrom wrote:
...
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Nope, storage sucks, that includes the software ;-)
If you can split the storage up into 2TB or smaller volumes that you can mount into sensible locations to spread the load and avoid contention you can always use software RAID1.
Funny you should mention software RAID1... I've seen two instances of that getting silently out-of-sync and royally screwing things up beyond all repair.
Maybe this thread has gone on long enough now?
Not yet :)
Please tell more about your hardware and software. What distro? What kernel? What disk controller? What disks?
Both of my data-points are several years old so most of the details are lost in the fog-of-lost-memories...
Ok.. too bad.
Both were on desktop class hardware with onboard IDE or SATA. If I remember correctly one was on CentOS (4?) and one was on either an old Ubuntu or a classic Debian (at least we're talking 2.6 kernels).
My main point was that, nope, linux-md is not the holy grail either.
The only storage products that I've not had fail me tend to be either:
- Those that are too new (give them time)
- Those that I haven't tried (in scale) yet (which always gives a strong "the grass is greener on the other side" feeling)
Yep :)
-- Pasi
Peter Kjellstrom wrote:
Please tell more about your hardware and software. What distro? What kernel? What disk controller? What disks?
Both of my data-points are several years old so most of the details are lost in the fog-of-lost-memories...
Both were on desktop class hardware with onboard IDE or SATA. If I remember correctly one was on CentOS (4?) and one was on either an old Ubuntu or a classic Debian (at least we're talking 2.6 kernels).
My main point was that, nope, linux-md is not the holy grail either.
But it does have the advantage of not adding _extra_ things to break. If your CPU/RAM/disk controller fails you are pretty much dead anyway, and with md you can move the disks/array to a new system without having to match the exact controller. With md raid1 you can access any single disk directly if that's all that still works.
The only storage products that I've not had fail me tend to be either:
- Those that are too new (give them time)
- Those that I haven't tried (in scale) yet (which always gives a strong "the grass is greener on the other side" feeling)
Everything breaks eventually (or has fire/flood/earthquake damage). Backups and redundancy are the only solution to that.
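On that last point about reading a single RAID1 member: with the 0.90/1.0 superblock formats (metadata at the end of the partition, 0.90 being the CentOS 5 default) a lone survivor can be started degraded or even mounted on its own. A rough sketch, with /dev/sdc1 standing in for the surviving disk:

mdadm --assemble --run /dev/md0 /dev/sdc1    # start the array degraded
mount /dev/md0 /mnt/rescue
# or, since the member looks like a plain filesystem to the kernel:
mount -o ro /dev/sdc1 /mnt/rescue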
...that said, it's not much worse than the competition, storage simply sucks ;-(
So you are saying people dole out huge amounts of money for rubbish? That the software raid people were and have always been right?
Depends what the software raid people were saying. :)
Hardware & software RAID have their places. We use HW RAID exclusively in our shop (SMB w/ multiple sites, and I'm the sole tech), but that's because I can't always be on-site when drives die, so it's nice to know I can be out of town for a few days and, if a drive packs it in, the RAID system in our IBM servers is plug & play. I get the alert, call our vendor, and he shows up at 9am the next morning with a replacement drive.
Am 12.01.2010 09:01, schrieb Peter Kjellstrom:
Is that supposed to be a joke? 3ware has certainly had their fair share of stability problems (drive time-outs, bbu-problems, inconsistent behaviour, ...) and monitoring wise they suck (imho). Do you like tw_cli? Enjoying the fact that "show diag" gives you a cyclic text buffer without references? etc.
...that said, it's not much worse than the competition, storage simply sucks ;-(
Which is probably the reason why the ZFS folks are trying to move as much intelligence as possible out of the HBA and into the OS.
Rainer
On 12/01/10 12:22, Rainer Duffner wrote:
Which is probably the reason why the ZFS folks are trying to move as much intelligence as possible out of the HBA and into the OS.
Not something that is really working - given that I've seen stock CentOS with a few HBAs easily outperform raid-z, with better reliability and lower latency.
On Tue, Jan 12, 2010 at 09:01:42AM +0100, Peter Kjellstrom wrote:
Is that supposed to be a joke? 3ware has certainly had their fair share of stability problems (drive time-outs, bbu-problems, inconsistent behaviour, ...) and monitoring wise they suck (imho). Do you like tw_cli?
I don't dislike tw_cli. At least, I like it much better than the old afacli/aaccli, and I also like having 3dm2 send out email alerts in addition to monitoring using tw_cli. (I have a short nagios plugin I've written to parse the tw_cli output, as well; it's not all that great, but if others are interested I will send it offlist.)
I have seen some of the other issues you mentioned, which is one reason I have been considering other vendors (and, I suppose, software RAID, though less seriously).
--keith
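For anyone who wants to roll their own in the meantime, here is a very rough sketch of that sort of check, built around "tw_cli /c0 show" - the controller number, the column layout of the unit table, and the choice of which states count as bad are all assumptions to verify against your tw_cli version:

#!/bin/bash
# Rough nagios-style check for 3ware units via tw_cli.
# Assumes the controller is c0 and that the unit table prints one
# "uN ..." line per unit with the status in the third column.
OUT=$(tw_cli /c0 show 2>/dev/null)
if [ -z "$OUT" ]; then
    echo "UNKNOWN: no output from tw_cli"
    exit 3
fi
# Treat anything that is not OK or VERIFYING as a problem.
BAD=$(echo "$OUT" | awk '/^u[0-9]/ && $3 != "OK" && $3 != "VERIFYING" { print $1 "=" $3 }')
if [ -n "$BAD" ]; then
    echo "CRITICAL: $BAD"
    exit 2
fi
echo "OK: all units OK"
exit 0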