On Fri, Jul 15, 2011 at 12:35 AM, Don Krause dkrause@optivus.com wrote:
On Jul 14, 2011, at 12:56 PM, Pasi Kärkkäinen wrote:
On Thu, Jul 14, 2011 at 04:53:11PM +0300, Pasi Kärkkäinen wrote:
On Wed, Jul 13, 2011 at 11:32:14PM -0700, John R Pierce wrote:
I've been asked for ideas on building a rather large archival storage system for in-house use, on the order of 100-400TB. Probably using CentOS 6. The existing system this would replace is using Solaris 10 and ZFS, but I want to explore using Linux instead.
We have our own Tomcat-based archiving software that would run on this storage server, along with NFS client and server. It's a write-once, read-almost-never kind of application, storing compressed batches of archive files for a year or two. 400TB written over 2 years translates to about 200TB/year or about 7MB/second average write speed. The very rare read accesses are done in batches: a client makes a webservice call to get a specific set of files, and they are then pushed as a batch to staging storage where the user can browse them. This can take minutes without any problems.
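For anyone who wants to check the write-rate arithmetic, here is a minimal sketch. It assumes decimal units (1TB = 1,000,000MB) and a 365-day year; the function name is just for illustration. With those assumptions the average comes out slightly under the rough 7MB/second figure quoted above.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def average_write_rate_mb_s(total_tb, years):
    """Average sustained write rate in MB/s for total_tb written over years."""
    total_mb = total_tb * 1_000_000  # assumes decimal units: 1 TB = 1e6 MB
    return total_mb / (years * SECONDS_PER_YEAR)


rate = average_write_rate_mb_s(400, 2)
print(f"{rate:.1f} MB/s")  # prints "6.3 MB/s"
```

Either way, the sustained rate is tiny compared to what even a single SATA disk can stream, so capacity and reliability matter far more here than throughput.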
My general idea is a 2U server with 1-4 SAS cards connected to strings of about 48 SATA disks (4 x 12 or 3 x 16), all configured as JBOD, so there would potentially be 48 or 96 or 192 drives on this one server. I'm thinking they should be laid out as 4 or 8 or 16 separate RAID6 sets of 10 disks each, then use LVM to put those into a larger volume. About 10% of the disks would be reserved as global hot spares.
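The layout above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the function name is made up, and the 2TB drive size is an assumption, since the post does not say which drives would be used. RAID6 spends two disks per set on parity, so a 10-disk set holds 8 disks' worth of data.

```python
def raid6_layout(total_drives, raid_sets, set_size=10, drive_tb=2.0):
    """Summarize a layout of raid_sets RAID6 sets of set_size disks each.

    drive_tb is an assumed drive size (hypothetical, not from the post).
    """
    in_sets = raid_sets * set_size
    if in_sets > total_drives:
        raise ValueError("not enough drives for that many RAID sets")
    data_disks = raid_sets * (set_size - 2)  # RAID6: 2 parity disks per set
    return {
        "data_disks": data_disks,
        "parity_disks": raid_sets * 2,
        "left_over": total_drives - in_sets,  # available as spares / empty bays
        "usable_tb": data_disks * drive_tb,   # before filesystem overhead
    }


# The three configurations mentioned: 48, 96 or 192 drives.
for drives, sets in [(48, 4), (96, 8), (192, 16)]:
    print(drives, raid6_layout(drives, sets))
```

Changing `drive_tb` shows how large the drives would have to be for a given drive count to reach the 100-400TB target.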
So, my questions...
D) anything important I've neglected?
Remember Solaris ZFS does checksumming for all data, so with weekly/monthly ZFS scrubbing it can detect silent data/disk corruption automatically and fix it. With a lot of data, that might get pretty important..
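If the storage stack underneath does not checksum data itself, one partial stand-in is to keep checksums at the application level and re-verify them on a schedule. The sketch below is not what ZFS does internally and it cannot repair anything; it only detects files whose contents have changed or gone missing. The function names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def scrub(root, manifest):
    """Return the relative paths that are missing or no longer match."""
    root = Path(root)
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            damaged.append(rel)
    return damaged
```

For a write-once archive the manifest would be built when a batch is stored, and `scrub` run weekly or monthly. Anything it reports would have to be restored from another copy.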
Oh, and one more thing.. if you're going to use that many JBODs, pay attention to SES chassis management chips/drivers and software, so that you get the error/fault LEDs working on disk failure!
-- Pasi
And make sure the assembler wires it all up correctly. I have a JBOD box, 16 drives in a Supermicro chassis, where the drives are numbered left to right, but the error lights assume top to bottom.
The first time we had a drive fail, I opened the RAID management software, clicked "Blink Light" on the failed drive, pulled the unit that was flashing, and toasted the array. (Of course, NOW it's RAID6 with hot spare so that won't happen anymore..)
-- Don Krause "This message represents the official view of the voices in my head."
Which is why nobody should use RAID5 for anything other than test purposes :)