[CentOS] really large file systems with centos

Fri Jul 15 07:06:05 UTC 2011
Rudi Ahlers <Rudi at SoftDux.com>

On Fri, Jul 15, 2011 at 12:35 AM, Don Krause <dkrause at optivus.com> wrote:
> On Jul 14, 2011, at 12:56 PM, Pasi Kärkkäinen wrote:
>
>> On Thu, Jul 14, 2011 at 04:53:11PM +0300, Pasi Kärkkäinen wrote:
>>> On Wed, Jul 13, 2011 at 11:32:14PM -0700, John R Pierce wrote:
>>>> I've been asked for ideas on building a rather large archival storage
>>>> system for inhouse use, on the order of 100-400TB. Probably using CentOS
>>>> 6.    The existing system this would replace is using Solaris 10 and
>>>> ZFS, but I want to explore using Linux instead.
>>>>
>>>> We have our own Tomcat-based archiving software that would run on this
>>>> storage server, along with an NFS client and server.   It's a write-once,
>>>> read-almost-never kind of application, storing compressed batches of
>>>> archive files for a year or two.   400TB written over 2 years translates
>>>> to about 200TB/year, or about 7MB/second average write speed.   The very
>>>> rare and occasional read accesses are done in batches: a client
>>>> makes a webservice call to get a specific set of files, which are then
>>>> pushed as a batch to staging storage where the user can browse
>>>> them; this can take minutes without any problems.
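
Just to sanity-check that write-rate figure: a quick back-of-the-envelope
script, assuming decimal terabytes and writes spread evenly over the year,
lands in the same ballpark.

    # rough average-write-rate check for the workload described above
    TB = 10 ** 12                        # decimal terabytes
    seconds_per_year = 365 * 24 * 3600
    written_per_year = 200.0 * TB        # 400 TB over 2 years
    rate_mb_s = written_per_year / seconds_per_year / 10 ** 6
    print("average write rate: %.1f MB/s" % rate_mb_s)   # ~6.3 MB/s

So sustained write throughput shouldn't be the constraint here; the drive
count is really about capacity, not speed.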
>>>>
>>>> My general idea is a 2U server with 1-4 SAS cards connected to strings
>>>> of about 48 SATA disks (4 x 12 or 3 x 16), all configured as JBOD, so
>>>> there would potentially be 48 or 96 or 192 drives on this one server.
>>>> I'm thinking they should be laid out as 4, 8, or 16 separate RAID6 sets
>>>> of 10 disks each, then use LVM to combine those into one larger volume.
>>>> About 10% of the disks would be reserved as global hot spares.
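
For a rough feel of what that layout buys in usable space, here's a sketch of
the capacity math for those three configurations; the drive size is purely an
assumption (3TB shown, plug in whatever actually gets bought).

    # usable-capacity sketch: N RAID6 sets of 10 disks (8 data + 2 parity),
    # leftover drives kept as global hot spares, all sets glued together
    # with LVM into one volume group.
    DISK_TB = 3.0                        # assumed drive size

    for total, sets in ((48, 4), (96, 8), (192, 16)):
        spares = total - sets * 10
        usable = sets * 8 * DISK_TB      # RAID6 loses 2 disks per set
        print("%3d drives: %2d x RAID6(10), %2d spares, ~%d TB usable"
              % (total, sets, spares, usable))

The LVM side would then just be pvcreate on each md device, one vgcreate
across them, and an lvcreate -l 100%FREE on top -- though whether you want a
single filesystem that size is a separate question.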
>>>>
>>>> So, my questions...
>>>>
>>>> D) anything important I've neglected?
>>>>
>>>
>>> Remember that Solaris ZFS checksums all data, so with weekly/monthly ZFS scrubs it can detect and fix silent data/disk corruption automatically. With that much data, this can get pretty important.
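
md on Linux doesn't checksum data the way ZFS does, but it can at least run a
periodic consistency scrub of each array -- essentially what the raid-check
cron script shipped with mdadm does, as far as I know. A minimal sketch using
the standard md sysfs knobs (worth verifying the paths on your own boxes):

    # start a background consistency check ("scrub") on every md array,
    # then report the mismatch counters once the checks have finished
    import glob, os

    for md in glob.glob("/sys/block/md*"):
        action = os.path.join(md, "md", "sync_action")
        if os.path.exists(action):
            with open(action, "w") as f:
                f.write("check\n")
            print("started check on %s" % os.path.basename(md))

    # later, after /proc/mdstat shows the checks are done:
    for cnt in glob.glob("/sys/block/md*/md/mismatch_cnt"):
        print("%s: %s" % (cnt, open(cnt).read().strip()))

Unlike ZFS, though, a mismatch only tells you that data and parity disagree,
not which block is the bad one.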
>>>
>>
>> Oh, and one more thing.. if you're going to use that many JBODs,
>> pay attention to SES chassis management chips/drivers and software,
>> so that you get the error/fault LEDs working on disk failure!
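
On the Linux side, if the SES wiring is right the ses kernel module exposes
each slot under /sys/class/enclosure/, and the fault/locate LEDs can be driven
from there (or with sg_ses / ledctl). A rough sketch that walks every slot and
flashes its locate LED, handy for verifying the slot-to-LED mapping before a
drive ever fails:

    # blink the "locate" LED on every enclosure slot, one at a time
    import glob, os, time

    for path in sorted(glob.glob("/sys/class/enclosure/*/*/locate")):
        with open(path, "w") as f:
            f.write("1")
        print("locate LED on: %s" % os.path.dirname(path))
        time.sleep(2)
        with open(path, "w") as f:
            f.write("0")

Each populated slot directory should also carry a "device" symlink pointing at
the attached disk, which is the map you actually want when a drive fails.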
>>
>> -- Pasi
>
>
> And make sure the assembler wires it all up correctly. I have a JBOD box, 16 drives in a Supermicro chassis,
> where the drives are numbered left to right, but the error lights assume top to bottom.
>
> The first time we had a drive fail, I opened the RAID management software, clicked "Blink Light" on the failed drive,
> pulled the unit that was flashing, and toasted the array. (Of course, NOW it's RAID6 with hot spare so that won't happen anymore..)
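
That mix-up is also a good argument for cross-checking the drive's serial
number against what the RAID software reports before pulling anything. A quick
sketch using the udev by-id links (ata-* here; SAS drives usually show up as
scsi-* or wwn-* instead):

    # map block devices to the model/serial encoded in their by-id names,
    # so the label on the physical drive can be matched before it's pulled
    import glob, os

    for link in sorted(glob.glob("/dev/disk/by-id/ata-*")):
        if "-part" in link:
            continue                      # skip partition entries
        dev = os.path.realpath(link)      # e.g. /dev/sdq
        print("%-12s %s" % (dev, os.path.basename(link)))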
>
> --
> Don Krause
> "This message represents the official view of the voices in my head."



Which is why nobody should use RAID5 for anything other than test purposes :)



-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532