[CentOS] cyrus spool on btrfs?

Fri Sep 8 17:31:54 UTC 2017
hw <hw at gc-24.de>

Mark Haney wrote:
> On 09/08/2017 09:49 AM, hw wrote:
>> Mark Haney wrote:
>>> I hate top posting, but since you've got two items I want to comment on, I'll suck it up for now.
>> I do, too, yet sometimes it's reasonable.  I also hate it when the lines
>> are too long :)
> I'm afraid you'll have to live with it a bit longer.  Sorry.
>>> Having SSDs alone will give you great performance regardless of filesystem.
>> It depends, i.e. I can't tell how these SSDs would behave if large amounts of
>> data were written to and/or read from them over extended periods of time because
>> I haven't tested that.  That isn't the application, anyway.
> If your I/O is going to be heavy (and you've not mentioned expected traffic, so we can only go on what little we glean from your posts), then SSDs will likely start having issues sooner than a mechanical drive might.  (Though, YMMV.)  As I've said, we process 600 million messages a month, on primary SSDs in a VMware cluster, with mechanical storage for older, archived user mail.  "Archived" may not be exactly the right word, but the context should be clear.

I/O is not heavy in that sense; that's why I said that's not the application.
There is I/O which, as tests have shown, benefits greatly from low latency, which
is where the idea of using SSDs for the relevant data came from.  This I/O
only involves a small amount of data and is not sustained over long periods of time.
Exactly why the application is slow on spinning disks is
unknown because I don't have the sources, and the maker of the application refuses
to deal with the problem at all.

Since the data requiring low latency will occupy about 5% of the available space on
the SSDs, and since they are large enough to hold the mail spool for about 10 years at
its current rate of growth in addition to that data, these SSDs could well be used to hold
that mail spool.
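
As a rough sanity check of that estimate, here is a minimal sketch of the
arithmetic.  It assumes that the usable space equals the 512GB nominal
capacity mentioned further down (e.g. a mirrored pair) and that decimal units
are used; the function name is made up for illustration.  It only shows the
average daily spool growth that would still fit into the space left over, not
the actual growth rate, which isn't stated here.

```python
def spool_growth_budget_mb_per_day(capacity_gb, reserved_fraction, years):
    """Average daily spool growth (in MB) that fits into the free space.

    capacity_gb       -- usable capacity in GB (assumed: 512, nominal)
    reserved_fraction -- share taken by the low-latency data (about 0.05)
    years             -- how long the spool should fit (about 10)
    """
    free_gb = capacity_gb * (1.0 - reserved_fraction)
    days = years * 365  # leap days ignored, this is only an estimate
    return free_gb * 1000 / days  # 1 GB = 1000 MB (decimal units)


if __name__ == "__main__":
    budget = spool_growth_budget_mb_per_day(512, 0.05, 10)
    print(f"growth budget: {budget:.1f} MB/day")
```

As long as the spool grows by less than that per day on average, the 10 year
figure holds under these assumptions.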

>>> BTRFS isn't going to impact I/O any more significantly than, say, XFS.
>> But mdadm does, the impact is severe.  I know there are people saying otherwise,
>> but I've seen the impact myself, and I definitely don't want it on that
>> particular server because it would likely interfere with other services.  I don't
>> know whether the software RAID of btrfs is any better in that respect, but I'm
>> seeing btrfs on SSDs being fast, and testing with the particular application has
>> shown a speedup by a factor of 20 to 30.
> I never said anything about MD RAID.  I trust that about as far as I could throw it.  And having had 5 surgeries on my throwing shoulder wouldn't be far.

How else would I create a RAID with these SSDs?

I've been using md-RAID for years, and it always worked fine.

>> That is the crucial improvement.  If the hardware RAID delivers that, I'll use
>> that and probably remove the SSDs from the machine, as it wouldn't even make sense
>> to put temporary data onto them because that would involve software RAID.
> Again, if the idea is to have fast primary storage, there are pretty large SSDs available now and I've hardware RAIDED SSDs before without trouble, though not for any heavy lifting, it's my test servers at home. Without an idea of the expected mail traffic, this is all speculation.

The SSDs don't need to be large, and they aren't.  They are already greatly oversized at
512GB nominal capacity.

There are only a few hundred emails per day.  There is no special requirement for their
storage, but there is a lot of free space on these SSDs, and since the email traffic is
mostly read-only, it won't wear out the SSDs.  It simply would make sense to put the
mail spool onto these SSDs.

>>> It does have serious stability/data integrity issues that XFS doesn't have.  There's no reason not to use SSDs for storage of immediate data and mechanical drives for archival data storage.
>>> As for VMs we run a huge Zimbra cluster in VMs on VPC with large primary SSD volumes and even larger (and slower) secondary volumes for archived mail.  It's all CentOS 6 and works very well.  We process 600 million emails a month on that virtual cluster.  All EXT4 inside LVM.
>> Do you use hardware RAID with SSDs?
> We do not here where I work, but that was set up LONG before I arrived.

Probably with the very expensive SSDs suited for this ...

>>> I can't tell you what to do, but it seems to me you're viewing your setup from a narrow SSD/BTRFS standpoint.  Lots of ways to skin that cat.
>> That's because I do not store data on a single disk, without redundancy, and
>> the SSDs I have are not suitable for hardware RAID.  So what else is there but
>> either md-RAID or btrfs when I do not want to use ZFS?  I also do not want to
>> use md-RAID, hence only btrfs remains.  I also like to use sub-volumes, though
>> that isn't a requirement (because I can use directories instead and lose the
>> ability to make snapshots).
> If the SSDs you have aren't suitable for hardware RAID, then they aren't good for production level mail spools, IMHO.  I mean, you're talking like you're expecting a metric buttload of mail traffic, so it stands to reason you'll need really beefy hardware.  I don't think you can do what you seem to need on budget hardware. Personally, and solely based on this thread alone, if I was building this in-house, I'd get a decent server cluster together and build a FC or iSCSI SAN to a Nimble storage array with Flash/SSD front ends and large HDDs in the back end.  This solves virtually all your problems.  The servers will have tiny SSD boot drives (which I prefer over booting from the SAN) and then everything else gets handled by the storage back-end.

If SSDs that aren't suitable for RAID usage aren't suitable for production use, then
basically none of those SSDs can be used for anything that requires something
less volatile than a ramdisk.  Experience with such SSDs contradicts
this so far.

There is no "storage backend" but a file server, which, instead of idling 99.95% of the
time, is being assigned additional tasks, and since it is difficult to put a cyrus mail
spool on remote storage, the email server is one of these tasks.

> In effect this is how our mail servers are set up here.  And they are virtual.

You have entirely different requirements.

>> I stay away from LVM because that just sucks.  It wouldn't even have any advantage
>> in this case.
> LVM is a joke.  It's always been something I've avoided like the plague.

I've also avoided it until I had an application where it would have been advantageous
if it actually provided the benefits it is supposed to provide.  It turned out that
it didn't and only made things much worse, and I continue to stay away from it.

After all, you're saying it's a bad idea to use these SSDs, especially with btrfs.
I don't feel good about it, either, and I'll try to avoid using them.