On Tuesday, January 04, 2011 09:14:57 am Adam Tauno Williams wrote:
> On Tue, 2011-01-04 at 15:06 +0100, Dominik Zyla wrote:
>> Many people care about storage format.
> And they are misguided in doing so. Details of message storage are an internal [server's] problem.
Hmmm, not quite.
When selecting the filesystem on which to store e-mail, the storage format is significant. It becomes a 'small number of large files' versus 'large number of small files' issue, and filesystems differ in how well they handle each. Some filesystems slow down with large maildirs; some slow down with large mboxes. Maildir stores every message as a separate file, so if you support a hundred or a thousand users on maildir, make sure you allocate enough inodes on the mailstore filesystem. For POP-only servers mbox works fine. For servers where IMAP is the primary means of access, not so fine.
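As a rough illustration of the inode point, here is a minimal sketch of the sizing arithmetic. The function names, the headroom factor, and the figure of four directory inodes per folder (the folder plus its cur/, new/ and tmp/ subdirectories) are my assumptions for illustration; a real server may also keep index or control files that this ignores.

```python
import math
import os

# Assumption: each maildir folder costs four directory inodes
# (the folder itself plus its cur/, new/ and tmp/ subdirectories).
DIRS_PER_FOLDER = 4


def inodes_needed(users, messages_per_user, folders_per_user, headroom=1.5):
    """Estimate inodes for a maildir mailstore: one per message file,
    plus the folder directories, scaled by a safety margin."""
    message_files = users * messages_per_user
    directories = users * folders_per_user * DIRS_PER_FOLDER
    return math.ceil((message_files + directories) * headroom)


def free_inodes(path):
    """Free inodes reported by the filesystem holding `path` (Unix only)."""
    return os.statvfs(path).f_ffree


if __name__ == "__main__":
    # Hypothetical site: 1000 users, 30,000 messages and 20 folders each.
    need = inodes_needed(users=1000, messages_per_user=30000, folders_per_user=20)
    print(f"estimated inodes needed: {need:,}")
    print(f"free inodes on /: {free_inodes('/'):,}")
```

Comparing the estimate with what the filesystem reports before the mailstore goes live is the point; filesystems that allocate inodes dynamically may not report a meaningful number here.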
In my opinion, maildirs are great for rapidly changing folders, like the inbox, whereas mboxes are wonderful for archives, which change more slowly and tend to take less disk space in mbox form for the same number of messages. And when you have folders containing hundreds of thousands of e-mails (yes, hundreds of thousands; I have one such archive) where the individual e-mails are quite short, the difference adds up.
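For what it's worth, moving a slow-changing folder from maildir into a single mbox archive can be sketched with Python's standard `mailbox` module. This is only an illustration under my own assumptions: the function name and paths are hypothetical, and it should not be run against a folder the mail server is actively writing to.

```python
import mailbox


def archive_maildir_to_mbox(maildir_path, mbox_path):
    """Append every message from a maildir to an mbox file.

    Returns the number of messages copied. The source maildir is
    left untouched; deleting it afterwards is a separate decision.
    """
    src = mailbox.Maildir(maildir_path, create=False)
    dst = mailbox.mbox(mbox_path)
    copied = 0
    dst.lock()
    try:
        for msg in src:
            # Converting to mboxMessage carries maildir flags
            # (seen, replied, ...) over to mbox status headers.
            dst.add(mailbox.mboxMessage(msg))
            copied += 1
        dst.flush()
    finally:
        dst.unlock()
        dst.close()
        src.close()
    return copied


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    n = archive_maildir_to_mbox("/tmp/example/Maildir", "/tmp/example/archive.mbox")
    print(f"archived {n} messages")
```

With many short messages, the one large file avoids the per-file block and inode overhead, which is where the space difference comes from.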
In my case, our primary e-mail server is Scalix, so that dictated the storage format. But, honestly, I would love to use a PostgreSQL backend so that real concurrent access is possible. I have users whose Scalix mail folders take a long time to rsync simply because of the number of messages (25-30 thousand in the inbox; they are 'folder clueless' and don't want to throw anything away). To get a consistent backup, Scalix has to be shut down during the rsync, and even if a folder hasn't changed, rsync still has to read all those directory entries, which takes time. An ACID database backend (PostgreSQL, MySQL InnoDB, Oracle, etc.) would allow a fully consistent backup to be taken while the database is active. And backup tools for such databases are very mature.
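To make the online-backup point concrete, here is a minimal sketch of driving `pg_dump` from Python against a live database. The database name `mailstore` and the output path are hypothetical, the helper names are mine, and this is not something Scalix itself provides; it only illustrates that the dump runs while the database stays up.

```python
import subprocess


def build_pg_dump_command(dbname, outfile):
    """Build a pg_dump invocation that writes a custom-format archive.

    pg_dump reads from a single snapshot, so the dump is consistent
    even while other sessions keep reading and writing.
    """
    return ["pg_dump", "--format=custom", "--file", outfile, dbname]


def run_backup(dbname, outfile):
    """Run the dump; raises CalledProcessError if pg_dump fails."""
    subprocess.run(build_pg_dump_command(dbname, outfile), check=True)


if __name__ == "__main__":
    # Hypothetical database name and path, for illustration only.
    print(" ".join(build_pg_dump_command("mailstore", "/backup/mailstore.dump")))
```

Contrast that with the rsync case above, where the server has to be stopped for the copy to be consistent.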
Scalix 11 uses PostgreSQL, but not as the primary mailstore.