[CentOS] unfsd scalability issues

Sun Jun 17 20:32:44 UTC 2012
Boris Epstein <borepstein at gmail.com>

On Thu, Jun 14, 2012 at 1:15 PM, Boris Epstein <borepstein at gmail.com> wrote:

>
>
> On Wed, Jun 13, 2012 at 10:11 AM, <m.roth at 5-cent.us> wrote:
>
>> Boris Epstein wrote:
>> > On Sat, Jun 2, 2012 at 2:50 PM, John R. Dennison <jrd at gerdesas.com>
>> wrote:
>> >> On Sat, Jun 02, 2012 at 10:59:13AM -0400, Boris Epstein wrote:
>> <snip>
>> > To be specific, I use UNFSD to export a MooseFS file system. MooseFS, by
>> > the way, is userland-process based too.
>> >
>> > Be that as it may, I've seen situations where a comparably configured
>> > MooseFS client gets to read at, say, 40 MB/s - which is fine - but
>> > UNFSD at the same time reads at 40 KB/s(!) Why would that be? I mean, some
>> > degradation I can dig, but 3 orders of magnitude? What is with this? Am I
>> > doing something wrong?
>> <snip>
>> I wonder... what's the architecture on which you're getting these results?
>> I tried opening a bug with upstream about NFS4 on 6.x; no one ever
>> looked at it, and they closed it.
>>
>> 100% repeatably: unpack a package locally: seconds.
>>                  unpack it from an NFS mount onto a local drive: about 1
>>                     minute.
>>                  unpack it from an NFS mount onto an NFS mount, even when
>>                     the target is exported FROM THE SAME MACHINE* that the
>>                     process is running on: 6.5 - 7 MINUTES.
>>
>> * That is,
>>     [server 1]                             [server 2]
>>        /export/thatdir --NFS-->    /target/dir
>>                                    /s2/source
>>                                    /source/dir --NFS-->/s2/source
>>     then cd into [server 2]:/target/dir and unpack from /s2/source
>>
>> I suppose I'll try logging into upstream's bugzilla using our official
>> licensed id; maybe then they'll assign someone to look at it....
>>
>>        mark
>>
>>
>>
> Mark,
>
> Thanks, my architecture is extremely similar to yours, except that in my
> case the "second layer", if I may say so, is MooseFS (
> http://www.moosefs.org/ ), not NFS. MooseFS itself is blazing, by the way.
>
> So the diagram in my case would look something like this:
>
>        /export/thatdir --NFS-->    /target/dir
>                                    /s2/source
>                                    /source/dir -- MooseFS mount (mfsmount) --> /s2/source
>
> The discrepancy in the resultant performance is comparable.
>
> Thanks.
>
> Boris.
>

I may have discovered a fix. Still don't know why it is a fix - but for
what it's worth...

OK, if you put your UNFSD daemon on a completely different physical machine
- i.e., one with no MooseFS component running on it - it seems to work just
fine. For a single client I got a read performance of about 70 MB/s over a 1
Gbit/s network. When multiple (up to 5) clients do their reads, the
performance seems to degrade roughly proportionally.
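In case it helps anyone reproduce the numbers: the per-client figures above are from plain sequential reads. A minimal sketch of that kind of measurement (the file path is a placeholder - point TESTFILE at a large file on your NFS mount; it defaults to a local scratch file here just so the commands run as written):

```shell
# Sketch of a simple sequential-read throughput test. TESTFILE is a
# placeholder -- set it to a file on the NFS mount to measure the real
# transfer rate; the default is a local scratch file.
TESTFILE=${TESTFILE:-/tmp/nfs_read_test.dat}

# Create a 64 MB test file if one is not already there.
[ -f "$TESTFILE" ] || dd if=/dev/zero of="$TESTFILE" bs=1M count=64 2>/dev/null

# Sequential read in 1 MB blocks; dd reports the throughput on stderr.
dd if="$TESTFILE" of=/dev/null bs=1M 2>&1 | tail -n 1
```

Note that on the real mount you'd want to drop the client page cache first (echo 3 > /proc/sys/vm/drop_caches, as root) before each run - otherwise the second run measures RAM, not the network.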

And this is strange. I've got MooseFS currently confined to just one
machine (8 cores, 48 GB RAM): master server, meta server, chunk server, the
whole thing. And that works fine. Add UNFSD - and it still works, and the
load is still low (under 1) - and yet the UNFSD's performance goes down the
drain. Why? I have no idea.

By the way, the autonomous UNFSD server is far from a powerful piece of
hardware - all it is is a P5-class 2-core machine with 2 GB of RAM. So go
figure...
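For the record, the working layout on the dedicated gateway box is roughly the following. Hostnames, mount points, and export options are placeholders for whatever your setup uses, not literal commands from my machines:

```shell
# On the dedicated gateway (no MooseFS services running locally):

# 1. Mount the MooseFS file system; "mfsmaster" is a placeholder for
#    your MooseFS master host.
mfsmount /mnt/mfs -H mfsmaster

# 2. Export that mount point through the userspace NFS server. unfsd
#    reads a standard exports-format file, given with -e.
cat > /etc/exports.unfsd <<'EOF'
/mnt/mfs (rw,no_root_squash)
EOF
unfsd -e /etc/exports.unfsd

# 3. On each client, mount the gateway's export over plain NFS.
mount -t nfs gateway:/mnt/mfs /mnt/moose-via-nfs
```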

Boris.