On Sat, Jun 2, 2012 at 1:31 PM, Boris Epstein borepstein@gmail.com wrote:
Anything that needs atomic operations is difficult to scale. Throw in distributed components and an extra user/kernel layer and there are lots of ways to go wrong.
Les, what doesn't need atomic operations?
That's the driving force behind 'nosql' databases. Riak, for example, allows concurrent conflicting operations on a potentially partitioned cluster and figures it out after the fact. But in anything resembling a filesystem, the directory operations have to be atomic. If you open a file, you need to know whether that name already exists, and no matter how many processes try to create a new file or link, only one can succeed. So you pretty much have to lock all directory operations while any one of them completes. Do you happen to have a lot of files in a single directory?
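To make that concrete, here is a minimal sketch of the "only one creator wins" behaviour at the syscall level (the path is made up for illustration, this isn't anything from the code under discussion):

/* O_CREAT|O_EXCL is the standard atomic create-if-absent request.
 * Whatever serves the directory has to serialize this against every
 * other operation on the same name. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *name = "/shared/dir/lockfile";  /* hypothetical path */

    int fd = open(name, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0) {
        /* Some other process created it first; errno is EEXIST. */
        perror("open");
        return 1;
    }

    /* This process won the race; every other concurrent creator fails. */
    printf("created %s\n", name);
    close(fd);
    return 0;
}

Run a few copies of that at once against the same directory and exactly one succeeds, which is the guarantee the directory lock exists to provide.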
And how does doing things in the kernel make your program more scalable? It is the algorithm that matters, not the execution space, IMO.
It is hard enough to do atomic operations and locking correctly at a single layer; it becomes next to impossible when distributed, and adding layers with different scheduling concepts has to make it worse. The kernel has its own way of locking, but anything in userland is going to need a trip into the kernel and back just for that.
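A rough sketch of that "trip into the kernel" point (nothing specific to the software being discussed, just generic pthreads on Linux): glibc's pthread_mutex_lock stays in user space with an atomic compare-and-swap when the lock is free, but once there is contention it has to call futex(2) to sleep and be woken. The thread count and sleep below are arbitrary, only there to force contention.

/* Build with: cc -pthread demo.c
 * Observe the kernel round-trips with: strace -f -e futex ./a.out */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);   /* contended path: futex wait in the kernel */
    usleep(1000);                /* hold the lock long enough to cause contention */
    pthread_mutex_unlock(&lock); /* contended path: futex wake in the kernel */
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    puts("done");
    return 0;
}

Even a "userland" lock ends up in the kernel as soon as anyone actually has to wait on it, which is the extra cost of stacking another scheduling layer on top.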