[CentOS] strange memory issues with CentOS 6.2 on VPS

Tue Feb 28 22:01:31 UTC 2012
Tomas Vondra <tv at fuzzy.cz>

On 27.2.2012 12:57, Tomas Vondra wrote:
> On 27 February 2012, 11:26, Peter Kjellström wrote:
>> On Sunday 26 February 2012 19.59.07 Tomas Vondra wrote:
>> ...
>>> i.e. about 200 MB of free memory, but apache fails because of segfaults
>>> when forking a child process:
>>>
>>>   [16:49:51 2012] [error] (12)Cannot allocate memory: fork: Unable to
>>>                           fork new process
>>>   [16:51:17 2012] [notice] child pid 2577 exit signal Segmentation
>>>                            fault (11)
>>
>> In general things can get quite bad with relatively high memory pressure
>> and no swap.
> 
> Sure, but there's no such pressure. There was almost 200MB of "free"
> memory (used for page cache, not dirty thus easy to drop).
> 
>> That said, one thing that comes to mind is stacksize. When forking the
>> linux kernel needs whatever the current stacksize is to be available as
>> (free + free swap).
>>
>> Also, just because you see Y bytes free doesn't mean you can successfully
>> malloc that much (fragmentation, memory zones, etc.).
> 
> Yup, I'm aware of that. But it's rather improbable, especially given the
> other symptoms.
> 
> Update: After submitting the original post, I've noticed that these issues
> probably started about a week ago after upgrading a kernel and several
> related packages. I've had a swap there and the issues were not as severe,
> so I haven't noticed that before. I do remember I got an OOM error during
> that upgrade and I thought I've dealt with it properly, but maybe not. So
> I've reinstalled (remove+install) all those packages, rebooted and the
> problems disappeared. I will check that in the evening, but hopefully it's
> fixed.

Well, I've found the actual issue. It clearly was my stupidity as I was
messing with overcommit_memory without fully understanding it.

As mentioned in the original post, I had set

   vm.overcommit_memory = 2

which limits the total memory that can be committed to processes to

   swap + RAM * vm.overcommit_ratio / 100

where vm.overcommit_ratio = 50 by default, so you can allocate swap + 1/2
of the physical memory. This is just fine if you have swap - for example,
if the swap size equals RAM, 150% of RAM is available to processes.

The issues start when you disable swap (as I did) - the limit then
effectively drops to 50% of physical RAM, and any attempt to allocate
more fails with "Cannot allocate memory", no matter how much memory
looks free. This is exactly what happened to me :-(
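
To make the arithmetic concrete, here is a rough sketch in Python. It
assumes a hypothetical machine with 1024 MB of RAM (not the actual VPS),
treats vm.overcommit_ratio as a percentage, and ignores huge pages and
vm.overcommit_kbytes, which the kernel also takes into account. The
function name is made up for illustration.

```python
def commit_limit_mb(ram_mb, swap_mb, overcommit_ratio=50):
    """Approximate commit limit in MB under vm.overcommit_memory = 2.

    Simplification (assumption): huge pages and vm.overcommit_kbytes
    are ignored, so this is only swap + RAM * ratio / 100.
    """
    return swap_mb + ram_mb * overcommit_ratio // 100


if __name__ == "__main__":
    ram = 1024  # hypothetical RAM size in MB, not taken from the post
    # (swap in MB, overcommit_ratio in percent)
    for swap, ratio in [(1024, 50), (0, 50), (0, 100)]:
        limit = commit_limit_mb(ram, swap, ratio)
        print(f"swap={swap} MB, ratio={ratio}% -> limit={limit} MB")
```

With swap equal to RAM the limit is 1536 MB (150% of RAM), without swap
it drops to 512 MB, and a ratio of 100 brings it back to 1024 MB.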

So I set

   vm.overcommit_ratio = 100

which gives me 100% of RAM. I know this will give me an OOM if I use all
the physical RAM, but that's expected - I don't want to use swap on a
virtual machine with poor I/O (and the services are set accordingly).
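
In this mode the kernel compares the committed address space with the
commit limit instead of looking at free memory, and both numbers are
exported in /proc/meminfo as Committed_AS and CommitLimit. A small
sketch for checking how close a machine is to the limit follows. The
helper names are hypothetical, and it assumes the usual "Name: value kB"
layout of /proc/meminfo.

```python
import os


def parse_commit_fields(meminfo_text):
    """Return CommitLimit and Committed_AS (in kB) from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in ("CommitLimit", "Committed_AS"):
            # The value is the first token after the colon, e.g. "524288 kB".
            fields[name] = int(rest.split()[0])
    return fields


def headroom_kb(fields):
    """kB that can still be committed before allocations start to fail."""
    return fields["CommitLimit"] - fields["Committed_AS"]


if __name__ == "__main__":
    # Only meaningful on Linux; skipped silently elsewhere.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            info = parse_commit_fields(f.read())
        print(info, "headroom kB:", headroom_kb(info))
```

If the headroom is close to zero, fork() and malloc() can fail even
though "free" still reports plenty of page cache that could be dropped.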

So the moral is: don't mess with something you don't fully understand.

kind regards
Tomas