[CentOS] Effectiveness of CentOS vm.swappiness
Gordon Messmer
gordon.messmer at gmail.com
Fri Jun 5 21:32:54 UTC 2015
On 06/05/2015 12:09 PM, Markus "Shorty" Uckelmann wrote:
> On 05.06.2015 at 18:33, Gordon Messmer wrote:
>> On 06/05/2015 03:29 AM, Markus "Shorty" Uckelmann wrote:
>>> some (probably unused) parts are swapped out. But, some of
>>> those parts are the salt-minion, php-fpm or mysqld. All services which
>>> are important for us and which suffer badly from being swapped out.
>>
>> Those two things can't really both be true. If the pages swapped out
>> are unused, then the application won't suffer as a result.
>
> Why not? If you have an application which sees action only every 12 to
> 24 hours, I think this can happen.
Well, that's not "unused," then.
To measure the swap use of your processes, install "smem". It will show
you the amount of swap that each process is using.
For more specific information, make a copy of /proc/<pid>/smaps, which
reports swap use for each memory mapping of the process.
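As a rough sketch of what can be read from a saved smaps copy: the
following sums the "Swap:" fields across all mappings of one process.
The function names are hypothetical, and it assumes the usual
"Swap:  <n> kB" line format; smem does a more complete job.

```python
def swap_kb_from_smaps(smaps_text):
    """Sum the 'Swap:' fields (in kB) over all mappings in smaps text."""
    total = 0
    for line in smaps_text.splitlines():
        # Matches "Swap:" only; "SwapPss:" lines do not start with "Swap:".
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total


def swap_kb_for_pid(pid):
    """Read /proc/<pid>/smaps (needs permission for that process)."""
    with open(f"/proc/{pid}/smaps") as f:
        return swap_kb_from_smaps(f.read())
```

Calling swap_kb_for_pid() on someone else's process generally requires
root.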
To quantify your problem, let bacula run, then save the output of smem,
/proc/<pid>/smaps for each of your critical services, or both. Next,
access each of the services and measure the latency relative to the
normal latency. Finally, after collecting latency information, capture
the output of smem and/or /proc/<pid>/smaps again. Comparing swap use
before and after accessing the service shows how much was swapped out
beforehand (presumably because of the backup), and how much had to be
recovered for your test query.
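The before/after comparison could be scripted along these lines. This
is only a sketch: timed() and swap_recovered() are hypothetical helper
names, and the per-service swap figures (in kB) are assumed to have
been collected separately, for example from smem.

```python
import time


def timed(action):
    """Run action() once and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def swap_recovered(before_kb, after_kb):
    """Per service, how many kB left swap between the two snapshots.

    A service missing from after_kb is treated as having no swap left.
    """
    return {
        name: before_kb[name] - after_kb.get(name, 0)
        for name in before_kb
    }
```

Here the "action" passed to timed() would be whatever test query you
normally run against the service, and a large positive value from
swap_recovered() for that service would suggest the query had to pull
pages back in from swap.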
I'd suggest collecting that information at the normal swappiness setting
and at 0.
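To record which setting was active for each run, the current value can
be read from /proc/sys/vm/swappiness. A small sketch (the function name
is hypothetical; the path parameter exists only so it can be pointed at
another file):

```python
def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an integer."""
    with open(path) as f:
        return int(f.read().strip())
```

Changing the value is done as root, e.g. with "sysctl vm.swappiness=0".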
If the kernel is swapping out processes in favor of filesystem cache
when swappiness is 0, I believe that would be a bug, and should be
reported to the kernel developers.
> Our salt-minion would be a candidate
> for this. Although we constantly check if it's alive, we only do
> something "heavy" like a deployment once or twice a day. And very often
> we have to run those deployments twice, because the first time we get a
> lot of timeouts. Sure, it might be the software itself. But I think it
> could be possible that it is because of swapped out pages.
"Timeouts" is pretty vague. Very generally, it's possible that you have
a timeout configured somewhere that is failing on the first run because
the filesystem cache now contains content from your backup, and your
process only completes in time when the files needed for the deployment
are in the filesystem cache. That's a stretch as far as explanations
go, but if that is the case, then swappiness isn't going to fix the
problem. You need to fix your timeout so that it allows enough time for
the deployment to finish when the server is cold booted (using no
cache), or prime your caches before doing deployments.
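One simple way to prime the filesystem cache is to read the relevant
files once before the deployment starts. A minimal sketch, assuming the
files the deployment needs live under a single directory tree (the
function name and the directory are up to you; nothing here is specific
to salt):

```python
import os


def prime_cache(root, chunk_size=1 << 20):
    """Read every regular file under root so its data lands in the
    page cache. Returns the number of bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Unreadable or vanished files are skipped.
                continue
    return total
```

Whether the pages stay cached until the deployment runs depends on
memory pressure in between, so this would need to run shortly before
the deployment rather than hours earlier.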