Hello,
Over the weekend, I upgraded one of my servers that runs mysql and pdns to 5.3. Prior to the update I had not had an issue with this server. But since then, mysql has been killed multiple times by the oom-killer.
$ uname -a
Linux rack2a 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Unfortunately, the server has been rebooted by others before I have been able to look at it while the problem is occurring. But I have found this in the logs:
http://pastebin.centos.org/25553
Running sar -A:
http://pastebin.centos.org/25556
Since the server was rebooted, the amount of swap used is 0.
How do I determine what process is/was chewing up the memory on this server? What should I be looking for to narrow this down?
Thanks, Rick
Rick Barnes wrote:
How do I determine what process is/was chewing up the memory on this server? What should I be looking for to narrow this down?
Set up a script to monitor memory usage and alert you when swap usage starts getting high. Or set up a monitor that just runs ps auxw or something similar and sends the output to a file, so the next time it happens you know how much memory everything was using at the time.
nate
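A minimal sketch of the kind of monitor suggested above, assuming a Linux host with procps and /proc/meminfo. The function names, the 50% threshold, the 60-second interval and the log path are all hypothetical choices for illustration, not anything posted in the thread.

```shell
#!/bin/bash
# Hypothetical memory monitor sketch. Function names, threshold,
# interval and log path are assumptions made for illustration.

# Print swap usage as a whole percentage, read from a meminfo-style file
# (defaults to /proc/meminfo).
swap_used_pct() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END {
             if (total > 0) printf "%d\n", (total - free) * 100 / total
             else print 0
         }' "${1:-/proc/meminfo}"
}

# Append a timestamped process list, largest resident set first, to a log file.
snapshot() {
    {
        date
        ps -eo pid,user,rss,vsz,comm --sort=-rss | head -n 20
    } >> "$1"
}

# Poll forever and take a snapshot whenever swap usage reaches the threshold.
monitor() {
    local threshold=${1:-50} logfile=${2:-/var/tmp/memwatch.log}
    while true; do
        if [ "$(swap_used_pct)" -ge "$threshold" ]; then
            snapshot "$logfile"
        fi
        sleep 60
    done
}
```

To use it, source the file and run something like `monitor 50 /var/tmp/memwatch.log` in the background. After the next incident, the log should show which processes held the most resident memory while swap was filling up.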
On Mon, 2009-04-13 at 10:31 -0400, Rick Barnes wrote:
How do I determine what process is/was chewing up the memory on this server? What should I be looking for to narrow this down?
Thanks, Rick
--- You found your problem already: MySQL. There could be something helping it along, though. When a SQL server begins running out of RAM, many things could be to blame. What does mysql actually do? As in, what do you use it for? Try ps auxf | grep mysqld. Something else you can use is "mytop". You can run the below as a script, which loops forever, or just use the command on its own. What you need to see is the running queries.
##################
#!/bin/bash
# Print the current MySQL process list every 2 seconds.
while true
do
    mysql -N -u root -ppassword -e 'show processlist' | grep -v 'show processlist'
    sleep 2
done
##################
JohnStanley
On Mon, 2009-04-13 at 10:31 -0400, Rick Barnes wrote:
Hello,
I have had mysql die multiple times from oom-killer.
Thanks, Rick
--- Seeing what you do, I would look at the queries being made: stored procedures, triggers, views. Also very important is whether that one particular database is doing graphing or "running predictions", as you all do GIS and mapping. Predictions can be huge memory burners. Last thing: does it run as a virtual machine or on real hardware?
JohnStanley
JohnS wrote:
Seeing what you do, I would look at the queries being made: stored procedures, triggers, views. Also very important is whether that one particular database is doing graphing or "running predictions", as you all do GIS and mapping. Predictions can be huge memory burners. Last thing: does it run as a virtual machine or on real hardware?
This is a real hardware server. It is not doing GIS, mapping, or the like. Most of the DBs on it are for web sites: mostly form data storage, some CMS DBs for Joomla, WP, etc., and some others for the cacti and powerdns backends.
I have set up some scripts to watch the swap usage to see if I can track this down.
Thanks for the help
Rick
On Mon, 2009-04-13 at 14:18 -0400, Rick Barnes wrote:
This is a real hardware server. It is not doing GIS, mapping, or the like. Most of the DBs on it are for web sites: mostly form data storage, some CMS DBs for Joomla, WP, etc., and some others for the cacti and powerdns backends.
I have set up some scripts to watch the swap usage to see if I can track this down.
Thanks for the help
Rick
--- Seeing as you said you upgraded from 5.2 to 5.3, I would look at the kernel release notes and the mysql release notes for known problems, since you did not have problems before. I would check out the Cacti and DNS databases, because they're more real-time in nature on the server than the content ones. Using the script I posted will catch the offending query. I myself would take a hard look at MySQL itself. There is a huge debate about it not being production ready. The last option would be to do a yum --allow-downgrade until it's sorted out on a test machine.
JohnStanley
JohnS wrote:
Seeing as you said you upgraded from 5.2 to 5.3, I would look at the kernel release notes and the mysql release notes for known problems, since you did not have problems before. I would check out the Cacti and DNS databases, because they're more real-time in nature on the server than the content ones. Using the script I posted will catch the offending query. I myself would take a hard look at MySQL itself. There is a huge debate about it not being production ready. The last option would be to do a yum --allow-downgrade until it's sorted out on a test machine.
It appears as though apache is to blame:
http://pastebin.centos.org/25568
By stopping and starting apache, %swpused went from 92.84% to 6.41% and has stayed there for about an hour now.
Thanks, Rick
On Mon, 2009-04-13 at 15:25 -0400, Rick Barnes wrote:
It appears as though apache is to blame:
http://pastebin.centos.org/25568
By stopping and starting apache, %swpused went from 92.84% to 6.41% and has stayed there for about an hour now.
Thanks, Rick
--- Now you get to nail down the offending site/application. By the way, that's a lot of RAM for apache to eat up. I may be wrong, but didn't MySQL show as eating all that RAM too? Keep in mind that bad SQL queries will also make a web server eat RAM. It all depends on your situation.
JohnStanley
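As a starting point for comparing apache against mysql, here is a rough sketch that totals resident memory per command name. The function name `rss_by_command` is hypothetical. It assumes input lines of the form "rss-in-kB command", such as procps `ps` can produce. Summed RSS counts pages shared between Apache children more than once, so treat the totals as a relative ranking rather than an exact figure.

```shell
#!/bin/bash
# Hypothetical helper: sum resident memory (RSS, in kB) per command name.
# Reads "rss command" lines on stdin and prints "total command",
# largest total first. Only the first word of the command name is used.
rss_by_command() {
    awk '{ total[$2] += $1 }
         END { for (name in total) printf "%d %s\n", total[name], name }' |
        sort -rn
}
```

A typical call would be `ps -e -o rss= -o comm= | rss_by_command | head`. Running it before and after an Apache restart should show whether httpd or mysqld holds most of the memory.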