Yesterday our website was acting a little odd, and I received a few email complaints. Running 'top' showed a very high load average, and today the load average is even higher!
load average: 13.18, 12.49, 10.48
It is pretty unlikely that our site grew in popularity that quickly. Looking at the log file, I see errors saying that a certain directory was not found, when in fact that path is handled by a mod_rewrite rule and the directory is not supposed to exist. The website works fine... I am stumped, and the error log is getting quite big! Does anyone have any tips on how to find the problem and address it (quickly)? Many thanks
Walt Reed replied to the message above (sent Tue, Dec 12, 2006 at 09:36:08AM -0800):
Patient: Doctor, I am ill, but I can't tell you any details. What can I do to feel better?
Doctor: Well, it would be helpful if you give me some more details, or I could ask you about 10,000 questions and we can be here all month...
Seriously, please read: http://linuxmafia.com/faq/Essays/smart-questions.html
Then look at your log files. The contents will probably give you your answer anyway.
Thanks.
Wow, I apologize for being vague and demanding. I have an earnest desire to learn, and I hope that my learning will benefit others.
Let me try to approach this problem again:
I was hoping that someone could point me toward some helpful ways to analyze this (CentOS 4.3) system. It was acting very strangely yesterday. I received emails complaining that:
- none of the links were working (it was a vague complaint)
- the JavaScript version of our RSS feed had a 14 second load time ( http://www.insurancejournal.com/newsfeed/feed.php )
I have never had to track down a problem like this, and I have spent a good deal of the morning looking into options for the 'top' command (because I had read in a Novel book that it was useful for diagnosing bottlenecks). I saw that there were a lot of mysql and httpd processes running, but I figured that was just lots of people visiting the site, or editors working on stories.
After not being able to find anything helpful (that I could take action on), I ended up trying the vmstat command (got the idea here: http://www.cyberciti.biz/tips/linux-resource-utilization-to-detect-system-bo...).
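A load average is easier to judge relative to the number of CPUs in the machine: a load of 13 on a one- or two-CPU box means many processes are waiting, not just that the site is busy. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name `load_per_cpu` is made up for illustration and is not a standard tool.

```shell
#!/bin/sh
# Sketch: relate the 1-minute load average to the number of CPUs.
# load_per_cpu is an illustrative helper, not a standard command.

# Print load divided by CPU count, given both as arguments.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Read the live values from /proc when available (Linux only).
if [ -r /proc/loadavg ] && [ -r /proc/cpuinfo ]; then
    load1=$(cut -d' ' -f1 /proc/loadavg)
    cpus=$(grep -c '^processor' /proc/cpuinfo || true)
    [ "${cpus:-0}" -gt 0 ] || cpus=1
    echo "1-min load: $load1  CPUs: $cpus  per CPU: $(load_per_cpu "$load1" "$cpus")"
fi

# Show which command names account for the most processes, busiest first.
if command -v ps >/dev/null 2>&1; then
    ps -eo comm= | sort | uniq -c | sort -rn | head -n 5
fi
```

A per-CPU figure well above 1 suggests processes are queuing for CPU or disk; the process count then hints at whether httpd or mysqld is the one piling up.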
Each website is set up with its own log file, and there are quite a lot of them, so I ran the command:
find . -type f -printf '%TY-%Tm-%Td %TT %p\n' | sort |less
from the logfiles directory to see which one had been modified last. Then I looked at the most recent logfile with 'tail -f' and saw that it was being written to incredibly fast.
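The same idea can be narrowed to just the newest few files, and "incredibly fast" can be turned into a number by sampling the file size twice. This is only a sketch: the function names are hypothetical, the paths in the usage comments are assumptions, and `-printf` requires GNU find (as in the command above).

```shell
#!/bin/sh
# Sketch: find the most recently written log files and measure how fast
# one of them grows. Function names are illustrative only.

# List the N most recently modified regular files under a directory,
# newest first, one path per line.
newest_files() {
    find "$1" -type f -printf '%T@ %p\n' | sort -rn | head -n "$2" | cut -d' ' -f2-
}

# Print how many bytes a file grows during the given number of seconds.
growth_bytes() {
    before=$(wc -c < "$1")
    sleep "$2"
    after=$(wc -c < "$1")
    echo $((after - before))
}

# Example usage (directory and file names are assumptions):
#   newest_files /var/log/httpd 5
#   growth_bytes /var/log/httpd/error_log 10
```

Comparing the growth figure for the error log against the access log gives a rough idea of whether nearly every request is producing an error line.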
The repeating error was: File does not exist: /var/www/www.insurancejournal.com/web/news
It was my understanding that this was not a real directory and that the path was handled through some sort of URL rewrite.
Continuing to troubleshoot the issue, I created the news directory. The error then changed to:
File does not exist: /var/www/www.insurancejournal.com/web/news/east
File does not exist: /var/www/www.insurancejournal.com/web/news/west
and so on. Part of the reason for rewriting URLs (at least one reason I like it) is that you can make one PHP file look like a ton of HTML files. I can see that something is likely wrong with the rewrite system, but I am wondering what suggestions (tools) some of the 'more seasoned' system admins might have to help me track down this issue, or even a better approach to problem solving.
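When an error log is scrolling too fast to read, counting the distinct missing paths and the clients that request them can show whether the errors come from ordinary visitors or from a single crawler or script. The sketch below assumes the usual Apache 2.0 error-log line layout (`[date] [error] [client ADDR] File does not exist: PATH`); the function names are invented for this example.

```shell
#!/bin/sh
# Sketch: summarize "File does not exist" errors from an Apache error log
# read on standard input. Function names are illustrative only.

# Count missing paths, most frequent first. A trailing ", referer: ..."
# field, if present, is stripped from the path.
missing_paths() {
    sed -n 's/.*File does not exist: \([^,]*\).*/\1/p' | sort | uniq -c | sort -rn
}

# Count which client addresses trigger those errors, most frequent first.
missing_clients() {
    grep 'File does not exist' |
        sed -n 's/.*\[client \([^]]*\)\].*/\1/p' | sort | uniq -c | sort -rn
}

# Example usage (the log file name is an assumption):
#   tail -n 5000 error_log | missing_paths | head
#   tail -n 5000 error_log | missing_clients | head
```

If one address dominates the second list, the load may be coming from that client rather than from the rewrite rules themselves.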
Any tips would be greatly appreciated.
Many Thanks
(Sent in reply to Walt Reed's message above.)
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
On Tue, 12 Dec 2006, Administrator wrote:
the repeating error was: File does not exist: /var/www/www.insurancejournal.com/web/news
it was my understanding that this was not a directory and it was just done through some sort of URL re-write.
Continuing to try to trouble shoot the issue i created the news directory. The error then changed to: File does not exist: /var/www/www.insurancejournal.com/web/news/east File does not exist: /var/www/www.insurancejournal.com/web/news/west
.. continuing: some sort of PHP expansion seems to be using URL encoding to present paths.
Possible causes:
1) the PHP is broken (unlikely);
2) mod_rewrite got munged, possibly through an update overwriting a config file (unlikely if the package manager was being used);
3) the package manager was not being used (a hand-compiled part was used instead), an upgrade of that part occurred, and the hand-compiled part or its configs got overwritten (most likely).
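For causes 2 and 3, the RPM database itself can give a first hint: `rpm -V` reports installed files that differ from what the package shipped, and `rpm -qf` reports which package (if any) owns a file. The sketch below only filters `rpm -V` output down to config files; the helper name `changed_configs` is hypothetical, and the package names and binary path in the usage comments are assumptions about this system.

```shell
#!/bin/sh
# Sketch: spot config files that differ from the packaged versions.
# changed_configs is an illustrative helper, not part of rpm.

# Filter `rpm -V` output (on stdin) down to config files, which rpm marks
# with a "c" between the change flags and the path. Lines reported as
# "missing" config files are included as well.
changed_configs() {
    awk '$2 == "c" { print $3 }'
}

# Example usage (package names and paths are assumptions):
#   rpm -V httpd php | changed_configs
#   rpm -qf /usr/sbin/httpd    # "not owned by any package" suggests a
#                              # hand-compiled install
```

A changed config file is not necessarily a broken one, since local edits show up the same way, but the list narrows down where to compare against backups.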
For scenario 3, there is a diagnostic tool at http://www.owlriver.com/tips/broken-system/ which can point out broken parts. It is non-destructive by design, but it will not steer a lay person very far in fixing things; it is designed to give hints to a person who will use the package manager.
Time to call in the person who set up the site, test the backups, and see what changed. If that is not possible, it is time to get commercial support, as diagnosis will be slow through a mailing list.
-- Russ Herrold
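One way to start on "see what changed" is to list files modified in the last few days and to locate every rewrite directive, including those in .htaccess files. This is a rough sketch, not a full audit: the function names are made up, and the paths in the usage comments assume a stock CentOS Apache layout plus the document root seen in the error messages.

```shell
#!/bin/sh
# Sketch: look for recent changes in a config tree and list the
# mod_rewrite directives it contains. Function names are illustrative.

# List regular files under a directory modified within the last N days.
recently_changed() {
    find "$1" -type f -mtime -"$2"
}

# List active (uncommented) mod_rewrite directives under a tree, with
# file name and line number. Hidden files such as .htaccess are included.
rewrite_directives() {
    grep -rnE '^[[:space:]]*Rewrite(Engine|Base|Cond|Rule)' "$1"
}

# Example usage (paths are assumptions):
#   recently_changed /etc/httpd 3
#   rewrite_directives /etc/httpd
#   rewrite_directives /var/www/www.insurancejournal.com/web
```

If the rewrite rules for /news live in an .htaccess file, it is also worth confirming that the enclosing directory still permits overrides, since an overwritten main config could silently disable them.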