[CentOS] CentOS 5/6 random system freezes

Wed Apr 16 18:55:06 UTC 2014
Billy Crook <bcrook at riskanalytics.com>

I'm going to guess that your storage controller or drives were performing
some internal operation or reset during that time, which blocked IO to the
kernel and likely left disk-dependent tasks stuck in iowait.
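
One way to test that guess is to time small synced writes on the same
filesystem the daemon logs to, and see whether the slow ones line up with
the slow packets. Below is a minimal Python 3 sketch. The function names,
the probe file path and the 100 ms threshold are made up for illustration
and are not taken from your setup.

```python
import os
import sys
import time


def probe_write_latency(path, samples=600, interval=0.1):
    """Time small synced writes to `path`; return latencies in milliseconds."""
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for _ in range(samples):
            start = time.monotonic()
            os.write(fd, b"probe\n")
            os.fsync(fd)  # push the write through to the storage stack
            latencies.append((time.monotonic() - start) * 1000.0)
            time.sleep(interval)
    finally:
        os.close(fd)
    return latencies


def stalls(latencies, threshold_ms=100.0):
    """Return (sample_index, latency_ms) for every write at or above the threshold."""
    return [(i, ms) for i, ms in enumerate(latencies) if ms >= threshold_ms]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 probe.py /var/log/mydaemon/probe.tmp
    for index, ms in stalls(probe_write_latency(sys.argv[1])):
        print("sample %d: write+fsync took %.1f ms" % (index, ms))
```

If the probe never sees a slow write while the daemon still stalls, the
storage theory becomes less likely and it is worth looking elsewhere.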


On Tue, Apr 15, 2014 at 5:25 PM, Andrii Zinchenko <mail at zinok.org> wrote:

> I am developing a high-load daemon that listens on UDP and processes
> packets. Over the last few months I have noticed a strange issue where it
> takes 500-700 ms to answer a packet, while it usually takes 20 ms. I ran
> strace on all daemon processes and found this:
>
> 13:35:36.979887 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2309, ...}) = 0
> 13:35:36.979944 write(3, "[2014-04-15 13:35:36,979] WARNING    WORKER 26 - [pkt#105132/AUTH] loadPresets - memory used: 0 kb\n", 99) = 99
> 13:35:37.599793 sendto(10, "Q\0\0\0\rSELECT 1\0", 14, MSG_NOSIGNAL, NULL, 0) = 14
> 13:35:37.599865 poll([{fd=10, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=10, revents=POLLIN}])
>
> You can see that around 600 ms passed between write and sendto. At that
> time the server was not overloaded (LA = 0.4, 16 cores). There was free
> memory. There was no load on the disks.
> So I took straces of the other daemon processes plus the database
> processes, and then did:
>
> grep '13:35:37.4' *
> grep '13:35:37.3' *
> grep '13:35:37.2' *
>
> None of the commands showed any lines. So I guess the system was just
> doing nothing during those 600 ms.
>
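
The grep search above can be automated so that it covers every gap rather
than three hand-picked tenths of a second. Here is a rough Python 3 sketch,
assuming the logs carry `strace -tt` style timestamps like the ones quoted.
The function name and the 100 ms threshold are made up for illustration.

```python
import re
import sys
from datetime import datetime

# Leading HH:MM:SS.microseconds timestamp, as printed by `strace -tt`.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ")


def find_gaps(lines, threshold_ms=100.0):
    """Return (gap_ms, line_before, line_after) for each gap at or above the threshold."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # wrapped continuation lines carry no timestamp
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if prev_time is not None:
            # A trace that crosses midnight gives a negative gap, which is ignored.
            gap_ms = (stamp - prev_time).total_seconds() * 1000.0
            if gap_ms >= threshold_ms:
                gaps.append((round(gap_ms, 3), prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = stamp, line
    return gaps


if __name__ == "__main__":
    # Hypothetical usage: python3 gaps.py worker1.strace worker2.strace
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for gap_ms, before, after in find_gaps(handle):
                print("%s: %.0f ms gap" % (path, gap_ms))
                print("  before: %s" % before)
                print("  after:  %s" % after)
```

Keep in mind that `-tt` stamps the moment each call starts, so a gap
after a write() may be time spent inside that write() rather than idle
time. Re-running strace with `-T` as well prints the time spent in each
call and would show which it is.
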
> Is there any way to diagnose this issue? What might it be?
>
> --
> Andrii Zinchenko
> mail at zinok.org



-- 
Billy Crook * Network and Security Administrator * RiskAnalytics, LLC