[CentOS] CentOS 5/6 random system freezes

Billy Crook bcrook at riskanalytics.com
Wed Apr 16 18:55:06 UTC 2014


I'm going to guess that your storage controller or drives were performing
some internal operation or reset during that time, which blocked the
kernel's IO and likely left disk-dependent tasks stuck in iowait.
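
One way to test that guess is to keep writing small records to the same
filesystem the daemon logs to and time each write. The following is only a
minimal sketch, assuming Python 3 is available on the box. The function name
probe_write_latency and its defaults are made up for illustration, and the
fsync() call is my addition to force each record out to the device (the
daemon's own write() in the trace is not followed by one).

```python
import os
import time


def probe_write_latency(path, samples=50, interval=0.01, threshold=0.1):
    """Append small records to `path`; return (sample_index, seconds) for slow ones.

    `threshold` is in seconds. All names and defaults are illustrative.
    """
    slow = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for i in range(samples):
            start = time.monotonic()
            os.write(fd, b"probe\n")
            # Assumption: fsync is used so a stalled controller or drive
            # shows up as a slow sample instead of hiding in the page cache.
            os.fsync(fd)
            elapsed = time.monotonic() - start
            if elapsed > threshold:
                slow.append((i, elapsed))
            time.sleep(interval)
    finally:
        os.close(fd)
    return slow
```

Point it at a scratch file on the log filesystem and leave it running with a
large sample count. If slow samples line up with the 500-700 ms replies,
storage is the likely culprit. If they never do, look elsewhere.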


On Tue, Apr 15, 2014 at 5:25 PM, Andrii Zinchenko <mail at zinok.org> wrote:

> I am developing a high-load daemon that listens on UDP and processes
> packets. Over the last few months I have noticed a strange issue where it
> takes 500-700 ms to answer a packet, while it usually takes 20 ms. I ran
> strace on all daemon processes and found this:
>
> 13:35:36.979887 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2309,
> ...}) = 0
> 13:35:36.979944 write(3, "[2014-04-15 13:35:36,979] WARNING    WORKER 26 -
> [pkt#105132/AUTH] loadPresets - memory used: 0 kb\n", 99) = 99
> 13:35:37.599793 sendto(10, "Q\0\0\0\rSELECT 1\0", 14, MSG_NOSIGNAL, NULL,
> 0) = 14
> 13:35:37.599865 poll([{fd=10, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=10,
> revents=POLLIN}])
>
> You can see that around 600 ms passed between the write and the sendto.
> At that time the server was not overloaded (LA = 0.4, 16 cores). There was
> free memory. There was no load on the disks.
> So I took straces of the other daemon processes plus the database
> processes, and then ran:
>
> grep '13:35:37.4' *
> grep '13:35:37.3' *
> grep '13:35:37.2' *
>
> None of the commands shows any lines. So I guess the system was just doing
> nothing during these 600 ms.
>
> Is there any way to diagnose this issue? What might it be?
>
> --
> Andrii Zinchenko
> mail at zinok.org
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> http://lists.centos.org/mailman/listinfo/centos
>
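
Instead of grepping for one timestamp prefix at a time, the gaps can be
pulled out of a trace directly. This is a rough sketch in Python 3, assuming
each trace line starts with a microsecond timestamp as in the output quoted
above (what "strace -tt" produces) and that each file holds a single process
with no pid prefix. The name find_gaps and the 100 ms default are mine. It
does not handle a trace that crosses midnight.

```python
import re
from datetime import datetime, timedelta

# Matches a leading HH:MM:SS.microseconds timestamp followed by a space.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ")


def find_gaps(lines, threshold=timedelta(milliseconds=100)):
    """Return (gap, line_before_gap) for consecutive timestamps further apart
    than `threshold`. Lines without a leading timestamp are skipped."""
    gaps = []
    previous = None
    previous_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # e.g. the tail of a wrapped line
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous is not None and current - previous > threshold:
            gaps.append((current - previous, previous_line.rstrip()))
        previous, previous_line = current, line
    return gaps
```

Feed it an open trace file, for example find_gaps(open("daemon.trace")),
where the file name is hypothetical. The line reported is the call that
preceded the gap, so for the trace above it would be the write() to the log.
Re-running strace with -T as well, which appends the time spent inside each
syscall, should show whether that write itself was the slow call.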



-- 
Billy Crook * Network and Security Administrator * RiskAnalytics, LLC


