[CentOS] centos 6.5 input lag

Tue Oct 14 15:13:51 UTC 2014
Matt Garman <matthew.garman at gmail.com>

Update on this problem:

From another system, I ran a continuous ping against my laggy server.
I noticed that every 10--20 seconds, one or more ICMP packets would
drop.  These drops were consistent with the input lag I was
experiencing.
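
For anyone wanting to reproduce this check, here is a minimal Python
sketch (not what I actually ran) that scans saved ping output for gaps
in the sequence numbers.  It assumes Linux iputils-style reply lines
containing "bytes from" and "icmp_seq=N"; the function name and the
sample address in the comment are made up for illustration.

```python
import re

# Sequence number in a reply line such as
# "64 bytes from 192.168.1.10: icmp_seq=7 ttl=64 time=0.31 ms".
# Some iputils builds print icmp_req= instead, so accept both.
SEQ_RE = re.compile(r"icmp_(?:seq|req)=(\d+)")


def missing_sequences(ping_lines):
    """Return sequence numbers between the first and last reply that got no reply."""
    seen = set()
    for line in ping_lines:
        # Only count real echo replies, not "Destination Host Unreachable" lines.
        if "bytes from" not in line:
            continue
        match = SEQ_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    # Sequence wrap-around at 65536 is ignored in this sketch.
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]
```

Feeding it the lines of a captured ping log returns the dropped
sequence numbers, which can then be lined up against the moments the
lag was felt.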

I did a web search for "linux periodically hangs" and found this
Serverfault post that had a lot in common with my symptoms:

    http://serverfault.com/questions/371666/linux-bonded-interfaces-hanging-periodically

I do in fact have bonded interfaces on the laggy server.  When I checked
the bonding config, I realized that a while ago I had changed from
balance-rr / mode 0 to 802.3ad / mode 4.  (I did this because I kept
getting "bond0: received packet with own address as source address"
when using balance-rr with a bridge interface.  The bridge interface
was for using KVM.)
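
One way to confirm which mode a bond is actually running in is to read
the driver's status file, /proc/net/bonding/bond0.  The Python sketch
below is only an illustration: it assumes the usual "Bonding Mode:",
"Slave Interface:" and "MII Status:" lines in that file, and the
function name is hypothetical.

```python
def parse_bonding_status(text):
    """Return (bonding_mode, {slave_name: mii_status}) from bonding status text."""
    mode = None
    slaves = {}
    current = None  # slave section currently being read, if any
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            current = value
            slaves[current] = None
        elif key == "MII Status" and current is not None:
            # The bond-level MII Status appears before any slave and is skipped.
            slaves[current] = value
    return mode, slaves
```

On the server itself this could be called as
parse_bonding_status(open("/proc/net/bonding/bond0").read()).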

For now, I simply disabled one of the slave interfaces, and the lag /
dropped ICMP packets problem has gone away.
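
There are several ways to take a slave out of a bond (ifdown,
ifenslave -d, or sysfs).  As a rough sketch of the sysfs route, the
bonding driver accepts "-<name>" written to the bond's "slaves" file.
The slave name eth1 in the usage note and the function name are
assumptions for illustration only, and on a real system this needs
root.

```python
from pathlib import Path


def release_slave(slave, bond="bond0", sysfs_root="/sys/class/net"):
    """Detach `slave` from `bond` by writing '-<slave>' to the bond's slaves file."""
    slaves_file = Path(sysfs_root) / bond / "bonding" / "slaves"
    current = slaves_file.read_text().split()
    if slave not in current:
        raise ValueError(f"{slave} is not currently a slave of {bond}")
    # A leading '-' asks the bonding driver to remove the interface.
    slaves_file.write_text(f"-{slave}\n")
```

For example, release_slave("eth1") would detach a hypothetical eth1
from bond0.  This change does not survive a reboot; the ifcfg files
still have to be edited for that.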

Like the Serverfault poster, I have an HP ProCurve 1800-24G switch.
The switch is supposed to support 802.3ad link aggregation.  It's not
a managed switch, so I (perhaps incorrectly) assumed that 802.3ad
would magically just work.  Either there is more required to make it
work, or its implementation is broken.  Curiously, however, running
my bond0 in 802.3ad mode did work without any issue for over a month.

Anyway, hopefully this helps someone else struggling with a
similar problem.




On Fri, Oct 10, 2014 at 4:17 PM, Matt Garman <matthew.garman at gmail.com> wrote:
> On Fri, Oct 10, 2014 at 4:11 PM, Joseph L. Brunner
> <joe at affirmedsystems.com> wrote:
>> If this is a server - is it possible your raid card battery died?
>
> It is a server, but a home file server.  The raid card has no battery
> backup, and in fact has been flashed to pure HBA mode.  Actual
> RAID'ing is done at the software level.
>
>> The only other thing on the hardware side that comes to mind is actual bad sectors if this is not a raided virtual drive.
>
> The system has eight total drives: two SSDs in RAID-1 for the OS, five
> 3.5" spinning drives in RAID-6, and a single 3.5" drive normally used
> for mythtv recordings (though mythtv has been stopped for a long time
> now to try to debug the issue).
>
>> From the OS side can you keep the box up long enough to do a yum update?
>
> Yes, I updated everything except packages beginning with "l" ("el" /
> lowercase 'L') due to that generating a number of conflicts that I
> haven't had time to resolve.