[CentOS] Network hangs after several hours (Centos 6 recently upgraded kernel/glibc)

Richard lists-centos at listmail.innovate.net
Fri Feb 19 12:33:49 UTC 2016



> Date: Friday, February 19, 2016 11:08:48 +0000
> From: Ian B <ibrierley at gmail.com>
>
> On Fri, Feb 19, 2016 at 10:56 AM, Ian B <ibrierley at gmail.com>
> wrote:
> 
>> Hi all,
>> 
>> We have a development server on which we have just updated the
>> kernel & glibc after recent recommendations. It's been stable
>> previously for a few years with only scheduled reboots.
>> 
>> It's running
>> Centos 6.6(final)
>> 2.6.32-573.18.1.el6.x86_64
>> GNU libc 2.12
>> 
>> Upgraded via YUM and rebooted. All was fine for several hours, and
>> then the network seemed to hang. Not much is happening on it, as
>> it's a dev server we are testing before moving to production.
>> 
>> Googling, I see there is some history of the e1000e driver having
>> issues, and I'm wondering if it could be related.
>> 
>> Does anyone have any thoughts on where to go with it, as I'm
>> assuming it will hang again later?
>> 
>> Thanks, Ian
>> 
>> Feb 18 05:04:36 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
>> Feb 18 05:04:36 kernel: Hardware name: X9SCL/X9SCM
>> Feb 18 05:04:36 kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
>> Feb 18 05:04:36 kernel: Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext4 jbd2 e1000e serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support shpchp ext3 jbd mbcache raid1 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
>> Feb 18 05:04:36 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-220.4.2.el6.x86_64 #1
>> Feb 18 05:04:36 kernel: Call Trace:
>> Feb 18 05:04:36 kernel: <IRQ>  [<ffffffff81069a17>] ? warn_slowpath_common+0x87/0xc0
>> Feb 18 05:04:36 kernel: [<ffffffff81069b06>] ? warn_slowpath_fmt+0x46/0x50
>> Feb 18 05:04:36 kernel: [<ffffffff8144a4fd>] ? dev_watchdog+0x26d/0x280
>> Feb 18 05:04:36 kernel: [<ffffffff8108b3fd>] ? insert_work+0x6d/0xb0
>> Feb 18 05:04:36 kernel: [<ffffffff8144a290>] ? dev_watchdog+0x0/0x280
>> Feb 18 05:04:36 kernel: [<ffffffff8107c7f7>] ? run_timer_softirq+0x197/0x340
>> Feb 18 05:04:36 kernel: [<ffffffff810a0a10>] ? tick_sched_timer+0x0/0xc0
>> Feb 18 05:04:36 kernel: [<ffffffff8102ad6d>] ? lapic_next_event+0x1d/0x30
>> Feb 18 05:04:36 kernel: [<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
>> Feb 18 05:04:36 kernel: [<ffffffff81095610>] ? hrtimer_interrupt+0x140/0x250
>> Feb 18 05:04:36 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
>> Feb 18 05:04:36 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
>> Feb 18 05:04:36 kernel: [<ffffffff81071de5>] ? irq_exit+0x85/0x90
>> Feb 18 05:04:36 kernel: [<ffffffff814f4d70>] ? smp_apic_timer_interrupt+0x70/0x9b
>> Feb 18 05:04:36 kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
>> Feb 18 05:04:36 kernel: <EOI>  [<ffffffff812c49de>] ? intel_idle+0xde/0x170
>> Feb 18 05:04:36 kernel: [<ffffffff812c49c1>] ? intel_idle+0xc1/0x170
>> Feb 18 05:04:36 kernel: [<ffffffff813f9ef7>] ? cpuidle_idle_call+0xa7/0x140
>> Feb 18 05:04:36 kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
>> Feb 18 05:04:36 kernel: [<ffffffff814d40ca>] ? rest_init+0x7a/0x80
>> Feb 18 05:04:36 kernel: [<ffffffff81c1ff76>] ? start_kernel+0x424/0x430
>> Feb 18 05:04:36 kernel: [<ffffffff81c1f33a>] ? x86_64_start_reservations+0x125/0x129
>> Feb 18 05:04:36 kernel: [<ffffffff81c1f438>] ? x86_64_start_kernel+0xfa/0x109
>> Feb 18 05:04:36 kernel: ---[ end trace 21915186e9d87b29 ]---
>> 
>> modinfo e1000e | grep version
>> version:        3.2.5-k
>> srcversion:     8CCA78B3C15DE6229299348
>> vermagic:       2.6.32-573.18.1.el6.x86_64 SMP mod_unload modversions
>> 
>> 
>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 Processor Family DRAM Controller (rev 09)
>> 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
>> 00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
>> 00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
>> 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
>> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
>> 00:1f.0 ISA bridge: Intel Corporation C202 Chipset Family LPC Controller (rev 05)
>> 00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 05)
>> 00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
>> 02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
>> 03:03.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)
>> 

> Just noticed that the trace shows an old kernel (2.6.32-220.4.2), so
> I don't think grub was automatically selecting the latest kernel.
> What process updates the default to be the latest kernel? And could
> the problem be that grub booted an older kernel while other
> packages were updated?
> 

If your machine is "running Centos 6.6(final)" but you've installed
the new kernel and glibc, that implies that you are selectively
applying updates. The 6.7 point release came out last fall. In
addition to the security implications of not fully updating the
system, you may have missed packages that affect networking.

You may want to do a full update of the system (yum update) and then
see how it behaves -- it's hard to debug a system that may have
mismatched pieces.
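
As a rough illustration of spotting that kind of mismatch, the sketch
below pulls the kernel release out of the "Not tainted" line of the
trace and compares it with the vermagic that modinfo reported for
e1000e. The helper names and regular expressions are my own
assumptions for illustration, not part of any CentOS tool; the two
sample strings are taken from the original post.

```python
import re

def kernel_from_trace(line):
    """Return the kernel release named in a 'Not tainted <release> #N' log line."""
    m = re.search(r"Not tainted\s+(\S+)\s+#\d+", line)
    return m.group(1) if m else None

def kernel_from_vermagic(line):
    """Return the kernel release from a modinfo 'vermagic:' line."""
    m = re.match(r"\s*vermagic:\s+(\S+)", line)
    return m.group(1) if m else None

# Sample lines copied from the log and modinfo output in the original post.
trace_line = ("Feb 18 05:04:36 kernel: Pid: 0, comm: swapper "
              "Not tainted 2.6.32-220.4.2.el6.x86_64 #1")
vermagic_line = "vermagic:       2.6.32-573.18.1.el6.x86_64 SMP mod_unload modversions"

running = kernel_from_trace(trace_line)
module_built_for = kernel_from_vermagic(vermagic_line)
mismatch = running != module_built_for

print("kernel in trace:   ", running)
print("module vermagic:   ", module_built_for)
print("releases differ:   ", mismatch)
```

A difference here only says that the trace and the modinfo output came
from different kernels, not which one is booted right now; "uname -r"
answers that.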

To see which kernel your grub is set to load by default, look at
/boot/grub/grub.conf -- the "default=" line (normally "0") gives the
zero-based index of the listed entry that will be booted.
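
If you'd rather not count entries by eye, something like the following
sketch would report which title the "default=" index points at. It
assumes the legacy GRUB layout used on CentOS 6 ("default=N" followed
by "title ..." stanzas); the function name and the sample
configuration, including the root= device, are made up for
illustration.

```python
def default_grub_entry(conf_text):
    """Return (default_index, title) for a legacy GRUB grub.conf."""
    default = 0  # legacy GRUB boots entry 0 when no default= line is present
    titles = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("default"):
            # Accepts both "default=1" and "default 1"; non-numeric values
            # such as "saved" are ignored and the index stays at 0.
            value = line[len("default"):].lstrip(" =\t")
            if value.isdigit():
                default = int(value)
        elif line.startswith("title"):
            titles.append(line[len("title"):].strip())
    title = titles[default] if default < len(titles) else None
    return default, title

# Hypothetical grub.conf in which the older kernel is still the default.
SAMPLE = """\
default=1
timeout=5
title CentOS (2.6.32-573.18.1.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-573.18.1.el6.x86_64 ro root=/dev/md1
title CentOS (2.6.32-220.4.2.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-220.4.2.el6.x86_64 ro root=/dev/md1
"""

index, title = default_grub_entry(SAMPLE)
print("default entry %d: %s" % (index, title))
```

On a real machine you would pass in the contents of
/boot/grub/grub.conf (reading it normally requires root).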

If the "default" value isn't "0", and/or the newest kernel isn't the
first entry, then you have something mucking with things. Check your
/etc/sysconfig/kernel file for starters: with UPDATEDEFAULT=yes,
installing a new kernel package should make it the default.
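
For reference, a minimal sketch of checking that file from a script.
It assumes the usual KEY=value sysconfig format and the UPDATEDEFAULT
and DEFAULTKERNEL settings that CentOS 6 ships there; the function
names and the sample text are illustrative only.

```python
def parse_sysconfig(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings

def new_kernels_become_default(settings):
    """True if newly installed kernels should be made the grub default."""
    return settings.get("UPDATEDEFAULT", "").lower() == "yes"

# Illustrative contents; on a real system read /etc/sysconfig/kernel instead.
SAMPLE = """\
# UPDATEDEFAULT specifies if new-kernel-pkg should make
# new kernels the default
UPDATEDEFAULT=no

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel
"""

settings = parse_sysconfig(SAMPLE)
print(settings)
print("new kernels become default:", new_kernels_become_default(settings))
```

If this reports False on your machine, that would explain a new kernel
being installed while grub keeps booting the old one.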
