> Date: Friday, February 19, 2016 12:47:54 +0000
> From: Ian B <ibrierley at gmail.com>
>
> On Fri, Feb 19, 2016 at 12:33 PM, Richard wrote:
>
>> > Date: Friday, February 19, 2016 11:08:48 +0000
>> > From: Ian B <ibrierley at gmail.com>
>> >
>> > On Fri, Feb 19, 2016 at 10:56 AM, Ian B <ibrierley at gmail.com>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> We have a development server on which we have just tried updating
>> >> the kernel & glibc after recent recommendations. It's been stable
>> >> previously for a few years with only scheduled reboots.
>> >>
>> >> It's running
>> >> Centos 6.6(final)
>> >> 2.6.32-573.18.1.el6.x86_64
>> >> GNU libc 2.12
>> >>
>> >> Upgraded via YUM, rebooted, all fine for several hours, and
>> >> then network seemed to hang. Not much happening as it's a dev
>> >> server we are testing before moving to production.
>> >>
>> >> Googling, I see there is some history of the e1000e driver having
>> >> issues, and I'm wondering if it could be related.
>> >>
>> >> Does anyone have any thoughts on what to do with it, as I'm
>> >> assuming it will hang again later.
>> >>
>> >> Thanks, Ian
>> >>
>> >> Feb 18 05:04:36 kernel: WARNING: at net/sched/sch_generic.c:261
>> >> dev_watchdog+0x26d/0x280() (Not tainted)
>> >> Feb 18 05:04:36 kernel: Hardware name: X9SCL/X9SCM
>> >> Feb 18 05:04:36 kernel: NETDEV WATCHDOG: eth0 (e1000e):
>> >> transmit queue 0 timed out
>> >> Feb 18 05:04:36 kernel: Modules linked in: ip6t_REJECT
>> >> nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack
>> >> ip6table_filter ip6_tables ipv6 ext4 jbd2 e1000e serio_raw
>> >> i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support shpchp ext3
>> >> jbd mbcache raid1 sd_mod crc_t10dif ahci dm_mirror
>> >> dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
>> >> Feb 18 05:04:36 kernel: Pid: 0, comm: swapper Not tainted
>> >> 2.6.32-220.4.2.el6.x86_64 #1
>> >> Feb 18 05:04:36 kernel: Call Trace:
>> >> Feb 18 05:04:36 kernel: <IRQ> [<ffffffff81069a17>] ?
>> >> warn_slowpath_common+0x87/0xc0
>> >> Feb 18 05:04:36 kernel: [<ffffffff81069b06>] ? warn_slowpath_fmt+0x46/0x50
>> >> Feb 18 05:04:36 kernel: [<ffffffff8144a4fd>] ? dev_watchdog+0x26d/0x280
>> >> Feb 18 05:04:36 kernel: [<ffffffff8108b3fd>] ? insert_work+0x6d/0xb0
>> >> Feb 18 05:04:36 kernel: [<ffffffff8144a290>] ? dev_watchdog+0x0/0x280
>> >> Feb 18 05:04:36 kernel: [<ffffffff8107c7f7>] ? run_timer_softirq+0x197/0x340
>> >> Feb 18 05:04:36 kernel: [<ffffffff810a0a10>] ? tick_sched_timer+0x0/0xc0
>> >> Feb 18 05:04:36 kernel: [<ffffffff8102ad6d>] ? lapic_next_event+0x1d/0x30
>> >> Feb 18 05:04:36 kernel: [<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
>> >> Feb 18 05:04:36 kernel: [<ffffffff81095610>] ? hrtimer_interrupt+0x140/0x250
>> >> Feb 18 05:04:36 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
>> >> Feb 18 05:04:36 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
>> >> Feb 18 05:04:36 kernel: [<ffffffff81071de5>] ? irq_exit+0x85/0x90
>> >> Feb 18 05:04:36 kernel: [<ffffffff814f4d70>] ? smp_apic_timer_interrupt+0x70/0x9b
>> >> Feb 18 05:04:36 kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
>> >> Feb 18 05:04:36 kernel: <EOI> [<ffffffff812c49de>] ? intel_idle+0xde/0x170
>> >> Feb 18 05:04:36 kernel: [<ffffffff812c49c1>] ? intel_idle+0xc1/0x170
>> >> Feb 18 05:04:36 kernel: [<ffffffff813f9ef7>] ? cpuidle_idle_call+0xa7/0x140
>> >> Feb 18 05:04:36 kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
>> >> Feb 18 05:04:36 kernel: [<ffffffff814d40ca>] ? rest_init+0x7a/0x80
>> >> Feb 18 05:04:36 kernel: [<ffffffff81c1ff76>] ? start_kernel+0x424/0x430
>> >> Feb 18 05:04:36 kernel: [<ffffffff81c1f33a>] ? x86_64_start_reservations+0x125/0x129
>> >> Feb 18 05:04:36 kernel: [<ffffffff81c1f438>] ?
>> >> x86_64_start_kernel+0xfa/0x109
>> >> Feb 18 05:04:36 kernel: ---[ end trace 21915186e9d87b29 ]---
>> >>
>> >> modinfo e1000e | grep version
>> >> version: 3.2.5-k
>> >> srcversion: 8CCA78B3C15DE6229299348
>> >> vermagic: 2.6.32-573.18.1.el6.x86_64 SMP mod_unload
>> >> modversions
>> >>
>> >>
>> >> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 Processor
>> >> Family DRAM Controller (rev 09)
>> >> 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series
>> >> Chipset Family USB Enhanced Host Controller #2 (rev 05)
>> >> 00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series
>> >> Chipset Family PCI Express Root Port 1 (rev b5)
>> >> 00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series
>> >> Chipset Family PCI Express Root Port 5 (rev b5)
>> >> 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series
>> >> Chipset Family USB Enhanced Host Controller #1 (rev 05)
>> >> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
>> >> 00:1f.0 ISA bridge: Intel Corporation C202 Chipset Family LPC
>> >> Controller (rev 05)
>> >> 00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series
>> >> Chipset Family SATA AHCI Controller (rev 05)
>> >> 00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset
>> >> Family SMBus Controller (rev 05)
>> >> 02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit
>> >> Network Connection
>> >> 03:03.0 VGA compatible controller: Matrox Electronics Systems
>> >> Ltd. MGA G200eW WPCM450 (rev 0a)
>> >>
>>
>> > Just noticed that in the trace, it shows an old kernel, so I
>> > don't think grub was automatically selecting the latest kernel.
>> > Just wondering what process updates the default to be the latest
>> > kernel, and if a problem could be an update but grub selecting
>> > an older kernel, but other packages updated ?
>> >
>>
>> If your machine is "running Centos 6.6(final)", but you've
>> installed the new kernel and glibc that implies that you are
>> selectively applying updates.
>> The 6.7 point release came out last fall. In addition to the
>> security implications of not fully updating the system, you may
>> have missed packages that are impacting networking.
>>
>> You may want to do a full update of the system and then see how
>> it acts -- it's hard to debug a system that may have mismatched
>> pieces.
>>
>> To see which kernel your grub is set to load by default, look at
>> the grub.conf -- the "default=" line (normally "0") indicates
>> which of the listed kernels will be selected.
>>
>> If the "default" value isn't "0", and/or the newest kernel isn't
>> the first entry, then you have something mucking with things.
>> Check your /etc/sysconfig/kernel file for starters.
>>
>>
> Thanks Richard,
>
> We currently do all security updates at short notice (as opposed to
> everything), via a script. I've amended the grub config and
> rebooted to make sure it will reboot into the correct kernel now,
> and yes /etc/sysconfig/kernel was different to production servers.
> We may try all packages if it continues to be unstable now and
> maybe whatever as it's on a dev server to test.
>
> Thanks again,
>
> Ian

As Johnny Hughes pointed out last fall:

  <https://lists.centos.org/pipermail/centos/2015-December/156697.html>

selective updating like that is not supported by CentOS or RHEL.

[please don't top post.]
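
For anyone wanting to script the grub.conf check described above, here is a rough Python sketch. It assumes the GRUB legacy format that CentOS 6 uses (a numeric "default=" line, then "title" stanzas each containing a "kernel" line). The function name default_kernel and the SAMPLE text are made up for illustration; SAMPLE is not a copy of Ian's actual grub.conf, and forms such as "default saved" are not handled.

```python
import re

# Hypothetical grub.conf contents for illustration only. The two kernel
# versions are the ones mentioned in the thread; everything else is assumed.
SAMPLE = """\
default=1
timeout=5
title CentOS (2.6.32-573.18.1.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-573.18.1.el6.x86_64 ro root=/dev/md1
title CentOS (2.6.32-220.4.2.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-220.4.2.el6.x86_64 ro root=/dev/md1
"""


def default_kernel(grub_conf_text):
    """Return the kernel path of the stanza selected by 'default=',
    or None if no such stanza (or no kernel line in it) is found."""
    default_index = 0   # assume entry 0 when no numeric default= line exists
    entries = []        # one kernel path (or None) per "title" stanza, in order
    for raw in grub_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Keyword and argument may be separated by whitespace or '='.
        parts = re.split(r"[\s=]+", line, maxsplit=1)
        keyword = parts[0]
        arg = parts[1] if len(parts) > 1 else ""
        if keyword == "default" and arg.isdigit():
            default_index = int(arg)
        elif keyword == "title":
            entries.append(None)
        elif keyword == "kernel" and entries and arg:
            # First token is the kernel image; the rest are boot parameters.
            entries[-1] = arg.split()[0]
    if default_index >= len(entries):
        return None
    return entries[default_index]


if __name__ == "__main__":
    print(default_kernel(SAMPLE))
```

With the sample text, "default=1" selects the second stanza, so the older 2.6.32-220.4.2 kernel would be booted even though the newer one is installed and listed first. On a real box you would read the text from grub.conf (usually /boot/grub/grub.conf on CentOS 6) and compare the result against the output of "uname -r".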