[CentOS] weird network error

Tue May 10 17:54:39 UTC 2016
John R Pierce <pierce at hogranch.com>

A previously rock-solid server of mine failed last night: the server 
was still running, but eth0, an Intel 82574L using the e1000e driver, 
went down.   The server has a Supermicro X8DTE-F (dual Xeon X5650, 
yada yada).   It is a DRBD master, so DRBD was the first thing to 
notice the network issues.   Just a couple of days ago I ran yum update 
to bring it up to the latest packages; I do this about once a month.

/var/log/messages logged the following (prior to this there was nothing 
but the normal smbd complaints about CUPS not being configured):

May  9 22:30:21 sg1 kernel: block drbd0: PingAck did not arrive in time.
May  9 22:30:21 sg1 kernel: block drbd0: peer( Secondary -> Unknown ) 
conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May  9 22:30:21 sg1 kernel: block drbd0: asender terminated
May  9 22:30:21 sg1 kernel: block drbd0: Terminating drbd0_asender
May  9 22:30:22 sg1 kernel: block drbd0: new current UUID 
BC856D7A6F94F041:237F4033E81B62DF:1E248D699B6793A9:1E238D699B6793A9
May  9 22:30:22 sg1 kernel: block drbd0: Connection closed
May  9 22:30:22 sg1 kernel: block drbd0: conn( NetworkFailure -> 
Unconnected )
May  9 22:30:22 sg1 kernel: block drbd0: receiver terminated
May  9 22:30:22 sg1 kernel: block drbd0: Restarting drbd0_receiver
May  9 22:30:22 sg1 kernel: block drbd0: receiver (re)started
May  9 22:30:22 sg1 kernel: block drbd0: conn( Unconnected -> 
WFConnection )
May  9 22:30:34 sg1 kernel: ------------[ cut here ]------------
May  9 22:30:34 sg1 kernel: WARNING: at net/sched/sch_generic.c:261 
dev_watchdog+0x26b/0x280() (Not tainted)
May  9 22:30:34 sg1 kernel: Hardware name: ISS3500
May  9 22:30:34 sg1 kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit 
queue 0 timed out
May  9 22:30:34 sg1 kernel: Modules linked in: drbd(U) nfsd max6650 
coretemp adm1021 ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache 
auth_rpcgss nfs_acl sunrpc cpufreq_ondemand acpi_cpufreq freq_table 
mperf ipv6 xfs exportfs microcode iTCO_wdt iTCO_vendor_support joydev 
serio_raw i2c_i801 i2c_core lpc_ich mfd_core e1000e(U) ptp pps_core 
ioatdma dca i7core_edac edac_core ses enclosure sg ext4 jbd2 mbcache 
sd_mod crc_t10dif ahci megaraid_sas mpt2sas scsi_transport_sas 
raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
scsi_wait_scan]
May  9 22:30:34 sg1 kernel: Pid: 0, comm: swapper Not tainted 
2.6.32-573.22.1.el6.x86_64 #1
May  9 22:30:34 sg1 kernel: Call Trace:
May  9 22:30:34 sg1 kernel: <IRQ> [<ffffffff81077821>] ? 
warn_slowpath_common+0x91/0xe0
May  9 22:30:34 sg1 kernel: [<ffffffff81077926>] ? 
warn_slowpath_fmt+0x46/0x60
May  9 22:30:34 sg1 kernel: [<ffffffff8148d64b>] ? dev_watchdog+0x26b/0x280
May  9 22:30:34 sg1 kernel: [<ffffffff8109aded>] ? insert_work+0x6d/0xb0
May  9 22:30:34 sg1 kernel: [<ffffffff81089bd5>] ? 
internal_add_timer+0xb5/0x110
May  9 22:30:34 sg1 kernel: [<ffffffff8148d3e0>] ? dev_watchdog+0x0/0x280
May  9 22:30:34 sg1 kernel: [<ffffffff8108a867>] ? 
run_timer_softirq+0x197/0x340
May  9 22:30:34 sg1 kernel: [<ffffffff8103579d>] ? 
lapic_next_event+0x1d/0x30
May  9 22:30:34 sg1 kernel: [<ffffffff81080361>] ? __do_softirq+0xc1/0x1e0
May  9 22:30:34 sg1 kernel: [<ffffffff810b322f>] ? 
tick_program_event+0x2f/0x40
May  9 22:30:34 sg1 kernel: [<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
May  9 22:30:34 sg1 kernel: [<ffffffff8100fc25>] ? do_softirq+0x65/0xa0
May  9 22:30:34 sg1 kernel: [<ffffffff81080215>] ? irq_exit+0x85/0x90
May  9 22:30:34 sg1 kernel: [<ffffffff815435ba>] ? 
smp_apic_timer_interrupt+0x4a/0x60
May  9 22:30:34 sg1 kernel: [<ffffffff8100bc13>] ? 
apic_timer_interrupt+0x13/0x20
May  9 22:30:34 sg1 kernel: <EOI> [<ffffffff812f1a5e>] ? 
intel_idle+0xfe/0x1b0
May  9 22:30:34 sg1 kernel: [<ffffffff812f1a41>] ? intel_idle+0xe1/0x1b0
May  9 22:30:34 sg1 kernel: [<ffffffff8143413a>] ? 
cpuidle_idle_call+0x7a/0xe0
May  9 22:30:34 sg1 kernel: [<ffffffff81009fe6>] ? cpu_idle+0xb6/0x110
May  9 22:30:34 sg1 kernel: [<ffffffff81532912>] ? 
start_secondary+0x2c0/0x316
May  9 22:30:34 sg1 kernel: ---[ end trace 883800817e091e53 ]---
May  9 22:30:34 sg1 kernel: e1000e 0000:03:00.0: eth0: Reset adapter 
unexpectedly
May  9 22:30:35 sg1 abrt-dump-oops: Reported 1 kernel oopses to Abrt
May  9 22:30:35 sg1 abrtd: Directory 'oops-2016-05-09-22:30:35-8763-1' 
creation detected
May  9 22:30:38 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:30:42 sg1 kernel: Bridge firewalling registered
May  9 22:31:27 sg1 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
May  9 22:31:32 sg1 abrtd: Can't find a meaningful backtrace for hashing 
in '.'
May  9 22:31:32 sg1 abrtd: Preserving oops '.' because 
DropNotReportableOopses is '(not set)'
May  9 22:31:32 sg1 abrtd: Looking for kernel package
May  9 22:31:32 sg1 abrtd: Kernel package 
kernel-2.6.32-573.22.1.el6.x86_64 found
May  9 22:31:33 sg1 abrtd: New problem directory 
/var/spool/abrt/oops-2016-05-09-22:30:35-8763-1, processing
May  9 22:31:33 sg1 abrtd: Sending an email...
May  9 22:31:34 sg1 abrtd: Email was sent to: root@localhost
May  9 22:32:25 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 22:32:30 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:34:55 sg1 kernel: e1000e 0000:03:00.0: eth0: Reset adapter 
unexpectedly
May  9 22:34:59 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:37:25 sg1 kernel: e1000e 0000:03:00.0: eth0: Reset adapter 
unexpectedly
May  9 22:37:30 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:39:50 sg1 kernel: e1000e 0000:03:00.0: eth0: Reset adapter 
unexpectedly
May  9 22:39:55 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:41:30 sg1 kernel: e1000e 0000:03:00.0: eth0: Reset adapter 
unexpectedly
May  9 22:41:35 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:44:00 sg1 kernel: e1000e 0000:03:00.0: eth0: Reset adapter 
unexpectedly
May  9 22:44:05 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:46:28 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 22:46:33 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:50:05 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 22:50:09 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:52:56 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 22:53:01 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:55:30 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 22:55:35 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 22:59:17 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 22:59:22 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:01:45 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:01:50 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:05:02 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:05:07 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:07:19 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:07:23 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:09:34 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:09:38 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:11:47 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:11:52 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:14:27 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:14:31 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:16:38 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:16:42 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:19:08 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:19:12 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:22:18 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:22:22 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:26:52 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:26:57 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:31:24 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:31:29 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:33:43 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:33:47 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:36:30 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:36:35 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:39:45 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:39:50 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:41:58 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:42:03 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:45:04 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:45:08 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:47:19 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:47:24 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:52:06 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:52:11 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:55:05 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:55:09 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
May  9 23:57:31 sg1 kernel: e1000e: eth0 NIC Link is Down
May  9 23:57:36 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: Rx/Tx
(this repeated endlessly until I forced a reboot this morning)
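
For anyone who wants to tally these events in their own logs, here is a 
minimal sketch, assuming Python 3 is available and that each syslog 
entry sits on a single line (the excerpt above is only wrapped by the 
mail client). The function name tally_events and the event labels are 
my own, made up for illustration; the patterns are copied from the 
e1000e messages shown above.

```python
import re
from collections import Counter

# Hypothetical labels mapped to patterns taken from the log excerpt.
EVENTS = {
    "watchdog_timeout": re.compile(r"NETDEV WATCHDOG: \S+ \(e1000e\)"),
    "adapter_reset": re.compile(r"e1000e .*Reset adapter"),
    "link_down": re.compile(r"e1000e: \S+ NIC Link is Down"),
    "link_up": re.compile(r"e1000e: \S+ NIC Link is Up"),
}


def tally_events(lines):
    """Count how many lines match each e1000e event pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in EVENTS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # a line is counted under at most one label
    return counts


# A few entries from the excerpt, unwrapped to one line each.
sample = [
    "May  9 22:30:34 sg1 kernel: NETDEV WATCHDOG: eth0 (e1000e): "
    "transmit queue 0 timed out",
    "May  9 22:30:34 sg1 kernel: e1000e 0000:03:00.0: eth0: "
    "Reset adapter unexpectedly",
    "May  9 22:30:38 sg1 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps "
    "Full Duplex, Flow Control: Rx/Tx",
    "May  9 22:32:25 sg1 kernel: e1000e: eth0 NIC Link is Down",
    "May  9 22:30:42 sg1 kernel: Bridge firewalling registered",
]

for name, count in sorted(tally_events(sample).items()):
    print(name, count)
```

To run it against a real log, pass an open file object instead of the 
sample list, e.g. tally_events(open("/var/log/messages")).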



-- 
john r pierce, recycling bits in santa cruz