On 06/09/2014 11:34 AM, James B. Byrne wrote:
On Fri, June 6, 2014 09:58, Alexander Dalloz wrote:
On 06.06.2014 14:50, James B. Byrne wrote:
At ~07:40 (UTC-4:00) this morning our gateway host lost its WAN Ethernet adaptor. Subsequent to recovery, which required a reboot, the following
[ ... ]
lspci -tv # provides this device tree
-[0000:00]-+-00.0  Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx DMI Bridge
           . . .
           +-1c.0-[01]--
           +-1c.4-[02]----00.0  Intel Corporation 82574L Gigabit Network Connection
           +-1c.5-[03]----00.0  Intel Corporation 82574L Gigabit Network Connection
           . . .
lspci -v -nn -k -qq -D # provides this information:
. . .
0000:02:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3]
        Subsystem: Super Micro Computer Inc Device [15d9:10d3]
        Physical Slot: 0-1
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at fe9e0000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at dc00 [size=32]
        Memory at fe9dc000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-25-90-ff-ff-61-74-c0
        Kernel driver in use: e1000e
        Kernel modules: e1000e
. . .
I have never run into this before. Can anyone cast any light on what might be going on? Is this an incipient hardware failure with one of the on-board PCI Ethernet adaptors? Is there any relationship with the SYN flood that was blacklisted immediately before the failure? I do not think so, but I need to ask.
Thanks,
https://isc.sans.edu/forums/diary/Intel+Network+Card+82574L+Packet+of+Death/...
http://www.itwalker3.com/2013/02/packet-of-death-attack-a-deadly-dos-against...
Worth verifying in your case.
Alexander
Re: Packet of Death attack: a deadly DoS against Intel NICs
It appears that my problem is caused by something else, as the EEPROM fingerprint matches the 'good' version (mostly).
ethtool -e eth0
. . .
0x0010:01 01 ff ff 6b 02 d3 10 d9 15 d3 10 ff ff 58 a5
. . .
0x0030:c9 6c 50 31 3e 07 0b 46 84 2d 40 01 00 f0 06 07
. . .
However, this matches neither the known 'bad' nor the reputed 'good' EEPROM image:
0x0060:00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff
But it seems a lot closer to the 'bad':
0x0060:ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
than to the 'good':
0x0060:20 01 00 40 16 13 ff ff ff ff ff ff ff ff ff ff
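The "closer to" judgement can be made concrete by counting differing bytes. What follows is only a minimal Python sketch: it uses nothing but the three 0x0060 rows quoted in this message, the helper names are made up for illustration, and a byte count is a rough heuristic rather than a test for the vulnerability.

```python
# Rows at EEPROM offset 0x0060, copied from this message.
OBSERVED = "00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff"
KNOWN_BAD = "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff"
KNOWN_GOOD = "20 01 00 40 16 13 ff ff ff ff ff ff ff ff ff ff"


def parse_row(row):
    """Turn a space-separated hex dump row into a bytes object."""
    return bytes.fromhex(row.replace(" ", ""))


def differing_offsets(a, b):
    """Return the positions at which two equal-length rows differ."""
    if len(a) != len(b):
        raise ValueError("rows must be the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    observed = parse_row(OBSERVED)
    for label, row in (("bad", KNOWN_BAD), ("good", KNOWN_GOOD)):
        diffs = differing_offsets(observed, parse_row(row))
        print(f"vs {label}: {len(diffs)} differing byte(s) at positions {diffs}")
```

By this count the observed row differs from the 'bad' row in 2 bytes and from the 'good' row in 5.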
I cannot find the file pod-icmp-ping.pcap so I cannot try out the recommended test using tcpreplay. The original Google code page reference is now gone.
However, ping -p 32 -s 1110 192.168.99.1 against the on-board NIC adaptors does not shut them down. I infer (so long as there is no great delay between sending the packet of death and its effects becoming manifest) that the POD was not the cause of our recent difficulty.
Hi,
Don't know if you saw my prior email, but we experienced this exact same problem; see the log excerpts below:
...
Jul 31 17:05:18 wolfpac kernel: pciehp 0000:00:1c.5:pcie04: Card not present on Slot(37)
Jul 31 17:05:18 wolfpac kernel: pciehp 0000:00:1c.5:pcie04: Card present on Slot(37)
Jul 31 17:05:18 wolfpac kernel: device eth5 left promiscuous mode
Jul 31 17:05:19 wolfpac kernel: e1000e 0000:07:00.0: PCI INT A disabled
Jul 31 17:05:20 wolfpac ntpd[2726]: Deleting interface #7 eth5, 192.168.198.95#123, interface stats: received=517, sent=522, dropped=0, active_time=108106 secs
Jul 31 17:05:20 wolfpac ntpd[2726]: Deleting interface #8 eth5, fe80::290:bff:fe2a:acf3#123, interface stats: received=0, sent=0, dropped=0, active_time=108039 secs
...
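For anyone wanting to check their own logs for the same pciehp "Card not present" / "Card present" pair, here is a rough Python sketch. The regular expression is an assumption based solely on the lines quoted above, so other kernel versions may word these messages differently, and the function name is made up for illustration.

```python
import re

# Sample lines copied from the log excerpt in this message.
SAMPLE_LOG = """\
Jul 31 17:05:18 wolfpac kernel: pciehp 0000:00:1c.5:pcie04: Card not present on Slot(37)
Jul 31 17:05:18 wolfpac kernel: pciehp 0000:00:1c.5:pcie04: Card present on Slot(37)
Jul 31 17:05:18 wolfpac kernel: device eth5 left promiscuous mode
Jul 31 17:05:19 wolfpac kernel: e1000e 0000:07:00.0: PCI INT A disabled
"""

# Assumed format: "<date> <host> kernel: pciehp <port>:<service>: Card ... on Slot(N)"
PCIEHP_EVENT = re.compile(
    r"^(?P<when>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: pciehp "
    r"(?P<port>[0-9a-f:.]+):\S+ "
    r"Card (?P<state>not present|present) on Slot\((?P<slot>\d+)\)"
)


def pciehp_events(log_text):
    """Return (timestamp, port, slot, state) for each pciehp presence message."""
    events = []
    for line in log_text.splitlines():
        m = PCIEHP_EVENT.match(line)
        if m:
            events.append((m["when"], m["port"], int(m["slot"]), m["state"]))
    return events


if __name__ == "__main__":
    for event in pciehp_events(SAMPLE_LOG):
        print(event)
```

To scan a real log, pass the contents of the syslog file to pciehp_events() in place of SAMPLE_LOG.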
This would randomly happen on systems that weren't connected directly to the internet. We experienced this on multiple systems. Since we upgraded to the latest elrepo driver and added pcie_aspm=off to our kernel command line we have never experienced the issue again.
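To confirm that pcie_aspm=off actually reached the running kernel after a reboot, something like the following Python sketch could be used. It assumes a Linux host where /proc/cmdline is readable; both function names are made up for illustration.

```python
def has_kernel_param(cmdline, name, value=None):
    """Return True if the command line contains name (optionally name=value)."""
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key != name:
            continue
        if value is None or (sep and val == value):
            return True
    return False


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

For example, has_kernel_param(running_cmdline(), "pcie_aspm", "off") should return True once the option is in effect.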