I reimaged a compute node on our cluster with the latest 5.3 updates (we were previously running 5.2), but we kept the kernel at 2.6.18-92.1.10.el5 until I can find time to rebuild some of our kernel modules. After the image install finishes and the system reboots, the eth0 Ethernet interface disappears. If I do an ifconfig -a, I see what should be eth0, but it's listed as __tmp2081258173.
[root@node0770 ~]# ifconfig -a
__tmp2081258173 Link encap:Ethernet  HWaddr 00:1E:68:86:67:04
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:66
The dmesg output isn't very helpful:
[root@node0770 ~]# dmesg|grep eth0
eth0: forcedeth.c: subsystem: 0108e:534b bound to 0000:00:08.0
If I remove our Lustre modules that were built for the 2.6.18-92.1.10.el5 kernel and reboot, the eth0 interface reappears. Another piece to this puzzle is that this problem only seems to happen on our Sun X2200s. Our Dell 1950s work just fine after putting on the 5.3 updates. Anyone know what could cause this behavior?
Thanks, Randy
At Tue, 19 May 2009 09:04:43 -0400 CentOS mailing list centos@centos.org wrote:
Check /etc/modprobe.conf (and /etc/sysconfig/network-scripts/ifcfg-eth0). If you are doing a disk-to-disk backup type of install, the alias for eth0 is very likely wrong, and so is the HW address in /etc/sysconfig/network-scripts/ifcfg-eth0. You may have to update these two files manually on the 'new' machine, since it likely has a different NIC that requires a different driver. It will also have a different MAC (HW) address. In the old days, kudzu would detect this and pop up during the boot process.
What does lspci display?
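One way to act on this suggestion is to list which driver each ethN alias names, then compare that with the driver the kernel actually bound to the interface. The following is only a sketch: the helper name list_eth_aliases is made up for illustration, and the sysfs path in the trailing comment is an assumption about where the running kernel exposes the bound driver.

```shell
#!/bin/sh
# list_eth_aliases FILE
# Print "interface driver" for every "alias ethN driver" line in FILE.
# (Hypothetical helper for illustration, not part of any CentOS package.)
list_eth_aliases() {
    awk '$1 == "alias" && $2 ~ /^eth[0-9]+$/ { print $2, $3 }' "$1"
}

# Typical use on the node (assumes /etc/modprobe.conf exists):
#   list_eth_aliases /etc/modprobe.conf
# For comparison, the driver currently bound to an interface can usually be
# read from sysfs (assumption: the device/driver symlink is present):
#   basename "$(readlink /sys/class/net/eth0/device/driver)"
```

If the alias and the bound driver disagree for an interface, that file is the first thing to correct.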
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
From: Robert Heller heller@deepsoft.com
Organization: Deepwoods Software
Reply-To: CentOS mailing list centos@centos.org
Date: Tue, 19 May 2009 09:46:15 -0400
To: CentOS mailing list centos@centos.org
Cc: centos@centos.org
Subject: Re: [CentOS] Weird CentOS 5.3 problem
We add the two lines at the end of modprobe.conf for lustre.
alias eth0 tg3
alias eth1 tg3
alias eth2 forcedeth
alias eth3 forcedeth
alias scsi_hostadapter sata_nv
options lnet networks="tcp0(eth0)"
options ksocklnd enable_irq_affinity=0
The /etc/sysconfig/network-scripts/ifcfg-eth0 has the correct settings for this host. We actually generate this file during the post-install. Here's what it looks like:
DEVICE=eth0
BOOTPROTO=none
STARTMODE=onboot
ONBOOT=yes
USERCTL=no
TYPE=Ethernet
IPV6INIT=no
IPADDR=192.168.3.91
BROADCAST=192.168.255.255
NETMASK=255.255.0.0
GATEWAY=192.168.100.1
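The file above has no HWADDR line, so nothing in it ties the name eth0 to one particular NIC. A quick check along those lines might look like the sketch below. The helper names ifcfg_value and check_hwaddr are hypothetical, and the parsing assumes plain unquoted KEY=value lines like the ones shown.

```shell
#!/bin/sh
# ifcfg_value KEY FILE
# Print the value of KEY from a KEY=value style ifcfg file (empty if absent).
# Assumes unquoted values; if KEY appears more than once, the last one wins.
ifcfg_value() {
    sed -n "s/^$1=//p" "$2" | tail -n 1
}

# check_hwaddr FILE MAC
# Report whether FILE pins the interface to MAC (case-insensitive compare).
check_hwaddr() {
    want=$(ifcfg_value HWADDR "$1" | tr 'a-f' 'A-F')
    have=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    if [ -z "$want" ]; then
        echo "no HWADDR set"
    elif [ "$want" = "$have" ]; then
        echo "HWADDR matches"
    else
        echo "HWADDR mismatch: file has $want, NIC has $have"
    fi
}

# Example, using the MAC reported by ifconfig -a earlier in the thread:
#   check_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0 00:1E:68:86:67:04
```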
Here's the lspci output:
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a3)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:06.0 PCI bridge: nVidia Corporation MCP55 PCI bridge (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:09.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:0a.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0f.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Link Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Miscellaneous Control
00:19.4 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Link Control
01:05.0 VGA compatible controller: ASPEED Technology, Inc. AST2000
02:00.0 Ethernet controller: MYRICOM Inc. Myri-10G Dual-Protocol NIC
05:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev b5)
06:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)
06:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)
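Note that the dmesg line earlier in the thread shows forcedeth registering eth0 at 0000:00:08.0, while modprobe.conf aliases eth0 to tg3, so it may be worth lining up the Ethernet devices in this listing against the aliases. The sketch below filters lspci output down to the Ethernet devices and guesses a driver for each. The function name guess_driver is hypothetical, and the vendor-to-driver mapping (nVidia to forcedeth, Broadcom to tg3, Myricom to myri10ge) is an assumption based on the usual in-kernel drivers, not something confirmed for this node.

```shell
#!/bin/sh
# guess_driver: read `lspci` output on stdin and print "slot driver" for each
# line that mentions Ethernet. The mapping below is an assumption for
# illustration and only covers the vendors seen in this thread.
guess_driver() {
    while IFS= read -r line; do
        case "$line" in
            *[Ee]thernet*) ;;          # keep Ethernet devices of any PCI class
            *) continue ;;
        esac
        slot=${line%% *}               # first field, e.g. 00:08.0
        case "$line" in
            *nVidia*)   drv=forcedeth ;;
            *Broadcom*) drv=tg3 ;;
            *MYRICOM*)  drv=myri10ge ;;
            *)          drv=unknown ;;
        esac
        echo "$slot $drv"
    done
}

# Example on the node:
#   lspci | guess_driver
```

The MCP55 ports appear under the "Bridge" class rather than "Ethernet controller", which is why the filter matches on the word Ethernet anywhere in the line.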
We tried upgrading to the latest tg3 Ethernet driver, but the symptoms did not change.
-Randy