[CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

Thu Feb 14 04:08:54 UTC 2008
Steven Haigh <netwiz at crc.id.au>

> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of nate
> Sent: Thursday, 14 February 2008 2:46 PM
> To: centos at centos.org
> Subject: Re: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.
> 
> Indunil Jayasooriya wrote:
> 
> > I also got this type of problem once before. Please check the initrd
> > image. Please perform the steps below.
> >
> While it's always good to make sure your initrd is in a good state,
> the network drivers don't need to be in the initrd (unless you're booting
> from NFS or something). They can be loaded fine from
> /lib/modules/`uname -r`
> 
> What kind of network chip(s) are in the system? What driver are they
> using (see /etc/modprobe.conf)? It'd be helpful to have the output of
> dmesg as well from the kernel that doesn't provide networking support.

The network is an e100 - dmesg shows the following:
	# dmesg | grep e100:
	e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
	e100: Copyright(c) 1999-2005 Intel Corporation
	e100: eth0: e100_probe: addr 0xdfffe000, irq 169, MAC addr 00:02:B3:8B:BE:26
	e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex

Of course, this doesn't give us the exact chip, but mii-tool is a bit more
helpful:
	# mii-tool -v eth0
	eth0: negotiated 100baseTx-FD, link ok
	  product info: Intel 82555 rev 4
	  basic mode:   autonegotiation enabled
	  basic status: autonegotiation complete, link ok
	  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
	  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
	  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

The interesting part for me, however, is that certain things unrelated to the
network also fail. I would expect iptables to come up as OK on boot - even
if no network device was configured - as it's independent of network
configuration. A network problem also doesn't explain why the firmware
microcode update fails.

> You could write a script for some person at the remote co-lo to execute
> when the system comes up w/o network, the results could be stored in
> a file on the disk and when the system is rebooted again under the
> old kernel you can examine them for possible causes.
> 
> Some commands to try:
> dmesg
> ifconfig -a
> mii-tool
> route -n
> ping -c 5 (IP of default gateway)
> arping -c 5 (IP of default gateway)
> arp -an
> lsmod
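
The command list above could be wrapped in something like the following. This
is only a rough sketch: the function name collect_diag, the log path and the
gateway address are placeholders of mine, not anything from this system. Each
command's output is appended to one log file, and a missing or failing command
is noted rather than stopping the run.

```sh
#!/bin/sh
# Sketch: run each diagnostic command and append its output to a log file.
# collect_diag is a hypothetical helper; log path and gateway are arguments.

collect_diag() {
    log=$1
    gateway=$2
    {
        echo "=== diagnostics: $(date), kernel $(uname -r) ==="
        for cmd in "dmesg" "ifconfig -a" "mii-tool" "route -n" \
                   "ping -c 5 $gateway" "arping -c 5 $gateway" \
                   "arp -an" "lsmod"; do
            echo "--- $cmd ---"
            # $cmd is left unquoted on purpose so it splits into words.
            # A failing or missing command is recorded and the loop goes on.
            $cmd 2>&1 || echo "(exit status $?)"
        done
    } >> "$log"
    # Flush to disk before anything reboots the machine.
    sync
}

# Example call, left commented out (192.0.2.1 is a documentation address):
# collect_diag /root/netdiag.log 192.0.2.1
```

Writing to a file on the local disk means the results survive the reboot into
the old kernel, where they can be read over the network as usual.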

I have a bit of trouble with this, as the only person who can do it is
around 30 minutes' travel from the colo. I'm thinking of writing a script
that runs as the system boots, gathers this output, changes the default=x
line in /etc/grub.conf and then reboots the system - however obviously I
want to make sure it works 100% before I tell the machine to reboot ;)
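
A rough sketch of the default=x part, so it can be tested on a copy of the
file first. The function name set_grub_default and the entry index in the
example are my own placeholders. It assumes a GRUB legacy style config with
one "default=N" line and "title" lines at the start of a line, and it refuses
to touch the file if the index is not a number or there is no such entry.

```sh
#!/bin/sh
# Sketch: set the "default=N" line in a GRUB legacy config, keeping a backup.
# set_grub_default is a hypothetical helper; path and index are arguments.

set_grub_default() {
    conf=$1
    index=$2

    # Refuse anything that is not a plain non-negative integer.
    case $index in
        ''|*[!0-9]*) echo "bad index: $index" >&2; return 1 ;;
    esac

    # Refuse to edit a file that has no default= line to begin with.
    grep -q '^default=' "$conf" || {
        echo "no default= line in $conf" >&2; return 1; }

    # Refuse an index with no matching "title" entry (entries count from 0).
    entries=$(grep -c '^title' "$conf") || entries=0
    [ "$index" -lt "$entries" ] || {
        echo "only $entries entries in $conf" >&2; return 1; }

    cp -p "$conf" "$conf.bak" || return 1
    sed "s/^default=.*/default=$index/" "$conf.bak" > "$conf.new" || return 1
    # Write through $conf instead of mv, so that a symlinked config path
    # stays a symlink.
    cat "$conf.new" > "$conf" && rm -f "$conf.new"
}

# Example, left commented out (index 1 is only an illustration):
# set_grub_default /etc/grub.conf 1 && /sbin/shutdown -r now
```

Running it against a copy in /tmp and diffing the result against the original
is a cheap way to check it before it is allowed near a reboot.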

--
Steven Haigh

Email: netwiz at crc.id.au
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897