a quick and dirty hack to 'fix' the problem in a large scale -- RE: [CentOS] Nic order detection

Sat Jan 12 08:16:11 UTC 2008
Guolin Cheng <guolin at alexa.com>

Les and Michael,

There are a few ways to work around the NIC detection issue. Each has
its own advantages and limits.

The first method: if you or your team have full control of the kernel
running on your hundreds or thousands of boxes, you can build the driver
for the primary NIC statically into the kernel. A statically built
driver is initialized before any module is loaded, so its NIC is
detected as eth0 without glitches. Leave the drivers for the other NIC
types on the same box as loadable kernel modules. This works well if you
know all the types of primary NIC (typically e100, tg3, etc.) and you
have already standardized the second NIC on the boxes to one or two
types such as e1000.

The second method: if you or your team cannot rebuild the kernel, or at
least do not have full control over it, but you do know the combinations
of primary and secondary NIC types on all the Linux boxes in your
kingdom, you can try the following hack:

Add or change lines in the /lib/modules/`uname -r`/modules.dep file
according to your NIC combinations, so that the drivers always load in
your predefined order. For example:

.../e1000.ko: .../tg3.ko .../3c59x.ko .../e100.ko .../forcedeth.ko
.../forcedeth.ko: .../tg3.ko

The above means that to load the module on the left, the system first
loads the modules on the right. So tg3, 3c59x, e100 and forcedeth always
load before e1000, and tg3 loads before forcedeth. The same idea can be
applied to every NIC combination you have; it needs to be worked out
only once and can then be applied to all your Linux boxes, provided you
set it up correctly. The side effect is that you waste a few hundred
kilobytes of memory on drivers a given box does not need, but who cares?
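
To sanity-check a set of hand-edited lines before rolling them out, the
"right side loads before left side" rule can be modelled in a few lines
of Python. This is only a rough sketch, not how modprobe itself is
implemented: the function names are made up for illustration, bare file
names stand in for the elided paths in the example above, and the
relative order among the modules on the right-hand side is an
assumption (only "before the module on the left" comes from the
description above).

```python
def parse_modules_dep(text):
    """Map each module to the list of modules named after its colon."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        module, _, rest = line.partition(":")
        deps[module.strip()] = rest.split()
    return deps


def load_order(deps, target, order=None, visiting=None):
    """Return a load order for target with every dependency placed first."""
    order = [] if order is None else order
    visiting = set() if visiting is None else visiting
    if target in order or target in visiting:
        return order  # already placed, or a cycle we refuse to follow
    visiting.add(target)
    for dep in deps.get(target, []):
        load_order(deps, dep, order, visiting)
    visiting.discard(target)
    order.append(target)
    return order


# Shortened stand-in for the two example lines (real entries carry paths).
SAMPLE = """\
e1000.ko: tg3.ko 3c59x.ko e100.ko forcedeth.ko
forcedeth.ko: tg3.ko
"""

if __name__ == "__main__":
    print(load_order(parse_modules_dep(SAMPLE), "e1000.ko"))
```

Keep in mind that depmod regenerates modules.dep, so hand edits have to
be reapplied after a kernel or module update.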

There are also other tricks I have tried before; some work and some do
not. But I think the above should work for most general cases.

Have a good weekend.

--Guolin


-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
Behalf Of Michael D. Kralka
Sent: Thursday, January 10, 2008 6:52 AM
To: CentOS mailing list
Subject: Re: [CentOS] Nic order detection

Les Mikesell wrote:
> I do have the ifcfg-ethX files for the 2 interfaces that are currently
> active, but since the machines were built by image copies of a master
> disk, they do not have HWADDR address entries.  A person on-site with
> access to the console adjusted them if they didn't come up right the
> first time, but they seem to shift around on each reboot.  Will adding
> the HWADDR entry nail them down even if it doesn't match the nic type
> specified in modprobe.conf?  Can someone point me to the code where
> this
> happens?  Until recently the machines were running centos 3.x and this
> seems to be a difference in behavior.

As already pointed out, yes adding HWADDR will "nail them down" and the
entries in modprobe.conf don't mean much. If you (or a script) execute
"modprobe eth0" it will load the appropriate module. Unfortunately, this
is not how CentOS 5 loads drivers.

With CentOS 5, udev is used to load the drivers by looking at the
"modalias" file found for each device under the /sys directory (search
for them, there are many). For PCI devices, the modalias includes the 4
16-bit PCI ID values, the PCI device type, and some other information.
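
As an illustration of what such a modalias string carries, here is a
small Python sketch that splits a PCI modalias into its fields. It
assumes the usual "pci:v...d...sv...sd...bc...sc...i..." layout; the
sample string and the function name are made up for the example and
were not read from a real machine.

```python
import re

# vendor, device, subsystem vendor, subsystem device (the four IDs),
# then base class, sub class and programming interface.
PCI_MODALIAS = re.compile(
    r"^pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F]{2})sc(?P<sub_class>[0-9A-F]{2})"
    r"i(?P<interface>[0-9A-F]{2})$"
)


def parse_pci_modalias(alias):
    """Return the modalias fields as integers, or None if it is not PCI."""
    match = PCI_MODALIAS.match(alias.strip())
    if match is None:
        return None
    return {name: int(value, 16) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    # Illustrative value only; base class 0x02 is "network controller".
    sample = "pci:v00008086d0000100Esv00008086sd0000001Ebc02sc00i00"
    fields = parse_pci_modalias(sample)
    print({name: hex(value) for name, value in fields.items()})
```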

Unfortunately, udev tries to be clever and loads drivers in parallel. As
a result, if there are NICs that use different drivers, the order that
the NICs are assigned ethX interfaces is left to the whim of the Linux
scheduler (i.e. is non-deterministic). Devices using the same driver
will always be assigned interface names in the same relative ordering.
If they all use the same driver, they will always be assigned the same
names, without having to fuss with the HWADDR option (this is due to how
drivers enumerate PCI devices).
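
To find out whether a given box is exposed to this, one can check which
driver sits behind each interface. The Python sketch below is a rough
illustration that assumes the /sys/class/net/<iface>/device/driver
symlink is present for physical NICs; the function names are
hypothetical.

```python
import os


def nic_drivers(sysfs_net="/sys/class/net"):
    """Map interface name -> driver name, skipping interfaces without one."""
    drivers = {}
    for iface in sorted(os.listdir(sysfs_net)):
        link = os.path.join(sysfs_net, iface, "device", "driver")
        if os.path.islink(link):
            # The symlink target ends in the driver's name, e.g. ".../tg3".
            drivers[iface] = os.path.basename(os.readlink(link))
    return drivers


def same_driver_everywhere(drivers):
    """True when all NICs share one driver (the stable case described above)."""
    return len(set(drivers.values())) <= 1


if __name__ == "__main__":
    found = nic_drivers()
    for iface, driver in found.items():
        print(iface, driver)
    if not same_driver_everywhere(found):
        print("mixed drivers: ethX naming may change between boots")
```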

In reality, HWADDR doesn't force the kernel to assign the desired
interface to each device. It simply "cleans up" after udev by renaming
the interfaces from what the kernel assigned to each NIC to the
interfaces you expect. Search for "rename_device" in ifup-eth and
network-functions, both found in the /etc/sysconfig/network-scripts
directory.
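
The "clean up by renaming" idea can be sketched as follows. This is a
simplified Python model, not the actual logic of rename_device or the
network scripts: it only compares the DEVICE/HWADDR pairs from the
ifcfg files with the names the kernel handed out and lists which
interfaces would have to be renamed. The function names and the MAC
addresses are invented for the example.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines of an ifcfg-ethX file into a dict."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip().strip('"')
    return cfg


def planned_renames(configs, kernel_names):
    """List (current_name, wanted_name) pairs.

    configs      -- list of dicts as returned by parse_ifcfg
    kernel_names -- {interface name assigned at boot: MAC address}
    """
    by_mac = {mac.lower(): name for name, mac in kernel_names.items()}
    renames = []
    for cfg in configs:
        wanted = cfg.get("DEVICE")
        current = by_mac.get(cfg.get("HWADDR", "").lower())
        if wanted and current and current != wanted:
            renames.append((current, wanted))
    return renames


if __name__ == "__main__":
    # Made-up addresses; the kernel came up with the two NICs swapped.
    configs = [
        parse_ifcfg("DEVICE=eth0\nHWADDR=00:11:22:33:44:55\nONBOOT=yes\n"),
        parse_ifcfg("DEVICE=eth1\nHWADDR=00:11:22:33:44:56\nONBOOT=yes\n"),
    ]
    kernel = {"eth0": "00:11:22:33:44:56", "eth1": "00:11:22:33:44:55"}
    print(planned_renames(configs, kernel))
```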

Cheers,
Michael