I have two servers with identical hardware: TYAN i3210w system boards with dual Intel gigabit interfaces, plus a PCI Intel gigabit NIC. I'm running CentOS 5.4, x86_64, kernel 2.6.18-164.6.1.el5.
Every other time I reboot, the nics initialize in a different order.
Anaconda had set up /etc/modprobe.conf with alias lines for the cards:
alias eth0 e1000
alias eth1 e1000e
alias eth2 e1000e
However, introducing the bonding driver into the mix seems to have thrown a wrench in the works.
alias bond0 bonding
options bond0 miimon=80 mode=5
# or something like that, can't get to the machine right now - no console and the network is down
So I read about the ifcfg options and tried giving each config script an HWADDR line.
After rebooting, /var/log/messages announced that the hardware address for eth1 does not match, skipping ... same for eth2.
Reading the archives seems to indicate that modprobe.conf doesn't do much and has been made obsolete by udev, and that HWADDR is discouraged because it's a band-aid. The archives seem to suggest that fiddling with udev is the answer. So I modified /etc/udev/rules.d/60-net (or something) and added a few rules found in an ancient example (those aren't my MAC addresses):
KERNEL=="eth?", SYSFS{address}=="00:37:e9:17:64:af", NAME="eth0" # MAC of first NIC in lowercase
KERNEL=="eth?", SYSFS{address}=="00:21:e9:17:64:b5", NAME="eth1" # MAC of second NIC in lowercase
Now, all three network cards get assigned as eth0! eth1 and eth2 are no longer found. The pci-express nics (onboard) get detected first, and the pci nic is last, so it ends up "owning" the eth0 alias.
I don't really care which alias gets assigned to which nic, but I want that assignment to be constant.
All suggestions are appreciated!
Gordon
Changing SYSFS to ATTR should do it.
Tom,
Now I get in the syslog: Unknown key: ATTR{address}
I also tried ATTRS{address} seen in some examples, same error.
Digging around google a bit more I came up with different rules, and fingers crossed, they seem to work!
SUBSYSTEM=="net", SYSFS{address}=="00:1b:21:4d:c3:e8", NAME="eth0" # pro/1000gt
SUBSYSTEM=="net", SYSFS{address}=="00:e0:81:b5:7a:30", NAME="eth1" # internal 1
SUBSYSTEM=="net", SYSFS{address}=="00:e0:81:b5:7a:31", NAME="eth2" # internal 2
I also ran chmod +x on the 60-net.rules file. I noticed some other files in rules.d were marked as executable, so I figured it couldn't hurt!
Gordon
Replying to myself here, as I'm going crazy anyway.
It turns out it was just a fluke that the server booted up with the correct order. Another reboot and the NICs are all screwed up again: the built-in and external cards are sharing eth0, and the second built-in is eth1.
On the second server, things are the same even with the new rules; the NIC driver order is seemingly chosen at random with each boot.
Any other thoughts and suggestions!?
Gordon
Normally, the NIC devices are renamed to match the DEVICE= name specified in the /etc/sysconfig/network-scripts/ifcfg-eth? file with the matching HWADDR= MAC address, even if they were detected as something else. Can you use these and still layer the bonding on top of them (they don't have to have an IPADDR)? Note that they get the name from the DEVICE= line inside the file, not the eth? of the filename if the two happen to differ, and it may not work if you don't have matches for every NIC.
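As a rough way to check the matching described above, the sketch below reads DEVICE= and HWADDR= from one ifcfg file and compares the MAC with what sysfs reports for that interface name. The function names (ifcfg_get, check_ifcfg) and the optional second argument are made up for illustration and are not part of initscripts; it assumes /sys/class/net/<dev>/address is readable on the box.

```shell
#!/bin/sh
# Hypothetical helpers, not part of CentOS initscripts.

# Print the value of KEY from an ifcfg-style file (KEY=value lines),
# with any double quotes removed. Prints nothing if KEY is absent.
ifcfg_get() {
    awk -v key="$1" '
        index($0, key "=") == 1 {
            v = substr($0, length(key) + 2)
            gsub(/"/, "", v)
            print v
            exit
        }' "$2"
}

# Compare HWADDR in an ifcfg file with the MAC sysfs reports for DEVICE.
# The second argument only exists so the sketch can be tried against a
# scratch directory; it defaults to the real sysfs location.
check_ifcfg() {
    file=$1
    sysroot=${2:-/sys/class/net}
    dev=$(ifcfg_get DEVICE "$file")
    want=$(ifcfg_get HWADDR "$file" | tr 'A-F' 'a-f')
    if [ ! -r "$sysroot/$dev/address" ]; then
        echo "$dev: no such interface"
        return 1
    fi
    have=$(cat "$sysroot/$dev/address")
    if [ "$want" = "$have" ]; then
        echo "$dev: ok"
    else
        echo "$dev: HWADDR $want but NIC has $have"
        return 1
    fi
}
```

Something like `check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0` could then be run for each file after a reboot to see whether the names landed where the files say they should.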
Don't touch udev; expecting admins to write udev rules for network interface binding is just not realistic. Udev rules are meant to be static across hardware reconfigurations, while ifcfg files are meant to be modified to suit your current configuration.
Use HWADDR="00:1b:21:4d:c3:e8" in the ifcfg files along with DEVICE=eth0 for eth0, and so on.
modprobe.conf associates an alias with a driver, and the ifcfg files associate a MAC address with an alias.
Also for CentOS 5 you can specify the bonding interface options in the ifcfg files (so you can have varying types of bonded interfaces) with MODPROBE_OPTIONS="".
-Ross
I read a while ago that udev overrode ifcfg-* settings, so I did a clean install of 5.4 and changed three things: ifcfg-eth0 to ifcfg-eth9 (the file name), eth0 to eth9 (inside the file), and the last number of the HWADDR line.
The nic came up as eth0 with the old/original mac address after a reboot.
So we unfortunately have to write udev rules when we have nic naming problems...
Do you mean that you changed the HWADDR line so it no longer matched the actual nic mac address? In that case, you shouldn't expect it to work.
I think the ifcfg-eth? files work when they match the nic mac addresses. They may have to all match for any of them to work, though. I've seen some cases where they all get renamed with a .bak extension and new ones are created but I don't know what triggers that.
Usually a new kernel that forces a regeneration of the hwconf.
There was a kernel update, maybe in the move from CentOS 4 to CentOS 5, that caused grief with Dell hardware: it reversed the order in which Broadcom devices are detected. It still does, and the interfaces need manual swapping around after install.
-Ross
The formula that ended up working for me:
- undo the modifications to the udev rules
- comment out the alias ethX lines that anaconda had placed in my modprobe.conf
- use HWADDR= in the ifcfg-ethX config files
- slave interfaces have ONBOOT=yes in them, despite having no IP address information
The nics are correctly initialized every boot now, and everything works as expected with the bonding driver. I even have vlans created on the bonded interface.
Gordon
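For the step of commenting out the alias ethX lines, a small filter along these lines might save some hand editing. This is only a sketch: the function name comment_eth_aliases is invented here, and it reads modprobe.conf text on stdin and writes the result to stdout rather than touching the real file.

```shell
#!/bin/sh
# Hypothetical helper: prefix every "alias ethN ..." line with '#',
# leaving bonding aliases and all other lines untouched.
comment_eth_aliases() {
    sed 's/^alias[[:space:]][[:space:]]*eth[0-9]/#&/'
}
```

One cautious way to use it would be `comment_eth_aliases < /etc/modprobe.conf > /tmp/modprobe.conf.new`, then diff the two files before copying the new one into place.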
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
NIC ordering is a problem. Some say it is the multi cpu, some say bad BIOS, some say MAC address ordering is better, some say PCI bus enumeration is better. The netdev mailing list has had a long running discussion on this issue. The CTO of Dell and members of HP along with others are / were active participants. Part of the problem is that an alias name may not be available to the kernel.
Dell has their own software to bring determinism to NIC ordering. http://linux.dell.com/papers.shtml
One of Dell's programmers has proposed changing Anaconda to let you choose at installation time the NIC naming convention:
We have been having discussions in the netdev list about creating multiple names for the network interfaces to bring determinism into the way network interfaces are named in the OSes. In specific, "eth0 in the OS does not always map to the integrated NIC Gb1 as labelled on the chassis".
http://marc.info/?l=linux-netdev&m=125510301513312&w=2 - (Re: PATCH: Network Device Naming mechanism and policy)
http://marc.info/?l=linux-netdev&m=125619338904322&w=2 - ([PATCH] udev: create empty regular files to represent net)
It's good to hear it's being worked on, but I kinda wish they would revert to the older NIC enumeration method which seemed to get the ordering right.
-Ross
Do any of these approaches help with the scenario where you want to clone a system across many identical machines including future additions where you don't know the MAC addresses yet, and you'd like the remote operator to be able to insert a drive and have it come up with the right interfaces on the right network connections? This was possible in Centos 3.x, but not in 5.x.
Yes Les.
On Nov 28, 2009, at 2:15 PM, Tom H tomh0665@gmail.com wrote:
I read a while ago that udev overrode ifcfg-* settings, so I did a clean install of 5.4 and changed three things: ifcfg-eth0 to ifcfg-eth9 (the file name), eth0 to eth9 (inside the file), and the last number of the HWADDR line.
The nic came up as eth0 with the old/original mac address after a reboot.
So we unfortunately have to write udev rules when we have nic naming problems...
Did you also change the alias names in modprobe.conf?
-Ross
Digging around google a bit more I came up with different rules, and fingers crossed, they seem to work!
SUBSYSTEM=="net", SYSFS{address}=="00:1b:21:4d:c3:e8", NAME="eth0" # pro/1000gt
SUBSYSTEM=="net", SYSFS{address}=="00:e0:81:b5:7a:30", NAME="eth1" # internal 1
SUBSYSTEM=="net", SYSFS{address}=="00:e0:81:b5:7a:31", NAME="eth2" # internal 2
It turns out it was just a fluke that the server booted up with the correct order. Another reboot and the NICs are all screwed up again: the built-in and external cards are sharing eth0, and the second built-in is eth1.
On the second server, things are the same even with the new rules; the NIC driver order is seemingly chosen at random with each boot.
Do you have anything else in 60-net.rules other than these three rules?
Try prepending 'KERNEL=="eth*", ' and/or 'ACTION=="add", ' to each rule.
Sorry. I was at a Fedora 12 box, where udev (as on Ubuntu 9.10) uses "ATTR{address}". For CentOS, it is "SYSFS{address}", as you are using... :(
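To make the suggestion concrete, here is a sketch that prints one rule per NIC in the CentOS 5 syntax (SYSFS{address}, not ATTR{address}) with both ACTION and KERNEL matches prepended. The function name udev_net_rule is invented for illustration, and nothing in this thread confirms that rules of this shape cure the ordering problem; the poster ended up relying on HWADDR in the ifcfg files instead.

```shell
#!/bin/sh
# Hypothetical helper: print a CentOS 5 style udev naming rule for one NIC.
# $1 = MAC address (any case), $2 = desired interface name.
udev_net_rule() {
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')   # sysfs reports MACs in lowercase
    printf 'ACTION=="add", KERNEL=="eth*", SYSFS{address}=="%s", NAME="%s"\n' "$mac" "$2"
}
```

For example, `udev_net_rule 00:1B:21:4D:C3:E8 eth0` prints a single rule line that could be pasted into the rules file.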
On the servers where I'm currently using bonding... (this is what Ross Walker said on the 23rd). Here's an example for a server w/ 4 total NICs, bonded into a pair of pairs.
/etc/modprobe.conf
alias eth0 tg3
alias eth1 tg3
alias eth2 forcedeth
alias eth3 forcedeth
alias scsi_hostadapter sata_nv
# BONDING
# Set general bonding options (allows multiple bonds)
options bonding max_bonds=2
# Define the two bonds
alias bond0 bonding
alias bond1 bonding
/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
HWADDR=00:16:36:##:##:##
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no
TYPE=Ethernet
/etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
TYPE=Ethernet
BONDING_OPTS="mode=1 miimon=100"
NETWORK=nnn.nnn.nnn.nnn
NETMASK=nnn.nnn.nnn.nnn
IPADDR=nnn.nnn.nnn.nnn
GATEWAY=nnn.nnn.nnn.nnn
Basically, we create (1) file for each ethernet interface under /etc/sysconfig/network-scripts (ifcfg-eth0, ifcfg-eth1, ifcfg-eth2, ifcfg-eth3), then we create (1) file for each bonded interface there as well (ifcfg-bond0, ifcfg-bond1). Bond membership is defined in the ifcfg-eth# files, while the bond options are defined in the ifcfg-bond# file.
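Once the files are in place, one way to confirm that each bond picked up the intended members is to read the bonding driver's status file under /proc/net/bonding. The sketch below just pulls out the "Slave Interface:" lines; the function name bond_slaves is made up here, and the optional argument exists only so it can be pointed at a saved copy of the status file.

```shell
#!/bin/sh
# Hypothetical helper: list the slave interfaces of a bond, one per line.
# $1 = status file to read (defaults to /proc/net/bonding/bond0).
bond_slaves() {
    awk -F': ' '/^Slave Interface:/ { print $2 }' "${1:-/proc/net/bonding/bond0}"
}
```

Running `bond_slaves /proc/net/bonding/bond1` would do the same for the second bond in the example above.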
You can find the MACs by looking in /etc/sysconfig/hwconf.
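As an alternative to reading hwconf, the MACs can also be read from sysfs for whatever names the interfaces currently have. This is a minimal sketch; list_macs is an invented name, and the optional directory argument is only there so it can be tried against a scratch tree.

```shell
#!/bin/sh
# Hypothetical helper: print "<interface> <mac>" for every eth* device.
# $1 = directory to scan (defaults to /sys/class/net).
list_macs() {
    sysroot=${1:-/sys/class/net}
    for d in "$sysroot"/eth*; do
        [ -r "$d/address" ] || continue      # skip if the glob matched nothing
        printf '%s %s\n' "${d##*/}" "$(cat "$d/address")"
    done
}
```

Keep in mind that the names shown are the ones assigned on the current boot, so it is the MAC, not the name, that should be copied into each HWADDR= line.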