We've been running IPv6 for a year or so now, but some of our newer instances (all on an ESX cluster) are not working. It looks like it's all of our CentOS 6 instances. I'm hoping someone can point me in the right direction...
tshark indicates that it's neighbor discovery that's failing:
<centos666.peak.org> [26] # cat ../network
NETWORKING=yes
HOSTNAME=centos666.peak.org
GATEWAY=207.55.16.1
NETWORKING_IPV6=yes
IPV6_AUTOTUNNEL=no
IPV6_DEFAULTGW=2607:f678::1

<centos666.peak.org> [27] # cat ifcfg-eth0
DEVICE="eth0"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE=Ethernet
BOOTPROTO=none
IPADDR=207.55.16.66
PREFIX=22
GATEWAY=207.55.16.1
DNS1=69.59.192.71
DNS2=69.59.192.72
DOMAIN=peak.org
DEFROUTE=yes
IPV6INIT=yes
NAME="System eth0"
IPV6_AUTOCONF=no
IPV6ADDR=2607:f678::16:66/64
IPV6_DEFROUTE=yes

<centos666.peak.org> [28] # ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:98:70:8B
          inet addr:207.55.16.66  Bcast:207.55.19.255  Mask:255.255.252.0
          inet6 addr: fe80::250:56ff:fe98:708b/64 Scope:Link
          inet6 addr: 2607:f678::16:66/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1862 errors:0 dropped:0 overruns:0 frame:0
          TX packets:266 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:136703 (133.4 KiB)  TX bytes:33944 (33.1 KiB)
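As a quick sanity check on the interface output, the link-local address should be the modified EUI-64 form of the MAC address. The following is a minimal Python sketch (standard library only) of that derivation; the function name `eui64_link_local` is made up for illustration and is not part of any tool used in this thread.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the fe80::/64 modified EUI-64 address for a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get 64 bits.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80::/64
    return ipaddress.IPv6Address(prefix + interface_id)


# MAC taken from the ifconfig output above.
print(eui64_link_local("00:50:56:98:70:8B"))  # fe80::250:56ff:fe98:708b
```

The result matches the `Scope:Link` address shown by ifconfig, so the link-local address itself looks as expected.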
ipv6 is working locally:
<centos666.peak.org> [29] # ping6 2607:f678::16:66
PING 2607:f678::16:66(2607:f678::16:66) 56 data bytes
64 bytes from 2607:f678::16:66: icmp_seq=1 ttl=64 time=0.040 ms
64 bytes from 2607:f678::16:66: icmp_seq=2 ttl=64 time=0.038 ms
^C
--- 2607:f678::16:66 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1870ms
rtt min/avg/max/mdev = 0.038/0.039/0.040/0.001 ms
but not on the lan:
<centos666.peak.org> [5] # ping6 2607:f678::1
PING 2607:f678::1(2607:f678::1) 56 data bytes
From 2607:f678::16:66 icmp_seq=2 Destination unreachable: Address unreachable
From 2607:f678::16:66 icmp_seq=3 Destination unreachable: Address unreachable
From 2607:f678::16:66 icmp_seq=4 Destination unreachable: Address unreachable
From 2607:f678::16:66 icmp_seq=6 Destination unreachable: Address unreachable
From 2607:f678::16:66 icmp_seq=7 Destination unreachable: Address unreachable
From 2607:f678::16:66 icmp_seq=8 Destination unreachable: Address unreachable
^C
--- 2607:f678::1 ping statistics ---
8 packets transmitted, 0 received, +6 errors, 100% packet loss, time 7892ms
tshark for that ping:
<centos666.peak.org> [38] # tshark -n -i eth0 ip6
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
  0.000000 2607:f678::16:66 -> ff02::1:ff00:1 ICMPv6 Neighbor solicitation
  1.000019 2607:f678::16:66 -> ff02::1:ff00:1 ICMPv6 Neighbor solicitation
  2.000043 2607:f678::16:66 -> ff02::1:ff00:1 ICMPv6 Neighbor solicitation
  4.001030 2607:f678::16:66 -> ff02::1:ff00:1 ICMPv6 Neighbor solicitation
  5.001076 2607:f678::16:66 -> ff02::1:ff00:1 ICMPv6 Neighbor solicitation
  6.001045 2607:f678::16:66 -> ff02::1:ff00:1 ICMPv6 Neighbor solicitation
^C6 packets captured
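The destination `ff02::1:ff00:1` in the capture is the solicited-node multicast group for the router address `2607:f678::1`, so the solicitations depend on multicast delivery across the virtual and physical switches. A minimal Python sketch (standard library only) of how that group and its Ethernet multicast MAC are derived; the helper names are illustrative assumptions, not part of any tool in this thread.

```python
import ipaddress


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """ff02::1:ff00:0/104 plus the low 24 bits of the target address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet destination for an IPv6 multicast group: 33:33 + low 32 bits."""
    low32 = (int(group) & 0xFFFFFFFF).to_bytes(4, "big")
    return "33:33:" + ":".join(f"{b:02x}" for b in low32)


router_group = solicited_node_multicast("2607:f678::1")
print(router_group)                 # ff02::1:ff00:1
print(multicast_mac(router_group))  # 33:33:ff:00:00:01

# Group the router would use when it solicits the test VM's address.
print(solicited_node_multicast("2607:f678::16:66"))  # ff02::1:ff16:66
```

Capturing on the router side for frames addressed to `33:33:ff:00:00:01` would be one way to tell whether the solicitations ever leave the ESX host.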
A centos 5.8 vm with the same virtual network connections works fine:
<ns6.peak.org> [103] # ping6 2607:f678::1
PING 2607:f678::1(2607:f678::1) 56 data bytes
64 bytes from 2607:f678::1: icmp_seq=0 ttl=64 time=1.34 ms
64 bytes from 2607:f678::1: icmp_seq=1 ttl=64 time=0.894 ms
64 bytes from 2607:f678::1: icmp_seq=2 ttl=64 time=0.871 ms

--- 2607:f678::1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.871/1.035/1.342/0.219 ms, pipe 2

<ns6.peak.org> [104] # ping6 ipv6.google.com
PING ipv6.google.com(pz-in-x67.1e100.net) 56 data bytes
64 bytes from pz-in-x67.1e100.net: icmp_seq=0 ttl=52 time=14.1 ms
64 bytes from pz-in-x67.1e100.net: icmp_seq=1 ttl=52 time=14.3 ms
64 bytes from pz-in-x67.1e100.net: icmp_seq=2 ttl=52 time=14.3 ms

--- ipv6.google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 14.185/14.296/14.380/0.081 ms, pipe 2

<ns6.peak.org> [489] # tshark -i eth0 -n ip6
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
  0.000000 2607:f678::56 -> 2607:f678::1 ICMPv6 Echo request
  0.000720 2607:f678::1 -> 2607:f678::56 ICMPv6 Echo reply
  1.000990 2607:f678::56 -> 2607:f678::1 ICMPv6 Echo request
  1.001759 2607:f678::1 -> 2607:f678::56 ICMPv6 Echo reply
  2.001004 2607:f678::56 -> 2607:f678::1 ICMPv6 Echo request
  2.001841 2607:f678::1 -> 2607:f678::56 ICMPv6 Echo reply
  3.002983 2607:f678::56 -> 2607:f678::1 ICMPv6 Echo request
  3.003680 2607:f678::1 -> 2607:f678::56 ICMPv6 Echo reply
  4.003991 2607:f678::56 -> 2607:f678::1 ICMPv6 Echo request
  4.010478 2607:f678::1 -> 2607:f678::56 ICMPv6 Echo reply
  4.650129 2607:f678::1 -> 2607:f678::56 ICMPv6 Neighbor solicitation
  4.650171 2607:f678::56 -> 2607:f678::1 ICMPv6 Neighbor advertisement
12 packets captured
On 8/10/12 5:50 PM, Stephen Harris wrote:
> On Fri, Aug 10, 2012 at 05:24:12PM -0700, Alan Batie wrote:
>> IPV6_DEFROUTE=yes
>
> Not sure where you get that from.
That's not something normally in our configs; I think it was in the default config the CentOS 6 installer created, and I only stripped out some of the excess. Stuff like that I left in, in case it mattered in 6 for some reason. The config on the working CentOS 5 systems (which is what we use on the CentOS 6 systems also) is much simpler:
<ns6.peak.org> [113] # cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=ns6.peak.org
GATEWAY=207.55.16.1

<ns6.peak.org> [114] # cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
BROADCAST=207.55.19.255
IPADDR=207.55.16.53
NETMASK=255.255.252.0
ONBOOT=yes
TYPE=Ethernet

IPV6INIT=yes
IPV6ADDR=2607:f678::56
IPV6_DEFAULTGW=2607:f678::1
> FWIW you can see the current routing table with "ip -6 route".
netstat is the one I usually use:
<centos666.peak.org> [39] # ip -6 router
Object "router" is unknown, try "ip help".
<centos666.peak.org> [40] # netstat -rn -A inet6
Kernel IPv6 routing table
Destination                    Next Hop       Flags Metric Ref  Use Iface
2607:f678::/64                 ::             U     256    1    0   eth0
fe80::/64                      ::             U     256    0    0   eth0
::/0                           2607:f678::1   UG    1      5    0   eth0
::1/128                        ::             U     0      1    1   lo
2607:f678::16:66/128           ::             U     0      58   1   lo
fe80::250:56ff:fe98:708b/128   ::             U     0      0    1   lo
ff00::/8                       ::             U     256    0    0   eth0
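To read the table above: the kernel picks the most specific matching prefix, so traffic to the gateway itself uses the connected `2607:f678::/64` route (and therefore needs neighbor discovery), while everything off-link uses `::/0`. A minimal Python sketch of that longest-prefix lookup, using the routes copied from the netstat output. It deliberately ignores metrics and other kernel tie-breakers, and `lookup` is a hypothetical helper name.

```python
import ipaddress

# (destination prefix, next hop or None for on-link, interface),
# copied from the netstat output above.
ROUTES = [
    ("2607:f678::/64", None, "eth0"),
    ("fe80::/64", None, "eth0"),
    ("::/0", "2607:f678::1", "eth0"),
    ("::1/128", None, "lo"),
    ("2607:f678::16:66/128", None, "lo"),
    ("ff00::/8", None, "eth0"),
]


def lookup(dest: str):
    """Return the most specific route matching dest (metrics ignored)."""
    addr = ipaddress.ip_address(dest)
    matches = []
    for prefix, gateway, dev in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net.prefixlen, prefix, gateway, dev))
    if not matches:
        raise LookupError(f"no route to {dest}")
    _, prefix, gateway, dev = max(matches, key=lambda m: m[0])
    return prefix, gateway, dev


print(lookup("2607:f678::1"))          # on-link via the /64, no gateway
print(lookup("2001:4860:4860::8888"))  # off-link, via the default route
```

For the failing ping to `2607:f678::1`, the lookup lands on the on-link /64 route, which is consistent with the problem being below the routing layer.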
On Aug 11, 2012 2:00 AM, "Alan Batie" alan@peak.org wrote:
> netstat is the one I usually use:
>
> <centos666.peak.org> [39] # ip -6 router
> Object "router" is unknown, try "ip help".
> <centos666.peak.org> [40] # netstat -rn -A inet6
With IPv6 in the picture, stop using net-tools. They were deprecated a long time ago, and there are multiple edge cases and bugs where they don't work properly or lack features... learn to use the iproute2 toolset: ip, ss and tc being the key ones.
And it's "ip -6 route", not "ip -6 router"... or "ip -6 r s" for short ;-)
On 8/11/12 2:17 AM, James Hogarth wrote:
> With ipv6 in the picture stop using net-tools - they were deprecated a long time ago and there's multiple edge cases and bugs where they don't work properly or lack features... learn to use the iproute2 toolset - ip, ss and tc being the key ones.

Love gratuitous changes to long standard toolsets... sigh...

> And it's ip -6 route not ip -6 router... or ip -6 r s in short ;-)
You're right, sorry, it was late on Friday ;-)
<centos666.peak.org> [50] # ip -6 r
unreachable ::/96 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable ::ffff:0.0.0.0/96 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable 2002:a00::/24 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable 2002:7f00::/24 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable 2002:a9fe::/32 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable 2002:ac10::/28 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable 2002:c0a8::/32 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
unreachable 2002:e000::/19 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
2607:f678::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 4294967295
unreachable 3ffe:ffff::/32 dev lo metric 1024 error -101 mtu 16436 advmss 16376 hoplimit 4294967295
fe80::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 4294967295
default via 2607:f678::1 dev eth0 metric 1 mtu 1500 advmss 1440 hoplimit 4294967295
Also love quality error messages "-101" is *sooo* informative ;-)
But the last line should be the important one and shows the proper default route in place, though the even more important one is the route to the local net (2607:f678::/64) which also looks right. It's not a routing issue, it's a neighbor discovery issue:
<centos666.peak.org> [55] # ip neigh show
2607:f678::1 dev eth0 FAILED
207.55.16.1 dev eth0 lladdr 00:26:88:f2:9e:80 REACHABLE
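With several affected VMs, it may help to pick the failed entries out of the neighbor table automatically. A minimal Python sketch that scans text in the format shown above; it assumes the state is the last field on each line, as in this output, and `failed_neighbors` is a hypothetical helper name.

```python
from typing import List


def failed_neighbors(ip_neigh_output: str) -> List[str]:
    """Return addresses whose neighbor entry ends in the FAILED state."""
    failed = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # Assumption: first field is the address, last field is the state.
        if fields and fields[-1] == "FAILED":
            failed.append(fields[0])
    return failed


# Sample text copied from the "ip neigh show" output above.
sample = """\
2607:f678::1 dev eth0 FAILED
207.55.16.1 dev eth0 lladdr 00:26:88:f2:9e:80 REACHABLE
"""
print(failed_neighbors(sample))  # ['2607:f678::1']
```

Here only the IPv6 gateway is reported: the same router is reachable over IPv4 on the same interface, which points at IPv6 neighbor discovery specifically.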
> Love gratuitous changes to long standard toolsets... sigh...
It's not a recent change and is far from gratuitous....
http://lists.debian.org/debian-devel/2009/03/msg00780.html
Features such as traffic shaping, policy routing and multiple IPs on an interface (not virtual interfaces) either are impossible with net-tools or just don't work very well...
> Also love quality error messages "-101" is *sooo* informative ;-)
>
> But the last line should be the important one and shows the proper default route in place, though the even more important one is the route to the local net (2607:f678::/64) which also looks right. It's not a routing issue, it's a neighbor discovery issue:
>
> <centos666.peak.org> [55] # ip neigh show
> 2607:f678::1 dev eth0 FAILED
> 207.55.16.1 dev eth0 lladdr 00:26:88:f2:9e:80 REACHABLE
Are you allowing ICMPv6? I don't just mean echo and echo-reply (the pings above) but most of the rest of it too?
IPv6 relies on ICMPv6 heavily for path MTU discovery, neighbour discovery and a whole lot more...
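To make "most of the rest of it" concrete, here is a minimal Python sketch listing the ICMPv6 message types involved, with a helper that reports which of them a given allow-list would block. The type numbers are those assigned in RFC 4443 and RFC 4861; the grouping into "essential" and the function name `missing_essential` are illustrative assumptions, not a firewall recommendation.

```python
# ICMPv6 type numbers (RFC 4443 for errors/echo, RFC 4861 for ND).
ERROR_TYPES = {
    1: "Destination Unreachable",
    2: "Packet Too Big",  # needed for path MTU discovery
    3: "Time Exceeded",
    4: "Parameter Problem",
}
ECHO_TYPES = {128: "Echo Request", 129: "Echo Reply"}
ND_TYPES = {
    133: "Router Solicitation",
    134: "Router Advertisement",
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
    137: "Redirect",
}


def missing_essential(allowed_types):
    """Return the error and ND types that are absent from allowed_types."""
    allowed = set(allowed_types)
    essential = {**ERROR_TYPES, **ND_TYPES}
    return sorted(t for t in essential if t not in allowed)


# A rule set that only lets ping through would break neighbor discovery.
for icmp_type in missing_essential(ECHO_TYPES):
    name = ERROR_TYPES.get(icmp_type) or ND_TYPES[icmp_type]
    print(icmp_type, name)
```

A rule that accepts the whole protocol, without filtering by type, leaves nothing missing from this list.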
On 8/13/12 12:00 PM, James Hogarth wrote:
> Are you allowing ICMPv6? I don't just mean echo and echo-reply (the pings above) but most of the rest of it too?
Yes, the test system has the default ip6tables, but we always permit icmp:
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p ipv6-icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp6-adm-prohibited
-A FORWARD -j REJECT --reject-with icmp6-adm-prohibited
COMMIT
> Yes, the test system has the default ip6tables, but we always permit icmp:
Hmm this is especially weird given the 5.8 systems are working - otherwise I'd have moved the troubleshooting up to vmware or the switch next...
Without access to the machines/switches to traffic dump and check in wireshark you've got me stumped at this point I'm afraid...
On 8/13/12 12:35 PM, James Hogarth wrote:
> Hmm this is especially weird given the 5.8 systems are working - otherwise I'd have moved the troubleshooting up to vmware or the switch next...
Migrating one of the vms to the same physical host made them start talking to each other, so it's definitely a vmware issue, though why it's interacting badly with centos 6 is a good question. It could be a bug in vmware tools/the ethernet driver. At least it's narrowed down now...
On Mon, Aug 13, 2012 at 12:46:44PM -0700, Alan Batie wrote:
> it's interacting badly with centos 6 is a good question. It could be a bug in vmware tools/the ethernet driver. At least it's narrowed down now...
Are you using vmxnet3 drivers? That had a known bug with small udp packets, but it should be fixed. Maybe you're seeing a similar issue. Try using the e1000 driver in that case.
https://access.redhat.com/knowledge/solutions/67823
An update to close: it's a vmware issue:
* new centos 5 creations exhibit the same behavior
* a few months ago, we migrated from an esx 4.0 cluster to a new esx 4.1 cluster
* we've just recently started using a new centos 6 template; the centos 6 system that's working was created before the migration
* a fresh install on esxi 4.0.0 worked fine; when the vm was restarted on the new cluster, it exhibited the same failure

I'm hoping to get an esx 4.0 instance running so I can try the same test, as it seems that esx 4.0 vms will migrate properly and work. Failing that, we can clone the working centos 6 system for now until we can work with vmware to figure out what's going on...
I found another CentOS 6 system that not only is talking ipv6 properly, but the test system that can't even talk to the router can talk to it. That indicates it's probably something wonky with the network itself...
On 13 August 2012 20:37, Alan Batie alan@peak.org wrote:
> I found another CentOS 6 system that not only is talking ipv6 properly, but the test system that can't even talk to the router can talk to it. That indicates it's probably something wonky with the network itself...
Hmm...
I don't have a C6 ipv6 machine but I do have a F17 which might not be too far off in behaviour...
The general recommendation though is that next hop should always be via the local link FE80:: addresses (I notice your public address for gateway there)...
Don't worry about the error -101 - the actual error is the unreachable bit and just says that those networks can't be reached over the lo device - which is understandable...
Here's a look at my F17 server for comparison:
ip -6 r s
2001:0:4137:9e76:24f8:3b1a:2598:7928 via fe80::e291:f5ff:fecc:7919 dev em1 metric 0 cache
2001:0:9d38:953c:2805:3e4f:484e:6234 via fe80::e291:f5ff:fecc:7919 dev em1 metric 0 cache
2001:0:9d38:953c:344a:332e:37f7:24f7 via fe80::e291:f5ff:fecc:7919 dev em1 metric 0 cache
2001:470:97df:1::/64 dev em1 proto kernel metric 256 expires 85879sec
unreachable fe80::/64 dev lo proto kernel metric 256 error -101
fe80::/64 dev em1 proto kernel metric 256
default via fe80::e291:f5ff:fecc:7919 dev em1 proto kernel metric 1024 expires 1305sec
I'm using radvd on my network ... but things shouldn't be too far off for static settings...
Is the other C6 system that *is* working on the same vmware server, bare metal or on another virtualization server?
On 8/11/12, Alan Batie alan@peak.org wrote:
> We've been running ipv6 for a year or so now, but some of our newer instances (all on an ESX cluster) are not working. It looks like it's all of our Centos 6 instances. I'm hoping someone can point me in the right direction...
>
> <centos666.peak.org> [27] # cat ifcfg-eth0
> DEVICE="eth0"
> NM_CONTROLLED="no"
> ONBOOT="yes"
> TYPE=Ethernet
> BOOTPROTO=none
> IPADDR=207.55.16.66
> PREFIX=22
> GATEWAY=207.55.16.1
Not sure if this is related/relevant but I remember reading this bug about problems if PREFIX is used instead of NETMASK.
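For what it's worth, the two notations describe the same IPv4 network here, so any difference would have to come from how the network scripts handle the variables rather than from the values. A minimal Python sketch (standard library only) that checks this against the addresses posted earlier in the thread:

```python
import ipaddress

# IPADDR and PREFIX from the CentOS 6 ifcfg-eth0 quoted above.
iface = ipaddress.ip_interface("207.55.16.66/22")

print(iface.network.netmask)            # 255.255.252.0
print(iface.network.broadcast_address)  # 207.55.19.255

# The NETMASK form used on the CentOS 5 hosts yields the same prefix length.
same_net = ipaddress.ip_network("207.55.16.0/255.255.252.0")
print(same_net.prefixlen)  # 22
```

Both values agree with the `Mask:` and `Bcast:` fields in the ifconfig output from the failing host, so the kernel ended up with the intended IPv4 settings either way.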