Hi, folks,
I and the other admin here have just been assigned a mission... here's what's happening: only very recently (within the last week?), our CentOS 7 boxes, or at least some of them, have started losing their IPv6 addresses and not getting them back.
1. We're running dibbler on the same box that serves DHCP.
2. It's been working for many months.
3. The lease file remains in place.
4. It appears to try several times and then give up. As our manager puts it: "I want to renew the lease", "Here it is", "Nope, don't like that, try again", and eventually, after 4 or 5 or so tries, it gives up.
One very show-stopping result of this is that NFS starts timing out.
So: has anyone else seen this behaviour recently, and does anyone have some idea of what might be going on here?
mark
On 07/12/2017 07:13 AM, m.roth@5-cent.us wrote:
> 4. It appears to try several times and then give up. As our manager puts it: "I want to renew the lease", "Here it is", "Nope, don't like that, try again", and eventually, after 4 or 5 or so tries, it gives up.
NM tends to log fairly verbose information. It sounds like you've looked at the network traffic. Have you looked at the logs on the affected systems?
On 07/12/17 12:09, Gordon Messmer wrote:
> On 07/12/2017 07:13 AM, m.roth@5-cent.us wrote:
>> 4. It appears to try several times and then give up. As our manager puts it: "I want to renew the lease", "Here it is", "Nope, don't like that, try again", and eventually, after 4 or 5 or so tries, it gives up.
> NM tends to log fairly verbose information. It sounds like you've looked at the network traffic. Have you looked at the logs on the affected systems?
Sorry, never got around to answering earlier, and we were all still looking.
First, a correction: the same thing is happening on C6 servers.
Next, there is *nothing*, not in dmesg*, not in /var/log/messages, to indicate when it failed, nor any failure message. No indication why the daemon didn't restart it.
* Ok, I've got one good thing to say about C7: dmesg -H. Love it.
mark
On Wed, 12 Jul 2017 19:22:20 -0400 mark <m.roth@5-cent.us> wrote:
> On 07/12/17 12:09, Gordon Messmer wrote:
>> On 07/12/2017 07:13 AM, m.roth@5-cent.us wrote:
>>> ...
>> NM tends to log fairly verbose information. It sounds like you've looked at the network traffic. Have you looked at the logs on the affected systems?
> ...
> Next, there is *nothing*, not in dmesg*, not in /var/log/messages, to indicate when it failed, nor any failure message. No indication why the daemon didn't restart it.
> * Ok, I've got one good thing to say about C7: dmesg -H. Love it.
Maybe I can get that up to two good things...
# journalctl -u NetworkManager # with optional -r for newest first
That is rather convenient when looking for logs from a specific unit/service. One small added complexity is that there are typically two active units, "NetworkManager.service" and "NetworkManager-dispatcher.service" (the ".service" suffix can be omitted).
/Peter
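For the DHCPv6 problem in this thread, the journal output can be narrowed down further. The following is only a sketch: `dhcp6_lines` is a hypothetical helper name, and the pattern assumes that the relevant messages contain the string "dhcp6" or "dhcpv6", which may not hold for every NetworkManager version.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention DHCPv6.
# Reads from stdin so it works on live journal output or a saved log file.
# Assumption: relevant lines contain "dhcp6" or "dhcpv6" (case-insensitive).
dhcp6_lines() {
    grep -Ei 'dhcp6|dhcpv6'
}

# Usage on an affected client (current boot only, no pager):
#   journalctl -u NetworkManager -b --no-pager | dhcp6_lines
```

On the C6 machines there is no journal, so the same filter would have to be fed from /var/log/messages instead.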
On 07/12/2017 04:22 PM, mark wrote:
> On 07/12/17 12:09, Gordon Messmer wrote:
>> On 07/12/2017 07:13 AM, m.roth@5-cent.us wrote:
>>> 4. It appears to try several times and then give up. As our manager puts it: "I want to renew the lease", "Here it is", "Nope, don't like that, try again", and eventually, after 4 or 5 or so tries, it gives up.
> Next, there is *nothing*, not in dmesg*, not in /var/log/messages, to indicate when it failed, nor any failure message. No indication why the daemon didn't restart it.
Where did your manager observe the conversation you described above? If you're watching traffic at the DHCP server, have you tried also capturing traffic on one of the affected clients?
Following the failure, does the client have an IPv4 address and no IPv6 address? Or does it have an IPv6 address and no default route?
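The two questions above can be answered from the output of `ip` on an affected client. What follows is a minimal sketch, not a definitive check: the function names are hypothetical, the interface name eth0 is an assumption, and the matching relies on `ip` printing "scope global" for routable addresses and starting default-route lines with "default".

```shell
#!/bin/sh
# Hypothetical helpers: classify a client's IPv6 state from `ip` output.
# They take the output as text so they can be tried on saved output too.

# Succeeds if the `ip -6 -o addr show` output contains a global-scope address.
has_global_v6() {
    printf '%s\n' "$1" | grep -q 'scope global'
}

# Succeeds if the `ip -6 route show` output contains a default route.
has_default_v6() {
    printf '%s\n' "$1" | grep -q '^default '
}

# Prints one of: no-address, address-no-default-route, ok
v6_state() {
    if ! has_global_v6 "$1"; then
        echo "no-address"
    elif ! has_default_v6 "$2"; then
        echo "address-no-default-route"
    else
        echo "ok"
    fi
}

# Usage on a live client (eth0 is an assumption):
#   v6_state "$(ip -6 -o addr show dev eth0)" "$(ip -6 route show default)"
#
# To watch the DHCPv6 exchange on the client side (needs root):
#   tcpdump -ni eth0 'udp port 546 or udp port 547'
```

Capturing on the client as well as on the server shows whether the replies actually arrive, or whether the client receives them and rejects them.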
Gordon Messmer wrote:
> On 07/12/2017 04:22 PM, mark wrote:
>> On 07/12/17 12:09, Gordon Messmer wrote:
>>> On 07/12/2017 07:13 AM, m.roth@5-cent.us wrote:
>>>> 4. It appears to try several times and then give up. As our manager puts it: "I want to renew the lease", "Here it is", "Nope, don't like that, try again", and eventually, after 4 or 5 or so tries, it gives up.
>> Next, there is *nothing*, not in dmesg*, not in /var/log/messages, to indicate when it failed, nor any failure message. No indication why the daemon didn't restart it.
> Where did your manager observe the conversation you described above? If you're watching traffic at the DHCP server, have you tried also capturing traffic on one of the affected clients?
> Following the failure, does the client have an IPv4 address and no IPv6 address? Or does it have an IPv6 address and no default route?
No IPv6. IIRC, no default, either.
mark
On 12/07/17 16:13, m.roth@5-cent.us wrote:
> Hi, folks,
> I and the other admin here have just been assigned a mission... here's what's happening: only very recently (within the last week?), our CentOS 7 boxes, or at least some of them, have started losing their IPv6 addresses and not getting them back.
> - We're running dibbler on the same box that serves DHCP.
> - It's been working for many months.
> - The lease file remains in place.
> - It appears to try several times and then give up. As our manager puts it: "I want to renew the lease", "Here it is", "Nope, don't like that, try again", and eventually, after 4 or 5 or so tries, it gives up.
> One very show-stopping result of this is that NFS starts timing out.
> So: has anyone else seen this behaviour recently, and does anyone have some idea of what might be going on here?
> mark
I admit that I'm a big fan of either static IPv6 or just SLAAC/radvd for automatic address assignment.
I did run into that dibbler problem once, though I don't know how the dibbler daemon was configured (nor how it's configured on your side).
From the discussion I had with the DC support people (online.net, a hosting company in France), they wanted me to use a dibbler client, which I didn't want to do, and they wanted me to specify the DUID that dibbler on the server side would use to recognize the dhclient request.
So here is what I did (worth knowing that ipv6.method is set to 'ignore' from an NM PoV):
Create /etc/dhcp/dhclient.d/dhclient6.conf:

interface "eth0" {
  send dhcp6.client-id my:long:duid:id:that:dibbler:wants:bla:etc ;
}
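The client-id value in that file is a placeholder for the real DUID. Assuming, as the placeholder suggests, that the value is written as colon-separated hex octets, a quick sanity check before pasting it into the config could look like the sketch below. `is_colon_hex` is a hypothetical helper name, and the DUID in the usage comment is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument looks like
# colon-separated hex octets (e.g. a DUID in dhclient.conf notation).
# Assumption: one or two hex digits per octet, at least two octets.
is_colon_hex() {
    printf '%s\n' "$1" | grep -Eq '^[0-9A-Fa-f]{1,2}(:[0-9A-Fa-f]{1,2})+$'
}

# Usage (the value is an invented example, not a real DUID):
#   is_colon_hex "00:03:00:01:aa:bb:cc:dd:ee:ff" && echo "looks like hex octets"
```

This only checks the notation. Whether the server accepts the DUID still depends on what the dibbler side has been configured to expect.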
And then "plumb" it into a NetworkManager dispatcher.d script (I *really* like dispatcher.d scripts, as you can take action when an interface goes up/down, etc.):
/etc/NetworkManager/dispatcher.d/99-ipv6-online.sh :
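#!/bin/bash
# NetworkManager passes the interface name and the event as $1 and $2
IF=$1
STATUS=$2

if [[ "$IF" = "eth0" && "$STATUS" = "up" ]] ; then
  logger "IF $IF status changed to $STATUS"
  # give the link a moment to settle before asking for a lease
  sleep 10
  logger "launching ipv6 client for $IF"
  /usr/sbin/dhclient -cf /etc/dhcp/dhclient.d/dhclient6.conf -6 -P -v eth0 -nw
fi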
YMMV, but I hope that it will help you.
PS: I never looked at this again, but maybe NM now has a way to specify that directly?