Most of my desktops are still running CentOS5, but I have installed CentOS6 on a few of them. The users on those desktops are reporting that DNS lookups are slow, and from my brief tests, that does appear to be the case. After some googling, I found a suggestion to disable IPv6, but that didn't help. So I tried to figure out where the delay occurs using strace and ltrace, but that didn't help much. When running ltrace, the process I'm trying to trace (host, nslookup) just dies. And the output of strace does show a 3+ second delay, but I can't figure out what it's telling me:
10:46:25.317517 futex(0xd1ba94, FUTEX_WAKE_PRIVATE, 2147483647) = 0
10:46:25.317586 open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 6
10:46:25.317635 fstat64(6, {st_mode=S_IFREG|0644, st_size=99158720, ...}) = 0
10:46:25.317702 mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 6, 0) = 0xb569d000
10:46:25.317754 mmap2(NULL, 1171456, PROT_READ, MAP_PRIVATE, 6, 0x19c) = 0xb557f000
10:46:25.317792 mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 6, 0x1084) = 0xb557e000
10:46:25.317827 close(6) = 0
10:46:25.317904 futex(0xb76a5050, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0xb76a504c, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT,
10:46:25.317990 futex(0xb76a5018, FUTEX_WAKE_PRIVATE, 1) = 1
10:46:25.318037 rt_sigaction(SIGHUP, {0x141390, ~[RTMIN RT_1], 0}, NULL, 8) = 0
10:46:25.318145 rt_sigsuspend([]) = ? ERESTARTNOHAND (To be restarted)
10:46:28.510794 --- SIGTERM (Terminated) @ 0 (0) ---
10:46:28.510842 sigreturn() = ? (mask now [HUP INT TERM])
10:46:28.510926 futex(0xb76a5050, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0xb76a504c, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT,
10:46:28.511015 futex(0xb76a5018, FUTEX_WAKE_PRIVATE, 1) = 1
10:46:28.511070 futex(0xb76a5050, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0xb76a5018, 18) = 1
10:46:28.511144 futex(0xb76a5018, FUTEX_WAKE_PRIVATE, 1) = 1
10:46:28.511256 futex(0xb769fbd8, FUTEX_WAIT, 2371, NULL) = -1 EAGAIN (Resource temporarily unavailable)
10:46:28.511316 brk(0x920a000) = 0x920a000
The CentOS5 systems on the same network with the same /etc/resolv.conf file do not have these delays. Has anyone else seen this or have some suggestions as to how to debug this?
Alfred
On Tue, 27 Sep 2011, Alfred von Campe wrote:
Most of my desktops are still running CentOS5, but I have installed CentOS6 on a few of them. The users on those desktops are reporting that DNS lookups are slow, and from my brief tests, that does appear to be the case. After some googling, I found a suggestion to disable IPv6, but that didn't help. So I tried to figure out where the delay occurs using strace and ltrace, but that didn't help much. When running ltrace, the process I'm trying to trace (host, nslookup) just dies. And the output of strace does show a 3+ second delay, but I can't figure out what it's telling me:
The CentOS5 systems on the same network with the same /etc/resolv.conf file do not have these delays. Has anyone else seen this or have some suggestions as to how to debug this?
You probably want to do an strace -f host blah rather than a basic strace, or I think you'll lose what's going on.
Are you running either sssd or nscd on your box?
I've done nothing special, and DNS seems fast enough.
jh
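A rough sketch of how the "where is the delay" question could be answered from a trace file instead of by eye. It assumes the trace was captured with timestamps (strace -tt), optionally with -f and -o, in which case each line starts with a PID. The function name largest_gap and the file name host.trace are made up for illustration, and traces that cross midnight are not handled.

```shell
# Capture (example.com is only a placeholder name):
#   strace -f -tt -o host.trace host example.com

# Print the largest gap, in seconds, between consecutive timestamped lines,
# followed by the line that came right after that gap.
# The timestamp may be the first field (plain strace), the second field
# (PID prefix from -f -o) or the third field ("[pid 1234]" prefix).
largest_gap() {
    awk '
    {
        ts = ""
        for (i = 1; i <= 3 && i <= NF; i++) {
            if ($i ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+$/) { ts = $i; break }
        }
        if (ts == "") next                      # skip lines without a timestamp
        split(ts, p, ":")
        now = p[1] * 3600 + p[2] * 60 + p[3]    # seconds since midnight
        if (seen && now - prev > max) { max = now - prev; where = $0 }
        prev = now
        seen = 1
    }
    END { if (seen) printf "%.6f\n%s\n", max, where }
    ' "$@"
}
```

Running largest_gap host.trace on the trace from the original post would point at the SIGTERM line that follows rt_sigsuspend. With -f the lines of all threads are interleaved, so a gap in the combined output means that no traced thread made a system call during that time.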
On Sep 27, 2011, at 11:29, John Hodrien wrote:
You probably want to do an strace -f host blah rather than a basic strace, or I think you'll lose what's going on.
Good point. Using -f doesn't show a 3+ second gap, but I still have no idea why it's slow (3-5 seconds) compared to CentOS5, or why using ltrace causes host or nslookup to crash. Can anybody else get ltrace to work with host?
Are you running either sssd or nscd on your box?
Yes, I am running nscd, but I didn't think it cached DNS queries.
I've done nothing special, and DNS seems fast enough.
And I have done nothing different than what I do in CentOS5, yet it is much slower (by at least an order of magnitude).
On Sep 27, 2011, at 13:05, Frank Cox wrote:
Have you considered installing dnsmasq on those machines?
No, this is in a corporate environment, and the queries that are slow are for names outside of our domain (i.e., the Internet), so I don't think it would help.
Any other ideas on how to diagnose the root cause?
Alfred
On Tue, 27 Sep 2011 13:28:19 -0400 Alfred von Campe wrote:
Have you considered installing dnsmasq on those machines?
No, this is in a corporate environment, and the queries that are slow are for names outside of our domain (i.e., the Internet), so I don't think it would help.
Why do you think dnscache won't help? Caching is not restricted to your local domain.
On Sep 27, 2011, at 13:53, Frank Cox wrote:
Why do you think dnscache won't help? Caching is not restricted to your local domain.
I guess I forgot to mention that only the first query is slow. If you repeat the query, the response is fast, so it's already being cached somewhere. I always assumed that's how DNS worked. So in order to test this "slowness", I have to keep thinking of domains to look up.
Alfred
On Tue, Sep 27, 2011 at 1:02 PM, Alfred von Campe alfred@von-campe.com wrote:
Why do you think dnscache won't help? Caching is not restricted to your local domain.
I guess I forgot to mention that only the first query is slow. If you repeat the query, the response is fast, so it's already being cached somewhere. I always assumed that's how DNS worked. So in order to test this "slowness", I have to keep thinking of domains to look up.
DNS servers normally do cache, but clients don't. Are you running named locally on each machine and pointing resolv.conf to localhost? It wouldn't make much sense for a central DNS server to act differently depending on whether a c5 or c6 client was first to ask for a name lookup.
On Tuesday 27 Sep 2011 19:08:53 Les Mikesell wrote:
On Tue, Sep 27, 2011 at 1:02 PM, Alfred von Campe alfred@von-campe.com wrote:
Why do you think dnscache won't help? Caching is not restricted to your local domain.
I guess I forgot to mention that only the first query is slow. If you repeat the query, the response is fast, so it's already being cached somewhere. I always assumed that's how DNS worked. So in order to test this "slowness", I have to keep thinking of domains to look up.
DNS servers normally do cache, but clients don't. Are you running named locally on each machine and pointing resolv.conf to localhost?
He's running nscd which caches DNS (group 'hosts' in /etc/nscd.conf).
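A small sketch for checking what nscd is configured to do with the hosts table. It assumes the usual nscd.conf layout of "setting table value" lines; the function name nscd_hosts_settings is made up for illustration.

```shell
# Show the active (uncommented) settings for the "hosts" table in an
# nscd.conf-style file. Defaults to /etc/nscd.conf if no file is given.
nscd_hosts_settings() {
    awk '$1 !~ /^#/ && $2 == "hosts"' "${1:-/etc/nscd.conf}"
}

# To empty the hosts cache between timing tests (as root):
#   nscd -i hosts
```

Note that host, nslookup and dig send their queries straight to the DNS servers and do not go through nscd, so a fast repeat lookup with those tools comes from the server's cache. Something like getent hosts, which uses the normal system resolver, is a better test of what nscd is doing.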
On Tue, Sep 27, 2011 at 12:28 PM, Alfred von Campe alfred@von-campe.com wrote:
On Sep 27, 2011, at 11:29, John Hodrien wrote:
You probably want to do an strace -f host blah rather than a basic strace, or I think you'll lose what's going on.
Good point. Using -f doesn't show a 3+ second gap, but I still have no idea why it's slow (3-5 seconds) compared to CentOS5, or why using ltrace causes host or nslookup to crash. Can anybody else get ltrace to work with host?
The usual reason for a delay is that you have more than one nameserver specified in resolv.conf and the first one tried is down or unreachable so you time out and retry. Try "dig @nameserver hostname" with each of the nameserver addresses to see if they are working or just slow.
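The per-server check can be sketched as a small loop. This assumes dig (from bind-utils) is installed; the function names list_nameservers and check_nameservers are made up for illustration, and the 3-second limit is an arbitrary choice.

```shell
# List the nameserver addresses in a resolv.conf-style file, in file order.
# Defaults to /etc/resolv.conf if no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Send one query for a name to each nameserver in turn and report either
# the query time or the timeout. +tries=1 and +time=3 stop dig from
# retrying, so a dead server shows up as a clear timeout.
check_nameservers() {
    name=$1
    conf=${2:-/etc/resolv.conf}
    for ns in $(list_nameservers "$conf"); do
        printf '%s: ' "$ns"
        dig "@$ns" "$name" +tries=1 +time=3 \
            | grep -E 'Query time|timed out' \
            || echo 'no answer'
    done
}
```

For example, check_nameservers www.centos.org queries every server listed in /etc/resolv.conf. A server that answers quickly in second position while the first one times out would match the symptom described here.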
On Sep 27, 2011, at 14:02, Les Mikesell wrote:
The usual reason for a delay is that you have more than one nameserver specified in resolv.conf and the first one tried is down or unreachable so you time out and retry.
Bingo! Thanks Les. All systems use DHCP which updates the resolv.conf file, and for some reason the CentOS6 systems had their two entries in a different order than the CentOS5 systems. Well, now that I checked all systems, it's not a CentOS6 vs. CentOS5 issue, but rather just "luck" of the draw, and my small sample size led me to believe that it was a CentOS6 issue. One of our two main internal servers appears to be having some issues. I'll have to follow up with CIS to figure out what is going on.
Alfred
On 09/27/2011 12:10 PM, Alfred von Campe wrote:
On Sep 27, 2011, at 14:02, Les Mikesell wrote:
The usual reason for a delay is that you have more than one nameserver specified in resolv.conf and the first one tried is down or unreachable so you time out and retry.
Bingo! Thanks Les. All systems use DHCP which updates the resolv.conf file, and for some reason the CentOS6 systems had their two entries in a different order than the CentOS5 systems. Well, now that I checked all systems, it's not a CentOS6 vs. CentOS5 issue, but rather just "luck" of the draw, and my small sample size led me to believe that it was a CentOS6 issue. One of our two main internal servers appears to be having some issues. I'll have to follow up with CIS to figure out what is going on.
Alfred
There are also ways to statically override the dns servers returned by dhcp.
I have not set this up under CentOS 6. Currently in my /etc/dhclient.conf (or wherever the dhclient.conf file is in CentOS 6) I have an entry that looks like this:
interface "eth2" { prepend domain-name-servers 127.0.0.1; }
which forces my own list of domain servers to be prepended to the ones dynamically returned by the dhcp server on the network attached to eth2. It would of course be a good idea for your dhcp servers to return the ip addresses of working dns servers, but this could be used if you have reason to want to choose your own servers.
Alternatively, you can specify
supercede domain-name-servers a.b.c.d, a.b.x.y;
to completely replace the servers returned via dhcp.
Nataraj
In article 4E824DF6.1040103@rjl.com, Nataraj incoming-centos@rjl.com wrote:
Alternatively, you can specify
supercede domain-name-servers a.b.c.d, a.b.x.y;
to completely replace the servers returned via dhcp.
Except you have to spell it the correct way, which is "supersede".
(Doing a "strings /sbin/dhclient | grep super" indicates that it doesn't accept the common misspelling "supercede" as an alternative)
Cheers Tony
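A quick way to catch the misspelling in an existing config, as a sketch only. The function name check_dhclient_spelling is made up for illustration, and the path to pass in depends on the system (it is given as an argument rather than assumed).

```shell
# Report any use of the misspelled keyword "supercede" in a dhclient.conf
# style file. Prints the offending lines with line numbers and returns 1
# if any are found, 0 otherwise. Commented-out lines are flagged too.
check_dhclient_spelling() {
    if grep -n 'supercede' "$1"; then
        echo "misspelled keyword found: dhclient expects 'supersede'" >&2
        return 1
    fi
    return 0
}
```

Usage would be something like check_dhclient_spelling /etc/dhclient.conf, with the path adjusted to wherever the file lives on the machine in question.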
On Tue, 27 Sep 2011 11:04:42 -0400 Alfred von Campe wrote:
Most of my desktops are still running CentOS5, but I have installed CentOS6 on a few of them. The users on those desktops are reporting that DNS lookups are slow, and from my brief tests, that does appear to be the case.
Have you considered installing dnsmasq on those machines?