CentOS 4.1/bind-9.2.4-2.
I have named serving as a caching DNS server plus SOA for a local intranet zone.
The problem I am encountering: after running for a while, it stops responding to queries.
An nmap scan from a different host shows port 53 open, and I can telnet to the port, but all queries to the server time out. Even "service named status" and "service named restart" hang. I have to kill the named process manually before I can start named again (I also remove the lock/pid files manually). This has occurred about four times since I installed CentOS 4.1 four weeks ago. I have not encountered any problems with other services running on the same server.
I looked through /var/log/messages and did not find any errors logged by named. I'd appreciate any thoughts/suggestions to debug this problem.
Here is what I have tried so far to figure out the problem:
(from 192.168.1.150)
$ host www.yahoo.com 192.168.1.21
;; connection timed out; no servers could be reached

# nmapfe of 192.168.1.21 (from 192.168.1.150)
(The 1208 ports scanned but not shown below are in state: closed)
PORT   STATE SERVICE
22/tcp open  ssh
25/tcp open  smtp
53/tcp open  domain

(ssh'd into named server using IP# 192.168.1.21)
# service named status
rndc: recv failed: operation canceled
TIA,
On Wed, 2005-08-24 at 10:34, Arun K. Khan wrote:
(from 192.168.1.150)
$ host www.yahoo.com 192.168.1.21
;; connection timed out; no servers could be reached

# nmapfe of 192.168.1.21 (from 192.168.1.150)
(The 1208 ports scanned but not shown below are in state: closed)
PORT   STATE SERVICE
22/tcp open  ssh
25/tcp open  smtp
53/tcp open  domain

(ssh'd into named server using IP# 192.168.1.21)
# service named status
rndc: recv failed: operation canceled
It looks like it can't reach the root servers. It has a private address - could you have a problem with your NAT gateway to the internet? How about your local firewalling on 53/udp to let the responses back?
On 8/24/05, Les Mikesell <lesmikesell@gmail.com> wrote:
It looks like it can't reach the root servers. It has a private address - could you have a problem with your NAT gateway to the internet? How about your local firewalling on 53/udp to let the responses back?
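For DNS servers, 53/tcp is required as well. UDP handles most requests, but when a response is too large for a UDP reply (more than 512 bytes without EDNS0), the server sets the truncation flag and the client retries over TCP. IIRC yahoo returns quite a few records.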
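To make the UDP-then-TCP behaviour concrete, here is a minimal Python sketch of a client that asks over UDP first and retries over TCP only if the reply has the truncation (TC) bit set. The function names (build_query, is_truncated, query) are made up for illustration, and the sketch handles only plain A-record queries without EDNS0.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(response):
    """Return True if the TC bit (0x0200) is set in the response flags."""
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)


def recv_exact(sock, count):
    """Read exactly `count` bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full reply arrived")
        data += chunk
    return data


def query(server, name, timeout=3.0):
    """Query over UDP; fall back to TCP if the UDP reply is truncated."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(packet, (server, 53))
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return "udp", response
    # DNS over TCP prefixes each message with a two-byte length field
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(packet)) + packet)
        length = struct.unpack("!H", recv_exact(tcp, 2))[0]
        return "tcp", recv_exact(tcp, length)
```

Calling query("192.168.1.21", "www.yahoo.com") from another host would raise socket.timeout if the server never answers on 53/udp, which is a different failure from a truncated reply followed by a refused TCP connection.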
On Wed, 2005-08-24 at 11:45 -0500, Les Mikesell wrote:
It looks like it can't reach the root servers. It has a private address - could you have a problem with your NAT gateway to the internet? How about your local firewalling on 53/udp to let the responses back?
No, it is neither a NAT gateway nor a firewall problem. In fact, the firewall is turned off on this box, and I can fire up named on my SuSE workstation and it works fine. With the CentOS server, I don't even get a response to queries for my 'local' zone, for which it is the SOA.
Interestingly, 53/udp does not show up in 'netstat -n -l -t -u -p' or in the nmap scan when this situation occurs. This could well be the problem.
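One way to cross-check what netstat reports is to read /proc/net/udp directly, since that is where the kernel lists UDP sockets. The following is a rough Python sketch, assuming a little-endian host (e.g. x86) and the usual column layout of that file; the function name udp_sockets_on_port is hypothetical. It returns the local address and the receive-queue size of every UDP socket bound to a given port. An empty result would suggest named no longer holds a UDP socket on 53, while a large receive queue would suggest the socket exists but nothing is reading from it.

```python
import socket
import struct


def udp_sockets_on_port(proc_net_udp_text, port):
    """Return [(local_ip, rx_queue_bytes)] for UDP sockets bound to `port`.

    `proc_net_udp_text` is the content of /proc/net/udp. Addresses there are
    hex-encoded in host byte order (little-endian assumed in this sketch).
    """
    found = []
    for line in proc_net_udp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue
        ip_hex, port_hex = fields[1].split(":")      # local_address column
        if int(port_hex, 16) != port:
            continue
        ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
        rx_queue = int(fields[4].split(":")[1], 16)  # tx_queue:rx_queue column
        found.append((ip, rx_queue))
    return found


def read_proc_net_udp(path="/proc/net/udp"):
    """Read the kernel's UDP socket table (Linux only)."""
    with open(path) as table:
        return table.read()
```

On the affected server, udp_sockets_on_port(read_proc_net_udp(), 53) could be run once while named is healthy and again after it hangs, to compare the two results.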
Arun K. Khan wrote:
I looked through /var/log/messages and did not find any errors logged by named. I'd appreciate any thoughts/suggestions to debug this problem.
Hmmm, have you used 'lsof' or 'strace' to see what the process is doing? First get the process ID of the named process (pgrep named), you'll need this for the other two. 'lsof -p {process ID}' will show you all of the file handles the process has open, including any shared libraries and network connections.
'strace -p {process ID}' will allow you to watch the system calls the process is executing as it is running. If it's just sitting there, it might be tough to figure out, but if you watch it and see that it is looping on some call that might give you a hint.
You can also start browsing through the /proc filesystem for information about that process, start by getting a directory listing of '/proc/{process ID}/'.
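As a starting point for the /proc browsing, here is a small Python sketch that pulls two things out of '/proc/{process ID}/': the scheduler state from the status file and the targets of the open file descriptors (roughly what lsof shows for sockets and files). The function name inspect_process and its proc_root parameter are made up for this example, and reading another user's fd directory normally requires root.

```python
import os


def inspect_process(pid, proc_root="/proc"):
    """Return the process state and open file descriptors for `pid`."""
    base = os.path.join(proc_root, str(pid))
    info = {"state": None, "fds": {}}

    # /proc/<pid>/status holds "Key:\tvalue" lines; State looks like "S (sleeping)"
    with open(os.path.join(base, "status")) as status:
        for line in status:
            key, _, value = line.partition(":")
            if key == "State":
                info["state"] = value.strip()
                break

    # Each entry in /proc/<pid>/fd is a symlink to the file, pipe, or socket
    fd_dir = os.path.join(base, "fd")
    for entry in os.listdir(fd_dir):
        try:
            info["fds"][int(entry)] = os.readlink(os.path.join(fd_dir, entry))
        except OSError:
            # The descriptor may have been closed between listdir and readlink
            pass
    return info
```

Running inspect_process on the PID reported by pgrep named, once while named is answering and once after it hangs, gives two snapshots to compare: a missing socket entry or an unexpected state would be a hint about where to point strace.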
Just a thought!
I've encountered the same problem using the 4.1 SRPMs rebuilt for an RH9 machine that I haven't had a chance to move to CentOS 4 yet. I've made about as much progress on the issue as you have. I have fairly extensive logging turned on in named; queries simply stop being processed at a certain point, and it hangs until I kill -9 the process.
----- Original Message -----
From: "Arun K. Khan" <knura@yahoo.com>
To: "CentOS mailing list" <centos@centos.org>
Sent: Wednesday, August 24, 2005 10:34 AM
Subject: [CentOS] named is up but does not respond to queries
Arun Khan
Linux is like a wigwam - no gates, no windows, apache inside