Hi Peter,

Thanks for your reply.

Some more info:

/etc/resolv.conf on ns1:

nameserver ns0IP
nameserver ns1IP

At the time ns0 was down, I can see that even ns1 failed to mount the
NFS shares (timed out):

Aug 14 08:30:31 ns1 automount[4093]: >> mount: mount to NFS server 'nfs-web' failed: timed out (retrying).
Aug 14 08:31:53 ns1 last message repeated 2 times
Aug 14 08:32:13 ns1 automount[4093]: >> mount: mount to NFS server 'nfs-web' failed: timed out (giving up).

If I go back in the logs I can see a full zone sync happening on the
2nd of August. No changes have been made since then, so I am pretty
confident the zones were OK.

In what way would reverse lookup affect it?

We are still scratching our heads...

Thanks,
Simone

On 8/15/07, Peter (CentOS List) <centos at ourvirtualhome.com> wrote:
>
> The first thing that popped into my head was reverse lookup, but as I kept
> reading and saw your test with web3, it could have been a sync problem
> between the two nameservers. By restarting ns1 all the zones were synced
> again, so your initial problem isn't there anymore and your test with
> web3 was successful, in that it didn't lose its mount. Keep an eye on ns1
> when you make updates to the zones on ns0. I have seen problems where
> the sync didn't occur automatically and I had to sync "manually" by
> stopping and starting bind on the secondary server.
>
> Hope it helps you a little bit.
>
> Peter
>
> Simone wrote:
> > Hi all,
> >
> > Today we had a strange problem that took down our website. We
> > understand what happened but not why, so I am hoping someone has seen
> > this before.
> >
> > We have our web servers (web1 web2 web3 ..... web10) mounting an NFS
> > share (/export/data) from server nfs1. On the web server side we use
> > autofs in the format nfs-dedicated:/export/data where nfs-dedicated is
> > an alias in our internal DNS servers pointing to server nfs1.
> > We run a primary and a secondary DNS (bind) server, ns0 and ns1,
> > authoritative for our zones, and our web servers have them configured
> > in /etc/resolv.conf.
> >
> > Today we had to run some upgrades on the DNS servers (BIOS, firmware,
> > etc.), so we took down ns0, and with it our website went down.
> > All the NFS shares disappeared from the web servers (the logs show
> > requests to mount/unmount timing out), but at the same time on nfs1 the
> > logs show requests (mount and unmount) coming from the web servers and
> > no errors.
> >
> > As soon as ns0 was back up, everything got back to normal. Minutes later
> > we took down ns1 for maintenance and it didn't have any impact on the
> > website.
> >
> > dig @ns0 nfs-web gives exactly the same results on ns0/1
> >
> > Back at the office we tried to reproduce the same scenario by configuring
> > iptables on web3 to block traffic to ns0, but the server (web3) kept
> > working fine, falling back to ns1 for name resolution (as you would expect).
> >
> > Has anybody seen this happening before? Any comments/suggestions much
> > appreciated.
> >
> > Thanks
> >
> > Simone
> >
> > _______________________________________________
> > CentOS mailing list
> > CentOS at centos.org
> > http://lists.centos.org/mailman/listinfo/centos
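
A side note on the resolv.conf ordering discussed in this thread: per resolv.conf(5), the resolver queries the listed nameservers in order and only moves to the next one after a query times out (documented defaults: timeout 5 seconds, 2 attempts). The sketch below is a rough, simplified illustration of how much waiting that adds per lookup when the first nameserver is down. It is not a diagnosis of the problem above. The names `parse_resolv_conf` and `delay_before_fallback` are made up for this example, and the delay calculation covers only the first pass over the list and ignores the real resolver's backoff details.

```python
from dataclasses import dataclass, field

# Defaults documented in resolv.conf(5).
DEFAULT_TIMEOUT = 5   # seconds to wait on one nameserver before moving on
DEFAULT_ATTEMPTS = 2  # passes over the nameserver list before giving up


@dataclass
class ResolverConfig:
    nameservers: list = field(default_factory=list)
    timeout: int = DEFAULT_TIMEOUT
    attempts: int = DEFAULT_ATTEMPTS


def parse_resolv_conf(text):
    """Read the nameserver and options lines from resolv.conf-style text."""
    cfg = ResolverConfig()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (# or ; in the first column).
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) > 1:
            cfg.nameservers.append(parts[1])
        elif parts[0] == "options":
            for opt in parts[1:]:
                name, _, value = opt.partition(":")
                if name == "timeout" and value.isdigit():
                    cfg.timeout = int(value)
                elif name == "attempts" and value.isdigit():
                    cfg.attempts = int(value)
    return cfg


def delay_before_fallback(cfg, dead):
    """Seconds spent waiting on dead servers before a live one is asked.

    Simplified assumption: one full timeout per dead server, first pass
    only. Returns None if every listed server is in `dead`.
    """
    delay = 0
    for ns in cfg.nameservers:
        if ns not in dead:
            return delay
        delay += cfg.timeout
    return None


if __name__ == "__main__":
    cfg = parse_resolv_conf("nameserver ns0IP\nnameserver ns1IP\n")
    print(cfg.nameservers, delay_before_fallback(cfg, {"ns0IP"}))
```

With the two-line resolv.conf quoted in this thread and ns0 unreachable, this simplified model gives a wait of one timeout period per lookup before ns1 is asked. Whether that delay is enough to make automount give up is a separate question that this sketch does not answer.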