[CentOS] NFS / DNS problem

Wed Aug 15 08:50:44 UTC 2007
Simone Mangelio <dezmodue at gmail.com>

Hi Peter

Thanks for your reply.

Some more info:

/etc/resolv.conf on ns1
nameserver ns0IP
nameserver ns1IP
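
To make the lookup order explicit, here is a minimal Python sketch that reads a resolv.conf-style file and reports which nameserver is asked first and how long the resolver waits on it. The defaults (timeout 5 seconds, 2 attempts) are the ones documented in resolv.conf(5) for glibc; the function name parse_resolv_conf and the SAMPLE text are made up for illustration, and whether this delay has anything to do with the automount timeouts is only a guess.

```python
# Defaults documented in resolv.conf(5) for the glibc resolver.
DEFAULT_TIMEOUT = 5   # seconds to wait on one nameserver before moving on
DEFAULT_ATTEMPTS = 2  # number of passes over the nameserver list


def parse_resolv_conf(text):
    """Return (nameservers, timeout, attempts) from resolv.conf-style text.

    Hypothetical helper for illustration; it only understands the
    'nameserver' keyword and the 'timeout:n' / 'attempts:n' options.
    """
    nameservers = []
    timeout = DEFAULT_TIMEOUT
    attempts = DEFAULT_ATTEMPTS
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                if name == "timeout" and value.isdigit():
                    timeout = int(value)
                elif name == "attempts" and value.isdigit():
                    attempts = int(value)
    return nameservers, timeout, attempts


# Same layout as the file on ns1 (placeholders, not real addresses).
SAMPLE = "nameserver ns0IP\nnameserver ns1IP\n"

servers, timeout, attempts = parse_resolv_conf(SAMPLE)
print("asked first:", servers[0])
print("seconds waited on it before trying", servers[1] + ":", timeout)
print("passes over the list:", attempts)
```

With ns0 listed first and down, every fresh lookup would wait on ns0 before ns1 is tried. An "options timeout:1" line (or "options rotate") would shorten or spread that wait, but that is a tuning idea, not something we have tested.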

While ns0 was down, I can see that even ns1 failed to mount the NFS
shares (timed out):

Aug 14 08:30:31 ns1 automount[4093]: >> mount: mount to NFS server 'nfs-web'
failed: timed out (retrying).
Aug 14 08:31:53 ns1 last message repeated 2 times
Aug 14 08:32:13 ns1 automount[4093]: >> mount: mount to NFS server 'nfs-web'
failed: timed out (giving up).

If I go back in the logs I can see a full zone sync happening on the 2nd of
August. No changes have been made since then, so I am pretty confident the
zones were OK.

In what way would reverse lookup affect it?
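
One way to test the reverse-lookup theory is to check, from a web server, that a name resolves to an address and that the address resolves back to a name. The following is a minimal Python sketch using only the standard socket module. The function name check_forward_reverse and its injectable forward/reverse parameters are made up for illustration, and the idea that a failing PTR lookup could stall the mounts is an assumption, not something the logs show.

```python
import socket


def check_forward_reverse(name,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve name -> address -> name and report where it breaks.

    Hypothetical helper: 'forward' and 'reverse' default to the system
    resolver and can be swapped for fakes when testing offline.
    """
    result = {"name": name, "address": None, "ptr": None, "error": None}
    try:
        result["address"] = forward(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = "forward: %s" % exc
        return result
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        ptr_name, _aliases, _addresses = reverse(result["address"])
        result["ptr"] = ptr_name
    except OSError as exc:  # socket.herror is a subclass of OSError
        result["error"] = "reverse: %s" % exc
    return result
```

Calling check_forward_reverse("nfs-web") on a web server while ns0 is blocked, and again for the web server's own name from the NFS side, would show whether either direction fails or just becomes slow.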

We are still scratching our heads.....

Thanks

Simone

On 8/15/07, Peter (CentOS List) <centos at ourvirtualhome.com> wrote:
>
> The first thing that popped into my head was reverse lookup, but as I kept
> reading and saw your test with web3, it could have been a sync problem
> between the two nameservers. By restarting ns1 all the zones were synced
> again and your initial problem isn't there anymore, so your test with
> web3 was successful in that it didn't lose its mount. Keep an eye on ns1
> when you make updates in the zones on ns0. I have seen problems where
> the sync didn't occur automatically and I had to sync "manually" by
> stopping and starting bind on the secondary server.
>
> Hope it helps you a little bit.
>
> Peter
>
> Simone wrote:
> > Hi all,
> >
> > Today we had a strange problem that took down our website. We
> > understand what happened but not why, so I am hoping someone has seen
> > this before.
> >
> > We have our web servers (web1 web2 web3 ..... web10) mounting an NFS
> > share (/export/data) from server nfs1. On the web server side we use
> > autofs in the format nfs-dedicated:/export/data where nfs-dedicated is
> > an alias in our internal DNS servers pointing to server nfs1. We run a
> > primary and a secondary DNS (bind) server ns0, ns1 authoritative for our
> > zones, and our webservers have them configured in /etc/resolv.conf.
> > Today we had to run some upgrades on the DNS servers (BIOS, firmware, etc.),
> > so we took down ns0 and with it our website went down.
> > All the nfs shares disappeared from the web servers (the logs show
> > requests to mount/unmount timing out), but at the same time on nfs1 the
> > logs show requests (mount and unmount) coming from the web servers and
> > no errors.
> >
> > As soon as ns0 is back up, all gets back to normal. Minutes later we
> > take down ns1 for maintenance and it doesn't have any impact on the
> > website.
> >
> > dig @ns0 nfs-web gives exactly the same results on ns0/1
> >
> > Back at the office we tried to reproduce the same scenario by configuring
> > iptables on web3 to block traffic to ns0, but the server (web3) kept
> > working fine, falling back to ns1 for name resolution (as you would expect).
> >
> > Has anybody seen this happening before? Any comment/suggestion much
> > appreciated.
> >
> > Thanks
> >
> > Simone
> >
> > _______________________________________________
> > CentOS mailing list
> > CentOS at centos.org
> > http://lists.centos.org/mailman/listinfo/centos