[CentOS] boot-time NFS mount failures

Fri Apr 20 15:16:19 UTC 2012
Veli-Pekka Kestilä <centos at vpk.nu>

On 20.4.2012 17:27, Tilman Schmidt wrote:
> On 19.04.2012 19:30, Veli-Pekka Kestilä wrote:
>> On 19.4.2012 20:12, Tilman Schmidt wrote:
>>> backup:/home/backup/Oracle      /backup_nfs     nfs
>>>
>>> The last time this happened, I found a message on the console:
>>>
>>> mount: can't get address for backup
>>>
>>> So it seems that the failure was caused by the nameserver not being
>>> available yet. Unfortunately that message isn't saved to any logfile,
>>> so I cannot say if it was the same the previous times.
>> You could put the IP address in fstab instead of the server name, so there is
>> no need for a name lookup, or you can put the IP and name in /etc/hosts.
> So you say it's only the name lookup failure that's causing
> startup to proceed without the NFS mount? All other failures
> like host unreachable or NFS port not open would cause the
> system to wait and retry?
>
>> - If on separate machine you could write your own init script which
>> tests that the name resolution works and runs the oracle startup after that.
> It would have to go before the netfs service I think. That's
> the one which does the NFS mount. The oracle startup script
> runs after netfs, so all would be fine if netfs wouldn't exit
> without having mounted the NFS shares.
>
>> I would put the IP in /etc/hosts if the backup server has a fixed IP address.
>> If not, then writing a special init script could do the trick.
> My concern is possible other failure modes besides the
> name lookup.
> What happens if the IP address is available (hardcoded or
> via name resolution) but the NFS server is offline?
> What if the NFS server machine is online (say, pingable) but
> the NFS service doesn't listen (yet)?
> I have to make sure that in all these cases the Oracle
> processes do not get started until the NFS mount is
> available.
As you say, there is a multitude of things which can keep the NFS
mount from working. So if you want to be sure, you should really
invest in writing your own init script. The way I would do it is to
have it run after netfs, in place of Oracle's original init script.

It would then do all the necessary tests to check that the NFS share is
mounted correctly and ready to use. (It could even try to fix some of
the problems itself, for example by retrying the NFS mount.) It would
then call the original Oracle init script once everything works. It
could also do all this in the background and let the rest of the system
boot, so that you can log in and troubleshoot if necessary.
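
A rough sketch of such a wrapper could look like the following. It is
untested against a real Oracle setup and rests on a few assumptions: the
original Oracle script has been renamed to /etc/init.d/oracle.orig (a
hypothetical path), the mount point is the /backup_nfs from the fstab
line quoted above, and the retry count, delay and logger tag are
arbitrary illustrative values.

```shell
#!/bin/sh
# Wrapper init script sketch: wait for the NFS mount, then start Oracle.
# All paths and defaults below are assumptions; adjust to the real system.

MOUNTPOINT="${MOUNTPOINT:-/backup_nfs}"                 # from the quoted fstab line
ORACLE_INIT="${ORACLE_INIT:-/etc/init.d/oracle.orig}"   # hypothetical renamed original
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"
RETRIES="${RETRIES:-60}"                                # number of attempts
DELAY="${DELAY:-10}"                                    # seconds between attempts

# Succeeds if mount point $1 appears in mounts file $2 with type nfs or nfs4.
nfs_is_mounted() {
    awk -v mp="$1" '
        $2 == mp && ($3 == "nfs" || $3 == "nfs4") { found = 1 }
        END { exit !found }
    ' "$2"
}

# Polls until the share is mounted, retrying the mount via its fstab entry.
# Returns 0 once mounted, non-zero if it never shows up.
wait_for_nfs() {
    i=0
    while [ "$i" -lt "$RETRIES" ]; do
        if nfs_is_mounted "$MOUNTPOINT" "$MOUNTS_FILE"; then
            return 0
        fi
        mount "$MOUNTPOINT" >/dev/null 2>&1 || true
        sleep "$DELAY"
        i=$((i + 1))
    done
    nfs_is_mounted "$MOUNTPOINT" "$MOUNTS_FILE"
}

main() {
    case "$1" in
        start)
            # Run in the background so the rest of the boot can continue.
            (
                if wait_for_nfs; then
                    "$ORACLE_INIT" start
                else
                    logger -t oracle-nfs-wait \
                        "$MOUNTPOINT not mounted, Oracle not started"
                fi
            ) &
            ;;
        *)
            # Pass every other action (stop, status, ...) straight through.
            "$ORACLE_INIT" "$1"
            ;;
    esac
}

if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

This only checks that the share shows up as mounted. A stricter "ready
to use" test, for example writing a small file to the share, could be
added inside wait_for_nfs, keeping in mind that I/O on a hard-mounted
share can block if the server goes away.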

-vpk