NFS mount on Centos 7 crashing - Discuss

2 Jun 2017


Hello,
We have a VM (under KVM - a VPS service by our ISP) running CentOS 7.
On it we have 2 NFS mounts, one for backup and one as a live file system 
(where there are two user homes as well):
-----------------------------------------------------------------------------------------------------------------------
# cat /etc/fstab
/dev/mapper/centos-root /                       xfs defaults        0 0
UUID=7a3ae70a-8ef3-463b-8f5b-be4e2e7be894 /boot xfs defaults        0 0
/dev/mapper/centos-swap swap                    swap defaults        0 0
10.201.40.34:/data/col1/noc-bkups-1      /mnt/dd2500-1 nfs auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 0 0
10.201.40.34:/data/col1/hesperia-mount   /hesperiamount nfs auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 0 0
-----------------------------------------------------------------------------------------------------------------------
This setup has been working fine for over a year, even under significant 
load, without issues.
However, yesterday the "live" NFS mount (/hesperiamount) started 
crashing. When booting, everything is fine, but very soon after boot we 
lose communication with the mount, although the remote storage system 
is accessible (without reporting any errors) and no network issues have 
occurred. We found that dmesg reports failures with call traces 
(2 examples):
https://pastebin.com/GVSDbxFr
https://pastebin.com/WujKQuHG
This happens repeatedly/consistently (after several reboots) so we have 
been forced to replace the NFS mount with a local mount (on a new local 
virtual hard disk), to restore normal system operation. So the fstab has 
now become:
-------------------------------------------------------------------------------------------------------------------
# cat /etc/fstab
/dev/mapper/centos-root /                        xfs defaults        0 0
UUID=7a3ae70a-8ef3-463b-8f5b-be4e2e7be894 /boot  xfs defaults        0 0
/dev/mapper/centos-swap swap                     swap defaults        0 0
/dev/mapper/vg2-lv1     /hesperiamount           xfs defaults        0 0
10.201.40.34:/data/col1/noc-bkups-1      /mnt/dd2500-1   nfs auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 0 0
# 10.201.40.34:/data/col1/hesperia-mount   /hesperiamount  nfs auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 0 0
-------------------------------------------------------------------------------------------------------------------
Note that when I later manually mounted the same NFS share on the same 
box (in order to copy data from it using rsync), it did not crash (but 
it only had reads and no writes in this scenario). The share was 
mounted with the following command:
# mount -vv -o auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 -t nfs 10.201.40.34:/data/col1/hesperia-mount /hesperiamount2
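For anyone comparing setups: the options the kernel actually applies to an NFS mount can differ from those requested on the command line or in fstab, so it may help to list the effective ones. The following is only a minimal sketch; the function name nfs_mount_opts is hypothetical (not an existing tool), and it assumes the usual /proc/mounts field layout (device, mount point, fs type, options, dump, pass).

```shell
# Hypothetical helper (not a standard command): print the mount point and
# the effective options of every NFS mount in a mounts-format file.
# With no argument it reads /proc/mounts.
nfs_mount_opts() {
    # Field 2 = mount point, field 3 = filesystem type, field 4 = options
    awk '$3 == "nfs" || $3 == "nfs4" { print $2, $4 }' "${1:-/proc/mounts}"
}

# Example usage on the affected box:
#   nfs_mount_opts
```

Comparing this output between the fstab-driven mount and the manual mount would show whether both really ended up with the same options.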
Questions:
  * Is this a known issue/bug?
  * Have we possibly made any NFS misconfigurations (which however have
    not caused any errors for about a year now)?
  * What could we do to prevent the error from occurring again?
Please advise.
Thanks,
Nick