[CentOS] systemd automount of cifs share hangs

Fri Oct 26 19:25:00 UTC 2018
mark <m.roth at 5-cent.us>

Kenneth Porter wrote:
> --On Friday, October 19, 2018 2:33 PM -0700 Elliott Balsley
> <elliott at altsystems.com> wrote:
>
>> I don't have a solution, but I wanted to point out this same hang
>> happened to me recently with a Myricom 10Gb card.  Apparently Myricom
>> drivers do not support CentOS 7 smb connections, although HTTP traffic
>> works fine.  I solved it by switching to a different NIC.
>
> The mount works fine for me. It's only the automount that hangs, and only
>  since a few months ago.
>
> I had it happen again today when my LetsEncrypt cert renewed and the
> dovecot (IMAP) server restarted. Dovecot checks all the mountpoints (in
> case any have mail folders on them) and hung on restart. I shelled in and
>  ran df and it also hung. I logged in yet another session and tried to ls
>  the mountpoint and that hung completing the directory name.
>
> Here's what I see in /var/log/messages when dovecot hangs and I manually
> mount the shares from another shell session. SELinux is in permissive
> mode.
>
> Oct 26 09:11:39 saruman systemd: Mounting NAS1 share 1...
> Oct 26 09:11:39 saruman systemd: Failed to expire automount, ignoring: No such device
> Oct 26 09:11:39 saruman systemd: Mounted NAS1 share 1.
> Oct 26 09:11:45 saruman kernel: INFO: task dovecot:831 blocked for more than 120 seconds.
> Oct 26 09:11:45 saruman kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Oct 26 09:11:45 saruman kernel: dovecot         D ffff9994adfa3f40     0 831      1 0x00000084
> Oct 26 09:11:45 saruman kernel: Call Trace:
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f1890c>] ? __schedule+0x41c/0xa20
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f18f39>] schedule+0x29/0x70
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f168a9>] schedule_timeout+0x239/0x2c0
> Oct 26 09:11:45 saruman kernel: [<ffffffff858beb96>] ? finish_wait+0x56/0x70
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f16ff2>] ? mutex_lock+0x12/0x2f
> Oct 26 09:11:45 saruman kernel: [<ffffffff85ab4e00>] ?
<snip>
Wait a minute: are you running IPv6? What we see is that NFSv4 goes
preferentially for IPv6, so if a system doesn't get its IPv6 address, or
has one and loses it, it will *NOT* fall back to IPv4, but hangs.
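
One rough way to check whether IPv6 is even in play is to look at which
address family the resolver hands back first for the file server. The
sketch below is only an illustration, not something from this thread: the
hostname nas1.example.com is a placeholder, port 445 is assumed because
the share is CIFS (NFSv4 would be 2049), and the helper names summarize()
and prefers_ipv6() are made up here. It only reports resolver ordering; it
does not prove what the kernel mount code actually does.

```python
import socket

# Map the address families we care about to readable names.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def summarize(infos):
    """Turn getaddrinfo() results into (family name, address) pairs,
    keeping the order in which the resolver returned them."""
    result = []
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        name = FAMILY_NAMES.get(family, str(family))
        result.append((name, sockaddr[0]))
    return result


def prefers_ipv6(infos):
    """True if the first address the resolver offers is an IPv6 one."""
    pairs = summarize(infos)
    return bool(pairs) and pairs[0][0] == "IPv6"


if __name__ == "__main__":
    host = "nas1.example.com"  # placeholder, substitute your NAS hostname
    port = 445                 # assumed: CIFS/SMB; use 2049 for NFSv4
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for name, addr in summarize(infos):
        print(name, addr)
    if prefers_ipv6(infos):
        print("resolver offers IPv6 first; check the host really has a v6 address")
```

If IPv6 comes back first and "ip -6 addr" shows no global address on the
client, that would at least be consistent with the hang described above.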

      mark