Kenneth Porter wrote:
--On Friday, October 19, 2018 2:33 PM -0700 Elliott Balsley elliott@altsystems.com wrote:
I don't have a solution, but I wanted to point out this same hang happened to me recently with a Myricom 10Gb card. Apparently Myricom drivers do not support SMB connections on CentOS 7, although HTTP traffic works fine. I solved it by switching to a different NIC.
The mount works fine for me. It's only the automount that hangs, and only since a few months ago.
I had it happen again today when my LetsEncrypt cert renewed and the dovecot (IMAP) server restarted. Dovecot checks all the mountpoints (in case any have mail folders on them) and hung on restart. I shelled in and ran df, and it also hung. I logged in on yet another session and tried to ls the mountpoint, and that hung while completing the directory name.
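For anyone who wants to check a mountpoint without losing a shell to the same hang, here is a minimal Python sketch. It lists the directory in a child process and gives up after a timeout, so only the child gets stuck. The function name probe_path and the timeout values are made up for illustration, and I'm assuming that listing the directory is enough to trigger the automount.

```python
import subprocess
import sys
import time


def probe_path(path, timeout=5.0, poll_interval=0.05):
    """List `path` in a child process; return 'ok', 'error', or 'timeout'.

    The listing runs in a separate process so that a hung mount blocks
    only the child, never the caller.
    """
    child_code = "import os, sys; os.listdir(sys.argv[1])"
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        rc = proc.poll()
        if rc is not None:
            return "ok" if rc == 0 else "error"
        time.sleep(poll_interval)
    # One last check in case the child finished right at the deadline.
    rc = proc.poll()
    if rc is not None:
        return "ok" if rc == 0 else "error"
    # Best effort only: a task in uninterruptible sleep (state D) ignores
    # SIGKILL until the blocking call returns, so we do not wait for it.
    proc.kill()
    return "timeout"


if __name__ == "__main__":
    for mountpoint in sys.argv[1:]:
        print(mountpoint, probe_path(mountpoint))
```

Polling a Popen object by hand is deliberate. subprocess.run() with a timeout kills the child and then waits for it to exit, and that wait can itself hang if the child is stuck in D state.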
Here's what I see in /var/log/messages when dovecot hangs and I manually mount the shares from another shell session. SELinux is in permissive mode.
Oct 26 09:11:39 saruman systemd: Mounting NAS1 share 1...
Oct 26 09:11:39 saruman systemd: Failed to expire automount, ignoring: No such device
Oct 26 09:11:39 saruman systemd: Mounted NAS1 share 1.
Oct 26 09:11:45 saruman kernel: INFO: task dovecot:831 blocked for more than 120 seconds.
Oct 26 09:11:45 saruman kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 26 09:11:45 saruman kernel: dovecot D ffff9994adfa3f40 0 831 1 0x00000084
Oct 26 09:11:45 saruman kernel: Call Trace:
Oct 26 09:11:45 saruman kernel: [<ffffffff85f1890c>] ? __schedule+0x41c/0xa20
Oct 26 09:11:45 saruman kernel: [<ffffffff85f18f39>] schedule+0x29/0x70
Oct 26 09:11:45 saruman kernel: [<ffffffff85f168a9>] schedule_timeout+0x239/0x2c0
Oct 26 09:11:45 saruman kernel: [<ffffffff858beb96>] ? finish_wait+0x56/0x70
Oct 26 09:11:45 saruman kernel: [<ffffffff85f16ff2>] ? mutex_lock+0x12/0x2f
Oct 26 09:11:45 saruman kernel: [<ffffffff85ab4e00>] ?
<snip>
Wait a minute: are you running IPv6? What we see is that NFSv4 goes preferentially for IPv6, even if the system doesn't get its IPv6 address, and if it has the address and loses it, it will *NOT* fall back to IPv4, but hangs.
mark
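
A quick way to see whether IPv6 is even in play for the server name is to ask the resolver what it returns. This is only a sketch: addresses_by_family is a made-up helper, nas1.example.com is a placeholder hostname, and port 2049 is the standard NFS port (substitute 445 for an SMB share). It shows what the resolver hands back, not which address the kernel client actually picks.

```python
import socket


def addresses_by_family(host, port=2049):
    """Group the addresses the resolver returns for `host` by family."""
    result = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        if family == socket.AF_INET:
            key = "IPv4"
        elif family == socket.AF_INET6:
            key = "IPv6"
        else:
            continue  # ignore any other address family
        address = sockaddr[0]
        if address not in result[key]:
            result[key].append(address)
    return result


if __name__ == "__main__":
    # Placeholder hostname; replace with the real NAS name.
    print(addresses_by_family("nas1.example.com"))
```

If the IPv6 list comes back non-empty while the client has no working IPv6 address, that would fit the behaviour described above.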