[CentOS] NFS client kernel panic

Thu Sep 10 08:38:20 UTC 2015
David Goudet <david.goudet at lyra-network.com>

Hi everyone,

I have an issue with the NFS client when it loses its connection to the NFS server.
Under certain conditions the NFS client freezes and the result is a kernel panic.
The kernel panic seems to occur when the NFS client tries to unmount the unreachable remote NFS partition.

This happens about once every two days.

CentOS version: CentOS Linux release 7.1.1503 (Core)
Kernel version: Linux foo.bar 3.10.0-229.4.2.el7.x86_64 #1 SMP Wed May 13 10:06:09 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
NFS packages: nfs-utils-1.3.0-0.8.el7.x86_64, libnfsidmap-0.25-11.el7.x86_64

This issue is present only on the CentOS 7 NFS client (it does not occur on CentOS 6).

Any ideas? Is this a known problem with the CentOS 7 NFS client?

Logs and stack trace follow:

nfs: server foo.bar not responding, timed out
nfs: server foo.bar not responding, timed out
nfs: server foo.bar not responding, timed out
nfs: server foo.bar not responding, timed out
nfs: server foo.bar not responding, timed out
BUG: Dentry ffff8800ca82f380{i=138c0,n=foo.png} still in use (2) [unmount of nfs4 0:41]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:945!
invalid opcode: 0000 [#1] SMP 
Modules linked in: ipt_REJECT rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache binfmt_misc xt_multiport iptable_filter ip_tables dm_crypt nfsd mgag200 auth_rpcgss syscopyarea nfs_acl sysfillrect lockd sysimgblt i2c_algo_bit ttm coretemp sunrpc drm_kms_helper iTCO_wdt iTCO_vendor_support i7core_edac ipmi_devintf lpc_ich kvm drm bnx2 ipmi_si dcdbas crc32c_intel serio_raw pcspkr i2c_core mfd_core edac_core ipmi_msghandler shpchp acpi_power_meter xfs libcrc32c sd_mod crc_t10dif crct10dif_common sr_mod cdrom ata_generic pata_acpi mptsas scsi_transport_sas ata_piix mptscsih libata mptbase dm_mirror dm_region_hash dm_log dm_mod
CPU: 12 PID: 21364 Comm: umount.nfs4 Not tainted 3.10.0-229.4.2.el7.x86_64 #1
Hardware name: Dell Inc. PowerEdge R410/0N051F, BIOS 1.2.4 11/02/2009
task: ffff880075b3ad80 ti: ffff88007744c000 task.ti: ffff88007744c000
RIP: 0010:[<ffffffff811dddcc>]  [<ffffffff811dddcc>] shrink_dcache_for_umount_subtree+0x1ac/0x1c0
RSP: 0018:ffff88007744fe10  EFLAGS: 00010246
RAX: 000000000000005e RBX: ffff8800ca82f380 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88022fccd488 RDI: 0000000000000246
RBP: ffff88007744fe28 R08: 0000000000000096 R09: 0000000000000627
R10: 0000000000000000 R11: ffff88007744fb26 R12: ffff8801ffb08780
R13: ffffffffa04f27a0 R14: ffff88021f3c0f40 R15: ffff88021f3c0f20
FS:  00007ff58b134880(0000) GS:ffff88022fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff58aad15a0 CR3: 00000001ffe7f000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
ffff880036a9fb20 ffff880036a9f800 ffff880036a9a000 ffff88007744fe40
ffffffff811df969 ffff880036a9f800 ffff88007744fe68 ffffffff811c8fe1
0000000000000029 ffff880036a9a000 ffff88021f3c0f20 ffff88007744fe80
Call Trace:
[<ffffffff811df969>] shrink_dcache_for_umount+0x49/0x60
[<ffffffff811c8fe1>] generic_shutdown_super+0x21/0xe0
[<ffffffff811c9282>] kill_anon_super+0x12/0x20
[<ffffffffa041827b>] nfs_kill_super+0x1b/0x30 [nfs]
[<ffffffff811c962d>] deactivate_locked_super+0x3d/0x60
[<ffffffff811c9c36>] deactivate_super+0x46/0x60
[<ffffffff811e6ac5>] mntput_no_expire+0xc5/0x120
SELinux: initialized (dev 0:40, type nfs4), uses genfs_contexts
[<ffffffff811e7bff>] SyS_umount+0x9f/0x3c0
[<ffffffff81614de9>] system_call_fastpath+0x16/0x1b
Code: 00 00 48 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 1b 48 8b 50 40 48 89 34 24 48 c7 c7 20 f1 83 81 48 89 de 31 c0 e8 f0 0a 42 00 <0f> 0b 31 d2 eb e5 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 
RIP  [<ffffffff811dddcc>] shrink_dcache_for_umount_subtree+0x1ac/0x1c0
RSP <ffff88007744fe10>
---[ end trace ae487f589c43fd74 ]---
Kernel panic - not syncing: Fatal exception
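
In case it helps anyone compare their own logs against this signature, here is a minimal Python sketch that pulls the fields out of the "Dentry ... still in use" line shown above. The function name find_busy_dentries and the regular expression are illustrative only and are not part of any existing tool. The pattern is derived solely from the single line in this log, and it assumes the i= value is a hexadecimal inode number.

```python
import re

# Pattern modelled on the one "still in use" line in the log above.
# Assumption: the i= field is a hexadecimal inode number.
DENTRY_BUG = re.compile(
    r"BUG: Dentry (?P<addr>[0-9a-f]+)\{i=(?P<inode>[0-9a-f]+),n=(?P<name>[^}]*)\} "
    r"still in use \((?P<count>\d+)\) "
    r"\[unmount of (?P<fstype>\S+) (?P<dev>[^\]]+)\]"
)


def find_busy_dentries(log_text):
    """Return one dict per 'Dentry ... still in use' line found in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = DENTRY_BUG.search(line)
        if match:
            hit = match.groupdict()
            hit["inode"] = int(hit["inode"], 16)  # hex string to integer
            hit["count"] = int(hit["count"])      # reference count reported by the kernel
            hits.append(hit)
    return hits


# The line from the log above, used as sample input.
sample = (
    "BUG: Dentry ffff8800ca82f380{i=138c0,n=foo.png} "
    "still in use (2) [unmount of nfs4 0:41]"
)

for hit in find_busy_dentries(sample):
    print(hit["name"], hit["inode"], hit["count"], hit["fstype"], hit["dev"])
```

Running it over saved console or kdump output would list each file name that still held references at unmount time, which may help narrow down which application keeps files open on the mount.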

Thank you for your attention.