Hello all, I'm currently experiencing an issue with an NFS server I've built (a Dell R710 with a Dell PERC H800/LSI 2108 and four external disk trays). It's a backup target for Solaris 10, CentOS 5.5, and CentOS 6.2 servers that mount its data volume via NFS. It has two 10gig NICs in a layer2+3 bond on one network, and two more 10gig NICs bonded the same way on a second network. The host exports a 99T XFS filesystem for the backups, and RPCNFSDCOUNT is set to 256.
During backups from clients the system exhibits odd hangs that interfere with some of our more sensitive systems' backup windows. On the NFS server side we see the messages below in dmesg. I originally suspected the dirty writeback cache, but lowering dirty_writeback_centisecs hasn't made the problem go away.
dmesg during the problem window:
Mar 16 07:01:21 *****store01 kernel: __ratelimit: 11 callbacks suppressed
Mar 16 07:01:21 *****store01 kernel: nfsd: page allocation failure. order:2, mode:0x20
Mar 16 07:01:21 *****store01 kernel: Pid: 6041, comm: nfsd Not tainted 2.6.32-220.4.2.el6.x86_64 #1
Mar 16 07:01:21 *****store01 kernel: Call Trace:
Mar 16 07:01:21 *****store01 kernel: <IRQ>  [<ffffffff81123daf>] ? __alloc_pages_nodemask+0x77f/0x940
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8115dc62>] ? kmem_getpages+0x62/0x170
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8115e87a>] ? fallback_alloc+0x1ba/0x270
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8115e2cf>] ? cache_grow+0x2cf/0x320
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8115e5f9>] ? ____cache_alloc_node+0x99/0x160
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8115f4bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8115f6fb>] ? __kmalloc_node+0x7b/0x100
Mar 16 07:01:21 *****store01 kernel: [<ffffffff81461e65>] ? ip_rcv+0x275/0x350
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
Mar 16 07:01:21 *****store01 kernel: [<ffffffff814219e6>] ? __netdev_alloc_skb+0x36/0x60
Mar 16 07:01:21 *****store01 kernel: [<ffffffffa0188104>] ? ixgbe_alloc_rx_buffers+0x2c4/0x380 [ixgbe]
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8127f980>] ? swiotlb_map_page+0x0/0x100
Mar 16 07:01:21 *****store01 kernel: [<ffffffffa0189158>] ? ixgbe_clean_rx_irq+0x818/0x8b0 [ixgbe]
Mar 16 07:01:21 *****store01 kernel: [<ffffffffa01895ff>] ? ixgbe_clean_rxtx_many+0x10f/0x220 [ixgbe]
Mar 16 07:01:21 *****store01 kernel: [<ffffffff814307c3>] ? net_rx_action+0x103/0x2f0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff810d9390>] ? handle_IRQ_event+0x60/0x170
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8107205a>] ? __do_softirq+0x11a/0x1d0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff81071de5>] ? irq_exit+0x85/0x90
Mar 16 07:01:21 *****store01 kernel: [<ffffffff814f4c85>] ? do_IRQ+0x75/0xf0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
Mar 16 07:01:21 *****store01 kernel: <EOI>  [<ffffffff8105673f>] ? finish_task_switch+0x4f/0xe0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff814ec9ce>] ? thread_return+0x4e/0x760
Mar 16 07:01:21 *****store01 kernel: [<ffffffff81123741>] ? __alloc_pages_nodemask+0x111/0x940
Mar 16 07:01:21 *****store01 kernel: [<ffffffff814ed7b2>] ? schedule_timeout+0x192/0x2e0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8107c0a0>] ? process_timeout+0x0/0x10
Mar 16 07:01:21 *****store01 kernel: [<ffffffffa0319415>] ? svc_recv+0x5a5/0x850 [sunrpc]
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8105e7f0>] ? default_wake_function+0x0/0x20
Mar 16 07:01:21 *****store01 kernel: [<ffffffffa03fcb45>] ? nfsd+0xa5/0x160 [nfsd]
Mar 16 07:01:21 *****store01 kernel: [<ffffffffa03fcaa0>] ? nfsd+0x0/0x160 [nfsd]
Mar 16 07:01:21 *****store01 kernel: [<ffffffff81090726>] ? kthread+0x96/0xa0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Mar 16 07:01:21 *****store01 kernel: [<ffffffff81090690>] ? kthread+0x0/0xa0
Mar 16 07:01:21 *****store01 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
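If I'm reading the trace right, these are atomic (mode:0x20) order:2 allocations failing while ixgbe refills its RX ring, which would point at free memory being too fragmented to hand out contiguous 4-page blocks during heavy writeback. One thing I've started doing is watching /proc/buddyinfo for how many order >= 2 blocks each zone has free. A rough sketch, assuming the standard buddyinfo layout ("Node N, zone <name>" followed by counts for orders 0-10); order2_free is just a name I made up for this:

```shell
# Sum the free blocks at order >= 2 for each zone in buddyinfo-style
# input. Fields 1-4 are "Node N, zone <name>"; fields 5+ are the
# per-order free-block counts (order 0 at field 5, so order 2 starts
# at field 7).
order2_free() {
  awk '{ s = 0; for (i = 7; i <= NF; i++) s += $i; print $4, s }'
}

# On the server itself I'd run:
#   order2_free < /proc/buddyinfo
# Illustrative input: a zone with only 3 order-2 and 1 order-3 blocks free.
echo "Node 0, zone Normal 100 50 3 1 0 0 0 0 0 0 0" | order2_free
# -> Normal 4
```

When the per-zone numbers approach zero during a backup window, that would line up with the allocation failures in dmesg.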
xfs_info:
# xfs_info /data
meta-data=/dev/sdb1              isize=256    agcount=99, agsize=268435200 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=26367491584, imaxpct=1
         =                       sunit=256    swidth=9216 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
cat /etc/centos-release: CentOS release 6.2 (Final)
uname -a: Linux *****store01 2.6.32-220.4.2.el6.x86_64 #1 SMP Tue Feb 14 04:00:16 GMT 2012 x86_64 x86_64 x86_64 GNU/Linux
lspci output:
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
00:04.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 4 (rev 13)
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 13)
00:06.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 6 (rev 13)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
00:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 13)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
00:1a.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA Controller [IDE mode] (rev 02)
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
04:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
08:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
/etc/sysctl.conf changes:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 2621440 16777216
net.ipv4.tcp_wmem = 4096 2621440 16777216
net.core.netdev_max_backlog = 250000
net.ipv4.route.flush = 1
net.ipv4.tcp_window_scaling = 1
vm.dirty_writeback_centisecs = 50
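Since these are atomic allocations that can't wait for reclaim, one workaround I've seen suggested (but haven't applied yet) is raising vm.min_free_kbytes so the kernel keeps a larger reserve of free pages for exactly this kind of request. A sketch of what I'd add to the file above; the value is purely illustrative, not a tested recommendation:

```
# Candidate addition to /etc/sysctl.conf (value is an example only):
# keep a larger free-page reserve so atomic order>0 allocations in the
# NIC RX path are less likely to fail under heavy writeback.
vm.min_free_kbytes = 262144
```

If anyone has experience with sensible values for this on a box with this much RAM and 10GbE, I'd appreciate input before I change it.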
Has anyone else seen similar issues? I can provide additional details about the server/configuration if anybody needs anything else. The issue only seems to occur under heavy write load; we've restored some of these backups and reading the data back didn't cause any problems.
Thanks all, -Aaron