Never mind, commit 1a040eaca1a2 (bridge: fix multicast router rlist endless loop) fixes it.
-----Original Message-----
From: wuzhouhui wuzhouhui14@mails.ucas.ac.cn
Sent: 2018-01-10 15:19:09 (Wednesday)
To: centos@centos.org
Cc: wuzhouhui14 wuzhouhui14@mails.ucas.ac.cn
Subject: soft lockup after set multicast_router of bridge and it's port to 2
OS: CentOS 6.5.
After I set multicast_router of the bridge and its port to 2, as follows:

    echo 2 > /sys/devices/virtual/net/eth81/bridge/multicast_router
    echo 2 > /sys/devices/virtual/net/bond2/brport/multicast_router

a soft lockup occurred:

    Message from syslogd@node-0 at Jan 9 15:47:12 ...
    kernel:BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

And the call trace is:

    RIP: 0010:[<ffffffffa04f3608>] [<ffffffffa04f3608>] br_multicast_flood+0x88/0x140 [bridge]
    RSP: 0018:ffff88013bc038f0 EFLAGS: 00000246
    RAX: ffff88404f816020 RBX: ffff88013bc03940 RCX: ffff88204e40a640
    RDX: ffff882002b9ce01 RSI: ffff882002b9ce80 RDI: 0000000000000000
    RBP: ffffffff8100bb93 R08: 0000000000000001 R09: 00000000ff09f4a1
    R10: ffff88202c884070 R11: 0000000000000000 R12: ffff88013bc03870
    R13: ffff882002b9ce80 R14: ffff88013bc03860 R15: ffffffff8151b225
    FS: 0000000000000000(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00007fa11a942000 CR3: 0000000001a85000 CR4: 00000000001407e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
    Stack:
     880be7e100028813 ffff882002b9ce80 ffff882002b9ce80 ffffffffa04f3930
    <d> 00000000880be7e1 ffff882002b9ce80 ffff882002b9ce80 ffff88200286c042
    <d> ffff88202ae7c6e0 ffff882002b9ceb8 ffff88013bc03950 ffffffffa04f36d5
    Call Trace:
     <IRQ>
     [<ffffffffa04f3930>] ? __br_forward+0x0/0xd0 [bridge]
     [<ffffffffa04f36d5>] ? br_multicast_forward+0x15/0x20 [bridge]
     [<ffffffffa04f4a34>] ? br_handle_frame_finish+0x144/0x2a0 [bridge]
     [<ffffffffa04fa938>] ? br_nf_pre_routing_finish+0x238/0x350 [bridge]
     [<ffffffffa04faedb>] ? br_nf_pre_routing+0x48b/0x7b0 [bridge]
     [<ffffffff8143ba57>] ? __kfree_skb+0x47/0xa0
     [<ffffffff814734f9>] ? nf_iterate+0x69/0xb0
     [<ffffffffa04f48f0>] ? br_handle_frame_finish+0x0/0x2a0 [bridge]
     [<ffffffff814736b6>] ? nf_hook_slow+0x76/0x120
     [<ffffffffa04f48f0>] ? br_handle_frame_finish+0x0/0x2a0 [bridge]
     [<ffffffffa04f4d1c>] ? br_handle_frame+0x18c/0x250 [bridge]
     [<ffffffff81445709>] ? __netif_receive_skb+0x529/0x750
     [<ffffffff814397da>] ? __alloc_skb+0x7a/0x180
     [<ffffffff814492f8>] ? netif_receive_skb+0x58/0x60
     [<ffffffff81449400>] ? napi_skb_finish+0x50/0x70
     [<ffffffff8144ab79>] ? napi_gro_receive+0x39/0x50
     [<ffffffffa016887f>] ? bnx2x_rx_int+0x83f/0x1630 [bnx2x]
     [<ffffffff810608dc>] ? perf_event_task_sched_out+0x4c/0x70
     [<ffffffffa01698ae>] ? bnx2x_poll+0x23e/0x2f0 [bnx2x]
     [<ffffffff8144ac93>] ? net_rx_action+0x103/0x2f0
     [<ffffffff8107a811>] ? __do_softirq+0xc1/0x1e0
     [<ffffffff810e6b30>] ? handle_IRQ_event+0x60/0x170
     [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
     [<ffffffff8100fa75>] ? do_softirq+0x65/0xa0
     [<ffffffff8107a6c5>] ? irq_exit+0x85/0x90
     [<ffffffff8151b165>] ? do_IRQ+0x75/0xf0
     [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
     <EOI>
     [<ffffffff81016627>] ? mwait_idle+0x77/0xd0
     [<ffffffff815176fa>] ? atomic_notifier_call_chain+0x1a/0x20
     [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
     [<ffffffff814f6e3a>] ? rest_init+0x7a/0x80
     [<ffffffff81c25f70>] ? start_kernel+0x405/0x411
     [<ffffffff81c2533a>] ? x86_64_start_reservations+0x125/0x129
     [<ffffffff81c25453>] ? x86_64_start_kernel+0x115/0x124
Does anyone know the reason?