Does anyone have experience with any TCP offload engine (TOE) cards? I've been having a really hard time finding info on any cards that actually support this under Linux. (Most of the cards work, but I don't see drivers that actually offload the TCP stack.)
I have also seen some comments suggesting that kernel developers don't like the idea.
I'm running CentOS 4.6 64-bit on a dual proc quad core system with 8GB memory.
I'm trying to make the most of my hardware here: a web application that serves up hundreds of tiny requests per second (each typically taking under 100ms to complete).
With my load balancer in its "normal" mode I can pump about 1000 requests a second through the system before exhausting the TCP stack on the server (all ~65k sockets in use). CPU usage tops out at about 75%. The "normal" mode terminates idle connections forcefully after 25 seconds.
If I tell my load balancer to terminate idle connections *immediately* instead of waiting 25 seconds, I get similar CPU usage but transactions per second drop by about 30-33%.
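One way to see where the ~65k sockets are going is to tally them by TCP state. The following is only a minimal sketch, assuming a Linux box where /proc/net/tcp is readable and only IPv4 sockets matter; the function name count_tcp_states is made up for illustration.

```python
import os
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(lines):
    """Count sockets per TCP state from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Data rows start with a slot number such as "0:"; this skips
        # the header row and any blank lines.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts

if __name__ == "__main__" and os.path.exists("/proc/net/tcp"):
    with open("/proc/net/tcp") as f:
        for state, n in count_tcp_states(f).most_common():
            print(f"{state:12s} {n}")
```

Running this during a load test with each load balancer setting would show whether the sockets are mostly established connections or connections waiting to be cleaned up after close.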
During my tests I've also gotten 4 kernel panics. I captured the last two and they are virtually identical, except that one time it panicked in the java process and the other time in the swapper process. Both times the load balancer was configured to terminate connections faster than 25 seconds. I've done several searches and can't find anything remotely related to the panic below.
Kernel BUG at tcp_output:943
invalid operand: 0000 [1] SMP
CPU 2
Modules linked in: dell_rbu md5 ipv6 autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc dm_mirror dm_mod joydev button battery ac uhci_hcd ehci_hcd hw_random shpchp bnx2 ext3 jbd ata_piix libata megaraid_sas sd_mod scsi_mod
Pid: 10645, comm: java Not tainted 2.6.9-67.ELsmp
RIP: 0010:[<ffffffff802e1a12>] <ffffffff802e1a12>{tcp_retransmit_skb+639}
RSP: 0000:0000010006963e98 EFLAGS: 00010202
RAX: 00000101687cd9c0 RBX: 00000102266d2a80 RCX: 00000101687cd9c0
RDX: 00000101dec72e80 RSI: 0000000000000350 RDI: 0000000000000010
RBP: 000001015c0f7700 R08: 0000000000000008 R09: 0000000000000100
R10: 0000000000000000 R11: 0000000000000008 R12: 000001007b64ccb0
R13: 00000000000005b4 R14: 000001007b64ccb0 R15: 000001007b64c980
FS: 0000000048fe0960(005b) GS:ffffffff804f2e80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002adc82e028 CR3: 000000000694e000 CR4: 00000000000006e0
Process java (pid: 10645, threadinfo 0000010055cba000, task 000001005633c7f0)
Stack: 0000010006963ec8 0000020000102a93 000001007b64ccb0 0000000000000001
       000001007b64c980 0000000000000008 000001007b64ccb0 000001007b64ccb0
       0000002c273fc4d0 ffffffff802e3f6d
Call Trace:<IRQ> <ffffffff802e3f6d>{tcp_write_timer+1059} <ffffffff802e3b4a>{tcp_write_timer+0}
       <ffffffff80140945>{run_timer_softirq+356} <ffffffff8013cff0>{__do_softirq+88}
       <ffffffff8013d099>{do_softirq+49} <ffffffff80110bf5>{apic_timer_interrupt+133}
       <EOI>
Code: 0f 0b 48 c1 34 80 ff ff ff ff af 03 48 8b 43 10 4c 63 f6 ff
RIP <ffffffff802e1a12>{tcp_retransmit_skb+639} RSP <0000010006963e98>
 <0>Kernel panic - not syncing: Oops
From what I can tell, the Broadcom chips in this server support TOE, but the driver in Linux does not implement it. I'd like to find a card that really has TOE in Linux, if for nothing else than to compare how it performs against what I have now. The goal is to pump out the highest number of transactions per second possible for the hardware and power used.
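For checking what a driver does offload, the output of `ethtool -k <interface>` can be parsed. This is a rough sketch only: the interface name eth0 and the helper names are assumptions, and as far as I know this listing covers stateless offloads such as checksumming and TCP segmentation, not a full TOE, so it shows what the stock driver offers rather than proving TOE support.

```python
import shutil
import subprocess

def parse_offload_features(text):
    """Turn `ethtool -k` style output into a {feature name: bool} dict."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Header lines such as "Features for eth0:" have no value; skip them.
        if not sep or not value:
            continue
        # The value may carry a suffix like "[fixed]"; only the first
        # word ("on" or "off") matters here.
        features[name.strip()] = value.split()[0] == "on"
    return features

def read_offload_features(iface):
    """Run ethtool for one interface (hypothetical helper) and parse it."""
    result = subprocess.run(["ethtool", "-k", iface],
                            capture_output=True, text=True, check=False)
    return parse_offload_features(result.stdout)

if __name__ == "__main__" and shutil.which("ethtool"):
    # "eth0" is an assumed interface name; substitute the real one.
    for name, enabled in sorted(read_offload_features("eth0").items()):
        print(f"{name}: {'on' if enabled else 'off'}")
```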
thanks
nate
On Fri, 13 Jun 2008 at 10:52am, nate wrote:
> Anyone have experience with any? I've been having a real hard time finding info on any cards that actually support this under linux. (most of the cards work but I don't see drivers that actually offload the TCP stack)
> I have seen some comments where kernel developers don't like the idea as well.
A pretty good discussion of this just occurred over on the beowulf mailing list. See http://marc.info/?t=121079210300009&r=1&w=2.
Joshua Baker-LePain wrote:
> A pretty good discussion of this just occurred over on the beowulf mailing list. See http://marc.info/?t=121079210300009&r=1&w=2.
Thanks! It seems I should just stop looking; it's not cost-effective for my purposes. Sigh.
nate