[CentOS] Re: 10Gbit ethernet

Fri Mar 14 17:19:54 UTC 2008
Chris Payne <Chris.Payne at triumf.ca>

On Fri, Mar 14, 2008 at 11:06:18AM +0000, Jake Grimmett wrote:
> I'm probably not going to feed 10Gb to the individual blades, as I have few 
> MPI users, though it's an option with IBM and HP blades. However, IBM and 
> Dell offer a 10Gbit XFP uplink to the blade servers' internal switch, and 
> this has to be worthwhile with 56 CPUs on the other side of it.
> 
> I'm most concerned about whether anyone has tried the Netxen or Chelsio 10Gbit 
> NICs on Centos 5.1; I see drivers in /lib/modules for these...
> 
> Also - do people have good / bad experiences of CX4 cabling? As an economical 
> short range solution (<15 m) it seems ideal for a server room, but I have a 
> sales rep who is trying to scare me off; then again, he's biased, as the 10Gb 
> SR XFP transceivers are very expensive (~£820)...

Jake--

Although not completely authoritative, I can share with you our recent 
experience with a similar setup here at the ATLAS Tier-1 at TRIUMF.

We have several IBM BladeCenter H chassis, each blade with dual dual-core CPUs 
(i.e. 4 cores/blade), so 56 CPUs per chassis. Each chassis has a 10GigE (SR XFP) 
uplink to our Force10 router and sits on its own private VLAN, with static 
routes to the storage (public IP) nodes.
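
For what it's worth, the static routes on the blades are just the usual 
per-interface route files; a sketch with made-up addresses (our real subnets 
differ):

    # /etc/sysconfig/network-scripts/route-eth0   (addresses are examples only)
    # route the storage (public) subnet via the chassis gateway on the Force10
    192.0.2.0/24 via 10.10.1.1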

Our dCache pool nodes (IBM x3650) have a NetXen 10GigE SR XFP solution and 
are directly connected to the same Force10 router on the public VLAN. Since 
we are on SL4.5 we are using NetXen's own driver, as the native kernel driver 
has not yet been backported (or has it now?).

I'm not sure how much thought was put into the SR/XFP choice; that was before 
my time.
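
On your driver question, a quick way to see what a given kernel actually ships 
is simply:

    # look for NetXen / Chelsio modules under the running kernel
    find /lib/modules/$(uname -r) -name 'netxen*' -o -name 'cxgb*'
    modinfo netxen_nic          # module details, if the in-tree driver is there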

Throughput is good in raw tests (iperf etc.), but we saw issues with our 
production transfer applications in certain circumstances, specifically when 
running multiple GridFTP transfers with multiple parallel streams each (see 
http://www.globus.org/toolkit/data/gridftp/). I believe 10 concurrent transfers 
with 10 streams each (not my department) would cause the network card to "lock 
up", and connectivity was completely lost. Generally this took only a matter of 
minutes to trigger.
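
For reference, the raw tests were plain iperf, and the production load was 
roughly the shape below (hosts and paths are invented; the exact stream counts 
came from the transfer people, not me):

    # raw throughput check between a blade and a pool node
    iperf -s                                   # on the pool node
    iperf -c pool01.example.org -P 4 -t 60     # on the blade

    # approximate shape of the load that triggered the lock-up:
    # ~10 concurrent GridFTP transfers, each with ~10 parallel streams
    for i in $(seq 1 10); do
        globus-url-copy -p 10 file:///data/test$i \
            gsiftp://pool01.example.org/data/test$i &
    done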

Using the RSA interface on the x3650 we could still get in, but there was 
nothing in the logs, dmesg, etc. From there we could stop networking, remove 
the kernel module, and restart networking to recover. However, if the transfers 
were still retrying, the card would soon lock up again, and so on. Occasionally 
rmmod'ing it would cause a kernel oops, but that was not reproducible as far as 
I could tell. If the transfers were killed, the machine generally recovered.
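
The recovery sequence over the RSA console was essentially the following (the 
module name depends on whether you run NetXen's driver or the in-tree one, so 
treat this as a sketch):

    service network stop
    rmmod netxen_nic        # or the NetXen-supplied module, whichever is loaded
    modprobe netxen_nic
    service network start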

We verified the problem was localized to the 10GigE card by bonding the onboard 
1GigE cards to get comparable rates and successfully running the same test 
transfers over the bond.
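
The bonding for that test was the stock EL4-style setup, something like this 
(addresses and mode are illustrative; pick a mode your switch supports):

    # /etc/modprobe.conf
    alias bond0 bonding
    options bonding mode=balance-rr miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BOOTPROTO=static
    IPADDR=192.0.2.10
    NETMASK=255.255.255.0
    ONBOOT=yes

    # /etc/sysconfig/network-scripts/ifcfg-eth0   (and likewise ifcfg-eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes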

Working with NetXen, we went through several iterations of firmware and driver 
updates, and now have a combination that has been stable for about two weeks. 
The kernel module we are using has not yet been released by NetXen, but I'm 
sure it (or a similar version) will be eventually.
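
If you go down the same road, it's worth recording exactly which driver and 
firmware combination you end up on; ethtool will report both:

    ethtool -i eth2         # interface name is just an example
    # prints the driver, version, firmware-version and bus-info for the NIC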

Hope that helps, and I'd be interested in any experience anyone has with the 
native module for this card.

Cheers
Chris
--
Chris Payne			chris.payne at triumf.ca
TRIUMF ATLAS Tier-1 System Administrator - Networking
TRIUMF				+1 604 222 7554
4004 Wesbrook Mall, Vancouver, BC, V6T2A3, CANADA