If I could ask a question about 10Gbit ethernet...
We have a 70 node cluster built on CentOS 5.1, using NFS to mount user home areas. At the moment the network is a bottleneck, and it's going to get worse as we add another 112 CPUs in the form of two blade servers.
To help things breathe better, we are considering building three new NFS servers and connecting these and the blade servers directly via 10Gbit to a new core switch. The older nodes will stay on 1Gbit ethernet, but be given new switches uplinked via 10Gbit to the core switch.
Before I spend a great deal of money, can I ask if anyone here has experience of 10Gbit? Dell, HP, Supermicro, IBM (etc.) seem to be pushing the NetXen PCIe cards, so I guess these drivers work...? As to media, although CX4 seems cheaper than optical, I hear the cabling is nasty. And is the magical fairy going to fix 10Gbit over Cat6A anytime soon?
Any thoughts appreciated.
Jake --
Jake Grimmett wrote:
To help things breathe better, we are considering building three new NFS servers and connecting these and the blade servers directly via 10Gbit to a new core switch. The older nodes will stay on 1Gbit ethernet, but be given new switches uplinked via 10Gbit to the core switch.
I would seriously evaluate whether you need 10GbE to the nodes, or whether using it prodigiously within the switching fabric can give you the performance you need.
10GbE brings with it all kinds of other issues you may not have thought about; driver compatibility and stability are just one example. With the increased speed come increased CPU usage and interrupt load, as well as demands on PCI bus bandwidth.
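A rough sanity check before spending anything: see how much interrupt and CPU load the existing GigE NICs already generate under a typical job mix, and whether interrupt coalescing is on. Something like the following, untested, with eth0 only a placeholder for whatever your interface is called (and not every driver supports every ethtool option):

  # per-CPU interrupt counts for the NIC; watch how quickly they grow under load
  grep eth0 /proc/interrupts

  # CPU time spent in hard/soft interrupts (%irq / %soft columns, needs sysstat)
  mpstat -P ALL 5

  # current coalescing settings; raising rx-usecs trades a little latency
  # for fewer interrupts per second
  ethtool -c eth0
  ethtool -C eth0 rx-usecs 100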
With such a large cluster I don't think 10GbE on the nodes is really necessary, as the bandwidth and load are spread out, but I agree you will need some very good switching and VLANing to make it perform at top speed.
I would post this question on one of the networking and switching mailing lists; I am sure you can solicit some good responses from someone who runs a large HPCC.
-Ross
Whereabouts are you maxing out? I assume it's not to the nodes but rather to the NFS head. If that's the case, you might want to look at a different, somewhat more time-tested approach of standing up additional NFS heads (as you already mentioned). Having said that, we are currently using 10G links between a pair of Cisco 6509Es and another pair to a set of Foundry FESX48s (SAN traffic). No real issues to speak of, but this is backbone and not to the server. Our busiest machine, network-wise, is our backup server, and we've just bonded a pair of 1Gbit links to it with great success.
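For what it's worth, the bonding setup on a RHEL/CentOS 5 box looks roughly like this; the interface names, address and the 802.3ad mode are only examples (LACP needs support on the switch side), so treat it as a sketch rather than a recipe:

  # /etc/modprobe.conf : load the bonding driver with LACP and link monitoring
  alias bond0 bonding
  options bond0 mode=802.3ad miimon=100

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  IPADDR=192.168.1.10
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise ifcfg-eth1)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none

  # then restart networking and check the result:
  #   service network restart
  #   cat /proc/net/bonding/bond0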
On Thu, 13 Mar 2008 at 2:31pm, Jake Grimmett wrote
Before I spend a great deal of money, can I ask if anyone here has experience of 10Gbit? Dell, HP, Supermicro, IBM (etc.) seem to be pushing the NetXen PCIe cards, so I guess these drivers work...? As to media, although CX4 seems cheaper than optical, I hear the cabling is nasty. And is the magical fairy going to fix 10Gbit over Cat6A anytime soon?
Any thoughts appreciated.
I've no experience with this myself (yet), but I'd highly recommend taking this question to the beowulf list (http://www.beowulf.org/mailman/listinfo/beowulf) -- you're rather likely to find folks in situations similar to yours.
Jake Grimmett wrote:
We have a 70 node cluster built on CentOS 5.1, using NFS to mount user home areas. At the moment the network is a bottleneck, and it's going to get worse as we add another 112 CPUs in the form of two blade servers.
Jake,
please note that the latest RedHat/CentOS-5.1 kernels have NFS performance issues - if these are (partly) responsible for the bottleneck then you may want to install a kernel which fixes the issue.
Check out RedHat bugzilla 321111.
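A quick way to see whether the kernel is the culprit is to note the running kernel version and time a simple streaming write over NFS from one of the clients, then repeat after updating; the paths and sizes below are only examples:

  # kernel actually running on clients and server
  uname -r

  # crude NFS write-throughput test from a client (1 GB, synced at the end)
  dd if=/dev/zero of=/home/testuser/ddtest bs=1M count=1024 conv=fsync
  rm /home/testuser/ddtest

  # per-operation NFS statistics on the server side
  nfsstat -s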
HTH,
Kay
On Friday 14 March 2008 10:43, Kay Diederichs wrote:
Jake,
please note that the latest RedHat/CentOS-5.1 kernels have NFS performance issues - if these are (partly) responsible for the bottleneck then you may want to install a kernel which fixes the issue.
Check out RedHat bugzilla 321111.
HTH,
Kay
Many thanks for your answers :)
I will certainly talk to the beowulf crowd, and I think you're right that our bottlenecks are *partly* caused by NFS issues.
I'm probably not going to feed 10Gb to the individual blades, as I have few MPI users, though it's an option with IBM and HP blades. However, IBM and Dell offer a 10Gbit XFP uplink to the blade server's internal switch, and this has to be worthwhile with 56 CPUs on the other side of it.
I'm most concerned about whether anyone has tried the NetXen or Chelsio 10Gbit NICs on CentOS 5.1; I see drivers in /lib/modules for these...
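(In case it's useful to anyone checking their own boxes, this is roughly how I'm looking for them; the module names here are just the ones I'd expect, so adjust to whatever the find turns up:)

  # list 10GbE candidate modules shipped with the running kernel
  find /lib/modules/$(uname -r) -name 'netxen*' -o -name 'cxgb*'

  # driver version and supported hardware for a given module
  modinfo netxen_nic
  modinfo cxgb3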
Also - do people have good / bad experiences of CX4 cabling? As an economical short-range solution (<15m) it seems ideal for a server room, but I have a sales rep who is trying to scare me off; he is biased, though, as the 10Gb SR XFP transceivers are very expensive (~£820)...
Many thanks
Jake
On Fri, Mar 14, 2008 at 11:06:18AM +0000, Jake Grimmett wrote:
I'm probably not going to feed 10Gb to the individual blades, as I have few MPI users, though it's an option with IBM and HP blades. However, IBM and Dell offer a 10Gbit XFP uplink to the blade server's internal switch, and this has to be worthwhile with 56 CPUs on the other side of it.
I'm most concerned about whether anyone has tried the NetXen or Chelsio 10Gbit NICs on CentOS 5.1; I see drivers in /lib/modules for these...
Also - do people have good / bad experiences of CX4 cabling? As an economical short-range solution (<15m) it seems ideal for a server room, but I have a sales rep who is trying to scare me off; he is biased, though, as the 10Gb SR XFP transceivers are very expensive (~£820)...
Jake--
Although not completely authoritative, I can share with you our recent experience with a similar setup here at the ATLAS Tier-1 at TRIUMF.
We have several IBM BladeCenter H chassis with dual dual-core CPUs (i.e. 4 cores/blade), so 56 CPUs per chassis. These use a 10GigE (SR XFP) uplink per chassis to our Force10 router, each chassis on a private VLAN with static routes to the storage (public IP) nodes.
Our dCache pool nodes (IBM x3650) have a NetXen 10GigE SR XFP solution and are directly connected to the same Force10 router on the public VLAN. Since we are on SL4.5, we are using the NetXen driver from them, as the native kernel driver has not yet been backported (or has it now?).
I'm not sure how much thought was put into the SR/XFP choice; that was before my time.
Throughput is good in raw tests (iperf etc.), but we saw issues with our production transfer applications in certain circumstances. Specifically, running multiple GridFTP transfers with multiple streams each (see http://www.globus.org/toolkit/data/gridftp/) would cause the network card to "lock up" and connectivity was completely lost; I think it was 10 transfers with 10 streams each (not my department). Generally this took a matter of minutes to happen.
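The raw tests were along these lines (hostnames and counts are only examples, and iperf obviously doesn't reproduce the GridFTP behaviour exactly):

  # on the receiving pool node
  iperf -s

  # on a client: 10 parallel TCP streams for 60 seconds, roughly mimicking
  # a multi-stream transfer
  iperf -c pool-node-01 -P 10 -t 60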
Using the RSA interface on the x3650 we could get in, but there was nothing in the logs, dmesg, etc. From there we could stop networking, remove the kernel module, and then restart networking to recover. However, if the transfers were still retrying it would soon lock up again, and so on. Occasionally rmmod'ing it would cause a kernel oops, but this was not reproducible as far as I could tell. If the transfers were killed, the machine generally recovered.
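In command terms the recovery was roughly the following, run over the RSA console since the NIC itself was unreachable; the module name depends on which NetXen driver you actually have loaded:

  service network stop
  rmmod netxen_nic        # or nx_nic, whichever is in use
  modprobe netxen_nic     # if it is not reloaded automatically on ifup
  service network start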
We verified the problem was localized to the 10GigE card by bonding the onboard 1GigE cards to get similar rates, and successfully performing the same test transfer.
Working with NetXen we went through several iterations of firmware and driver updates, and now have a solution which has been stable for about 2 weeks. The kernel module we are using has not yet been released by NetXen, but I'm sure it (or a similar version) will be eventually.
Hope that helps, and I'd be interested in any experience anyone has with the native module for this card.
Cheers,
Chris
--
Chris Payne  chris.payne@triumf.ca
TRIUMF ATLAS Tier-1 System Administrator - Networking
TRIUMF  +1 604 222 7554
4004 Wesbrook Mall, Vancouver, BC, V6T2A3, CANADA
Jake Grimmett wrote:
Also - do people have good / bad experiences of CX4 cabling? As an economical short-range solution (<15m) it seems ideal for a server room, but I have a sales rep who is trying to scare me off; he is biased, though, as the 10Gb SR XFP transceivers are very expensive (~£820)...
CX4 cabling is OK if you like playing with garden hoses. It refuses to go into cable arms and hasn't got much in the way of flexibility.
The cards are cheaper than fibre; the cables, however, are expensive. Personally I think fibre is the way forward here.
I have used NetXen cards under RH and they seem fine.