I have a bonded interface running in mode 1, which is active/passive, and have no issues with it. I need to change it to mode 0 for an active/active setup. Does mode 0 depend on the switch configuration? My setup: the two links from the bonded interface are connected to different switches.
When I change from mode 1 to mode 0, bond0 does not come up.
These are the steps I performed:
1) changed options bond0 mode=1 miimon=100 to options bond0 mode=0 miimon=100
2) modprobe bonding
3) service network restart
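For reference, a minimal CentOS-style bonding setup spans the module options and the ifcfg files; the sketch below is illustrative only, and the device names, address and file paths are assumptions rather than details from this thread:

    # /etc/modprobe.conf
    alias bond0 bonding
    options bond0 mode=0 miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0  (and likewise for ifcfg-eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none

One thing to watch: the mode= option is read when the bonding module is loaded, so if the module is already loaded, running modprobe again plus a network restart may leave the old mode in place; unloading the module first (or rebooting) avoids that.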
Thanks! Paras.
Paras pradhan wrote:
I have a bonded interface running in mode 1, which is active/passive, and have no issues with it. I need to change it to mode 0 for an active/active setup. Does mode 0 depend on the switch configuration? My setup: the two links from the bonded interface are connected to different switches.
You really should go the 802.3ad route (mode=4) if anything; this does require switch support. You can get unpredictable results with mode=0, and if you want the best performance and availability, stick to 802.3ad, which does require terminating on the same switch (or stack of switches).
Myself, with bonding on Linux I use only mode=1.
Another user like yourself posted on this topic a few months ago asking the same kind of question, and went down the non-802.3ad route and had major issues.
Also note that your single-stream performance will not exceed that of a single link between hosts. So if you're doing a file transfer between two hosts, for example, and you have several 1GbE links between them, the throughput of that transfer will not exceed 1Gbps. Load balancing is done on a per-MAC/IP/TCP-port basis depending on the equipment in use.
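As a sketch only (the values are illustrative, not from this thread), the Linux side of an 802.3ad bond with layer3+4 hashing is just a different options line, but it needs a matching LACP port-channel configured on the switch ports to work properly:

    # /etc/modprobe.conf -- 802.3ad (LACP), hash on IP addresses + TCP/UDP ports
    alias bond0 bonding
    options bond0 mode=4 miimon=100 xmit_hash_policy=layer3+4

xmit_hash_policy only decides which slave each outgoing flow uses; a single TCP stream still hashes onto one slave, which is why one transfer never goes faster than one link.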
10GbE is really cheap these days (cheaper than 1GbE in some cases on a per-Gb basis) if you need faster performance, and it's simple to configure. I wrote a blog post on this a couple of months ago:
http://www.techopsguys.com/2009/11/17/affordable-10gbe-has-arrived/
nate
Nate,
Thanks for your input. 802.3ad seems better, but I am not in a position to terminate both links in the same switch or the same stack. What about mode 6?
Thanks Paras.
Paras pradhan wrote:
Nate,
Thanks for your input. 802.3ad seems better, but I am not in a position to terminate both links in the same switch or the same stack. What about mode 6?
I have an NFS cluster running mode 6 with two systems; it works OK and has been running for a bit over a year now. Each system has 4 NICs and 4 IPs (load balancing done mostly via round-robin DNS).
The systems run CentOS 4.x; they are basically appliances, and everything comes pre-configured by the vendor.
So I can say it can work, and does work when properly configured, though I would not use it myself. The vendor has since moved away from this and is going with 802.3ad for better standards compliance.
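For completeness, a sketch of the mode 6 (balance-alb) options line, with an assumed device name; unlike 802.3ad it needs no switch-side configuration, because inbound balancing is done by the driver answering ARP with different slave MAC addresses:

    # /etc/modprobe.conf -- balance-alb, no switch support required
    alias bond0 bonding
    options bond0 mode=6 miimon=100

That ARP-based balancing is also why behaviour can vary between switches and clients, which fits the caveats above.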
nate
Thanks, nate.
Right now I am not concerned with load balancing. I have a few cluster nodes, and if I run mode 1 then the failover time may lead to a cluster timeout. I have not tested it yet, but I will. So if the failover time from active to backup is very small, in milliseconds, my cluster can afford that and I will stick with mode 1. Is there a way to change this time/duration? Thanks, Paras.
Paras pradhan wrote:
Thanks, nate.
Right now I am not concerned with load balancing. I have a few cluster nodes, and if I run mode 1 then the failover time may lead to a cluster timeout. I have not tested it yet, but I will. So if the failover time from active to backup is very small, in milliseconds, my cluster can afford that and I will stick with mode 1. Is there a way to change this time/duration?
You can change the polling interval with the miimon option, but I'm not sure how precise you can get; it does take some time for the system to register a link failure, if that is in fact what causes the network to fail. Another failure mode is a layer 2 failure, which can be detected in some cases using the ARP monitor.
I think, rather than trying to achieve zero loss on the network with regard to failures, you should tune the cluster to be more tolerant and not raise a fit if the network is down for a second or two.
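As a concrete illustration of both monitoring styles, with example values only (the target IP is a placeholder):

    # MII monitoring: poll carrier state every 100 ms, and wait 200 ms before
    # failing over or re-adding a slave (downdelay/updelay are multiples of miimon)
    options bond0 mode=1 miimon=100 downdelay=200 updelay=200

    # ARP monitoring instead of MII: probe a target IP every 250 ms, which can
    # also catch some failures beyond the local link
    options bond0 mode=1 arp_interval=250 arp_ip_target=192.168.1.1

The currently active slave and the per-slave link failure counts can be checked in /proc/net/bonding/bond0.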
nate
This makes sense, nate. Thanks! Paras.
Hi,
Thanks for your input. 802.3ad seems better, but I am not in a position to terminate both links in the same switch or the same stack.
Some switches support LACP across several devices; for example, the Cisco 3750 with the extended image can "glue" several switches together into one virtual switch and thus provide LACP support across several devices.
Of course there is a good deal of money involved.
By the way, using other modes over multiple switches involves using ISLs (inter-switch links, i.e. direct connections between switches)*. If you do that, you have to make sure algorithms that take time to recalculate (like spanning tree) do not interfere at the moment of a link failure, because then your cluster communication may run into a timeout as well.
Dirk
* The reason is that you have to handle the following case: server A bonds to switch 1 and switch 2 with links 1a and 2a; server B bonds to switch 1 and switch 2 with links 1b and 2b. Now link 1a fails, and before you fix that, link 2b fails as well. Now you are glad to have an ISL. :-)
On 1/13/2010 4:31 PM, nate wrote:
10GbE is really cheap these days (cheaper than 1GbE in some cases on a per-Gb basis) if you need faster performance, and it's simple to configure. I wrote a blog post on this a couple of months ago:
http://www.techopsguys.com/2009/11/17/affordable-10gbe-has-arrived/
Are there good 10Gb cards/drivers for the Linux side?
Les Mikesell wrote:
Are there good 10Gb cards/drivers for the Linux side?
My NAS vendor has deployed many systems with Chelsio 10GbE cards in the field and says they are good.
http://www.chelsio.com/products_10g_adapters.html http://service.chelsio.com/ <- drivers/etc
I have a friend who does 10GbE iSCSI testing for Broadcom; he recently started testing iSCSI offload for Broadcom/Linux. While the iSCSI (offload) stuff isn't quite so solid yet, the regular 10GbE stuff is. I think for the most part Broadcom stuff is OEM these days.
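Side note, not from the thread: before worrying about full iSCSI offload, ethtool gives a quick view of which generic offloads a given card/driver combination enables (eth2 is a placeholder interface name):

    # list the offload settings the driver exposes for this interface
    ethtool -k eth2

Typical output includes lines like "scatter-gather: on" and "tcp-segmentation-offload: on".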
Intel recently launched a low-power 10GBase-T adapter (2nd or 3rd generation). I haven't heard much about it, but my general experience with Intel GbE NICs has been positive:
http://www.intel.com/Products/Server/Adapters/10GbE_AT/10GbE_AT-overview.htm
nate
On Wed, Jan 13, 2010 at 03:32:05PM -0800, nate wrote:
I have a friend who does 10GbE iSCSI testing for Broadcom; he recently started testing iSCSI offload for Broadcom/Linux. While the iSCSI (offload) stuff isn't quite so solid yet, the regular 10GbE stuff is. I think for the most part Broadcom stuff is OEM these days.
Interesting. Do you know how big the CPU usage difference is, offloaded vs. non-offloaded?
10 Gbit iSCSI is very interesting :)
Intel recently launched a low-power 10GBase-T adapter (2nd or 3rd generation). I haven't heard much about it, but my general experience with Intel GbE NICs has been positive:
http://www.intel.com/Products/Server/Adapters/10GbE_AT/10GbE_AT-overview.htm
This seems to be based on the older 82598 chipset. The 82599 is the newest Intel 10 Gbit chipset.
-- Pasi
On Wed, Jan 13, 2010 at 02:31:09PM -0800, nate wrote:
10GbE is really cheap these days (cheaper than 1GbE in some cases on a per-Gb basis) if you need faster performance, and it's simple to configure. I wrote a blog post on this a couple of months ago:
http://www.techopsguys.com/2009/11/17/affordable-10gbe-has-arrived/
This reminded me of something. I remember reading some website (possibly Cisco's) earlier, and they mentioned 10 GBASE-T had much higher latency than other 10 Gbit options.
Have you paid attention to this? How big is the difference nowadays? Or I wonder if it was just on some specific product..
http://en.wikipedia.org/wiki/10_Gigabit_Ethernet says:
"10GBASE-T has higher latency and consumes more power than other 10 gigabit Ethernet physical layers. In 2008 10GBASE-T silicon is now available from several manufacturers with claimed power dissipation of 6W and a latency approaching 1 microsecond"
1 microsecond doesn't sound bad.. :)
-- Pasi
http://www.bladenetwork.net/userfiles/file/PDFs/WP_10GbE_Cabling_Options_091...
That PDF claims this:
10GBase-T:
- latency 2.6 us
- power per port: 4-6W/port
- price per port: $400
- max distance: 100m

10 Gbit SFP+:
- latency 0.3 us
- power per port: 1.5W
- price per port: $40
- max distance: 8.5m
-- Pasi
And well, both of those latencies are much better than with gigabit Ethernet, so I guess one shouldn't pay too much attention to that.
-- Pasi
Pasi Kärkkäinen wrote:
10GBase-T:
- latency 2.6 us
- power per port: 4-6W/port
With the right gear this is much lower. There's only one switch on the market that is this good though, the one mentioned in my blog. I'm sure others will follow at some point with the same or a similar chipset; that vendor OEMs all their network chipsets these days, so others have access to them too.
10 Gbit SFP+:
- latency 0.3 us
- power per port: 1.5W
- price per port: $40
They've obviously left out the cost of the GBIC.. which is typically several hundred $/port
nate
On Thu, Jan 14, 2010 at 06:52:01AM -0800, nate wrote:
With the right gear this is much lower. There's only one switch on the market that is this good though, the one mentioned in my blog. I'm sure others will follow at some point with the same or a similar chipset; that vendor OEMs all their network chipsets these days, so others have access to them too.
Yeah.. I'm sure this market will develop quickly now.
They've obviously left out the cost of the GBIC.. which is typically several hundred $/port
Yep :)
-- Pasi
On Thursday 14 January 2010, nate wrote:
They've obviously left out the cost of the GBIC.. which is typically several hundred $/port
You don't have to buy GBICs; you can run direct-attached copper or EOE cables.
/Peter
Pasi Kärkkäinen wrote:
Have you paid attention to this? How big is the difference nowadays? Or I wonder if it was just on some specific product..
Depends on the product; my blog mentions a new product that draws less power on 10GBase-T than on fiber.
As for latency, I'm sure it's a bit more, but for most applications are you really going to be able to tell a difference? I can understand if you're doing stuff like RDMA, but for normal networking, NFS, iSCSI, virtualization, etc., I really can't imagine anyone being able to tell a difference, especially for those currently running 1GbE.
The latency on my storage systems is measured in milliseconds, not microseconds.
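To put rough numbers on that (the 5 ms storage round trip is an assumed figure for illustration, not something measured in this thread):

    2.6 us - 0.3 us = 2.3 us extra latency for 10GBase-T vs SFP+
    2.3 us / 5000 us ~= 0.05% of a single 5 ms storage round trip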
The biggest knock on 10GBase-T was that it was late to the 10GbE party.
nate
On Thu, Jan 14, 2010 at 06:49:07AM -0800, nate wrote:
Depends on the product; my blog mentions a new product that draws less power on 10GBase-T than on fiber.
As for latency, I'm sure it's a bit more, but for most applications are you really going to be able to tell a difference? I can understand if you're doing stuff like RDMA, but for normal networking, NFS, iSCSI, virtualization, etc., I really can't imagine anyone being able to tell a difference, especially for those currently running 1GbE.
Exactly. Gigabit Ethernet is already enough for many systems today, and even there the bottleneck is often the disks, not the network.
The latency on my storage systems is measured in milliseconds, not microseconds.
Yep. It might only be important for RDMA stuff.
The biggest knock on 10GBase-T was that it was late to the 10GbE party.
Then again, 10G isn't very widely deployed yet (in the datacenter), so 10GBase-T will definitely have its place.
-- Pasi