We have a single 3GHz P4 box w/2GB RAM running CentOS 3.8, acting as a gateway, which serves multiple IP addresses, with one virtual interface for each IP, e.g., eth0:1, eth0:2, etc. These interfaces/IPs are on the public internet. Each of these IP addresses is the NAT address for a different small LAN. All of these LANs are connected through a single Linksys 100Mb switch to eth1 on the gateway. Thus, in case it's not obvious from that description, traffic from LAN X travels through the switch to eth1 on the gateway, where iptables translates it to the IP address of eth0:X and thence out to the net.
The gateway is totally idle except for handling these NATs; no other processes except the usual OS bookkeeping. All NIC and switch hardware involved is 100Mb.
This all works, but we're experiencing network congestion somewhere. The LANs appear to become saturated when only about 10Mb/s of total traffic is passing through the public IPs. That is, we seem to be losing almost 90% of our capacity somewhere in the translation.
Before we attempt to sweep this under the rug by using Gb NICs/switches for the LANs, we'd like to understand what's going on. I can't find any recent statistics for Linux NAT performance, but the older stuff I can find (e.g. 50k packets/sec for a P3-450MHz) seems to indicate that the gateway should easily be up to the task of handling the NAT traffic. Am I wrong about this? Is there any way to diagnose whether the NAT is the bottleneck? Would we benefit from upgrading to a newer CentOS (2.6 kernel as opposed to 2.4)? Or is it more likely to be the switch, in which case what would be a recommended replacement for the Linksys?
I can provide more details in private mail if necessary. Thanks in advance for any ideas.
On Sat, 8 Sep 2007, Bart Schaefer wrote:
We have a single 3GHz P4 box w/2GB RAM running CentOS 3.8, acting as a gateway, which serves multiple IP addresses, with one virtual interface for each IP, e.g., eth0:1, eth0:2, etc. These interfaces/IPs are on the public internet. Each of these IP addresses is the NAT address for a different small LAN. All of these LANs are connected through a single Linksys 100Mb switch to eth1 on the gateway. Thus, in case it's not obvious from that description, traffic from LAN X travels through the switch to eth1 on the gateway, where iptables translates it to the IP address of eth0:X and thence out to the net.
The gateway is totally idle except for handling these NATs; no other processes except the usual OS bookkeeping. All NIC and switch hardware involved is 100Mb.
This all works, but we're experiencing network congestion somewhere. The LANs appear to become saturated when only about 10Mb/s of total traffic is passing through the public IPs. That is, we seem to be losing almost 90% of our capacity somewhere in the translation.
Before we attempt to sweep this under the rug by using Gb NICs/switches for the LANs, we'd like to understand what's going on. I can't find any recent statistics for Linux NAT performance, but the older stuff I can find (e.g. 50k packets/sec for a P3-450MHz) seems to indicate that the gateway should easily be up to the task of handling the NAT traffic. Am I wrong about this? Is there any way to diagnose whether the NAT is the bottleneck? Would we benefit from upgrading to a newer CentOS (2.6 kernel as opposed to 2.4)? Or is it more likely to be the switch, in which case what would be a recommended replacement for the Linksys?
Have you checked speed and duplex settings? If you want to make sure that your CentOS 3 is not the bottleneck, there are CentOS 4 and CentOS 5 Live CDs you could test.
Barry
On 9/8/07, Barry Brimer lists@brimer.org wrote:
Have you checked speed and duplex settings?
All NICs on all machines involved report exactly the same:
negotiated 100baseTx-FD flow-control, link ok
We've also checked ifconfig on all interfaces, and no errors, dropped packets, overruns, or collisions have been recorded.
All NICs on all machines involved report exactly the same:
negotiated 100baseTx-FD flow-control, link ok
We've also checked ifconfig on all interfaces, and no errors, dropped packets, overruns, or collisions have been recorded.
Great! Is your upstream device also able to talk at 100 Mb/s?
You might also try running some point to point bandwidth tests with iperf http://dast.nlanr.net/Projects/Iperf/ to try and isolate which machine(s) are having issues. ntop http://www.ntop.org and iptraf (included in CentOS) can also be useful to monitor bandwidth/throughput as well.
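For example, with the iperf package installed on two hosts (a rough sketch; host names and the test duration are placeholders, adjust to your setup):

  iperf -s                        (on the receiving machine)
  iperf -c <receiver-ip> -t 30    (on the sending machine, 30-second test)

Running it once LAN-to-LAN and once LAN-to-gateway should show which hop is slow.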
Barry
Bart Schaefer wrote:
On 9/8/07, Barry Brimer lists@brimer.org wrote:
Have you checked speed and duplex settings?
All NICs on all machines involved report exactly the same:
negotiated 100baseTx-FD flow-control, link ok
We've also checked ifconfig on all interfaces, and no errors, dropped packets, overruns, or collisions have been recorded.
You might try *not* auto-negotiating as described here: http://mark.foster.cc/wiki/index.php/Ethtool
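For instance, to lock a card at 100/full (just a sketch; the interface name is a guess, and whatever sits at the other end of the cable must be set to match or you'll create a duplex mismatch):

  ethtool -s eth0 speed 100 duplex full autoneg off
  ethtool eth0

The second command just reports the settings back so you can confirm they took.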
You said that box only performs routing functions, so... it's not a CentOS-related item, but maybe you should consider purchasing and learning to manage a pfSense appliance[1]. It's simply wonderful.
Bart Schaefer wrote:
We have a single 3GHz P4 box w/2GB RAM running CentOS 3.8, acting as a gateway, which serves multiple IP addresses, with one virtual interface for each IP, e.g., eth0:1, eth0:2, etc. These interfaces/IPs are on the public internet. Each of these IP addresses is the NAT address for a different small LAN. All of these LANs are connected through a single Linksys 100Mb switch to eth1 on the gateway. Thus, in case it's not obvious from that description, traffic from LAN X travels through the switch to eth1 on the gateway, where iptables translates it to the IP address of eth0:X and thence out to the net.
The gateway is totally idle except for handling these NATs; no other processes except the usual OS bookkeeping. All NIC and switch hardware involved is 100Mb.
This all works, but we're experiencing network congestion somewhere. The LANs appear to become saturated when only about 10Mb/s of total traffic is passing through the public IPs. That is, we seem to be losing almost 90% of our capacity somewhere in the translation.
Before we attempt to sweep this under the rug by using Gb NICs/switches for the LANs, we'd like to understand what's going on. I can't find any recent statistics for Linux NAT performance, but the older stuff I can find (e.g. 50k packets/sec for a P3-450MHz) seems to indicate that the gateway should easily be up to the task of handling the NAT traffic. Am I wrong about this? Is there any way to diagnose whether the NAT is the bottleneck? Would we benefit from upgrading to a newer CentOS (2.6 kernel as opposed to 2.4)? Or is it more likely to be the switch, in which case what would be a recommended replacement for the Linksys?
I can provide more details in private mail if necessary. Thanks in advance for any ideas.
The setup is more than capable of running 100Mbps full-out routing and NATing.
Has the Internet interface reached its max capacity?
10Mbps is a lot of traffic on even a FIOS connection.
Or are you saying that LAN-to-LAN traffic maxes out at 10Mbps? It is a little vague.
-Ross
On 9/8/07, Ross S. W. Walker rwalker@medallion.com wrote:
Has the Internet interface reached its max capacity?
No.
Or are you saying that LAN-to-LAN traffic maxes out at 10Mbps? It is a little vague.
LAN-to-gateway traffic (e.g., a test FTP of a large file from the gateway to a machine on one of the LANs) begins to degrade as the LAN-to-internet traffic increases. That's not surprising, but it degrades disproportionately: when the FTP begins to show intermittent stalls, the total traffic visible at the router on the internet side of the gateway is only in the just-over-10Mb/s range.
Once we get to this point, no matter how many more LAN-to-internet connections become active, the router on the internet side never sees much over 10Mb/s of traffic. We're not losing data or having an unusual number of connection timeouts; each connection just slows down. We figured on some slowdown for NAT, but not 80%+.
LAN-to-LAN traffic that doesn't involve the gateway behaves more like we'd expect, but I'm not sure that eliminates the switch as the culprit.
LAN-to-gateway traffic (e.g., a test FTP of a large file from the gateway to a machine on one of the LANs) begins to degrade as the LAN-to-internet traffic increases. That's not surprising, but it degrades disproportionately: when the FTP begins to show intermittent stalls, the total traffic visible at the router on the internet side of the gateway is only in the just-over-10Mb/s range.
Once we get to this point, no matter how many more LAN-to-internet connections become active, the router on the internet side never sees much over 10Mb/s of traffic. We're not losing data or having an unusual number of connection timeouts; each connection just slows down. We figured on some slowdown for NAT, but not 80%+.
LAN-to-LAN traffic that doesn't involve the gateway behaves more like we'd expect, but I'm not sure that eliminates the switch as the culprit.
Maybe it is time for some kernel networking tuning.
This will definitely require more memory, but should speed things up. This is on a CentOS 4 machine... I don't have a CentOS 3 machine to test on.
Add the following lines to /etc/sysctl.conf
net.core.rmem_default = 67108864
net.core.wmem_default = 67108864
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_mem = 4096 67108864 67108864
net.ipv4.tcp_rmem = 4096 67108864 67108864
net.ipv4.tcp_wmem = 4096 67108864 67108864
net.ipv4.ip_local_port_range = 32768 65535
net.ipv4.tcp_max_syn_backlog = 8192
After adding these lines, run "sysctl -p"
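You can spot-check that the new values actually took effect with something like:

  sysctl net.core.rmem_max net.ipv4.tcp_rmem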
Hope this helps.
Barry
On 9/9/07, Barry Brimer lists@brimer.org wrote:
Maybe it is time for some kernel networking tuning.
Add the following lines to /etc/sysctl.conf
Thanks, will try. Question: Why does ip_local_port_range matter?
On 9/9/07, Barry Brimer lists@brimer.org wrote:
Maybe it is time for some kernel networking tuning.
After doing a bit of research:
http://www.acc.umu.se/~maswan/linux-netperf.txt
http://wwwx.cs.unc.edu/~sparkst/howto/network_tuning.php
http://proj.sunet.se/E2E/tcptune.html
http://www.linuxguruz.com/iptables/howto/2.4routing-13.html
http://www-didc.lbl.gov/TCP-tuning/linux.html
I ended up with this:
net.core.rmem_default = 873800
net.core.wmem_default = 655360
net.core.rmem_max = 8738000
net.core.wmem_max = 6553600
net.ipv4.tcp_rmem = 8192 873800 8738000
net.ipv4.tcp_wmem = 4096 655360 6553600
net.ipv4.tcp_mem = 195584 873800 8738000
(The first number in tcp_mem is the original default.)
Plus:
ifconfig eth0 txqueuelen 1000
ifconfig eth1 txqueuelen 1000
Unfortunately so far this doesn't seem to have made any difference. We've had a load peak going since early this morning and the traffic looks exactly like it did last week.
Bart Schaefer wrote:
On 9/9/07, Barry Brimer lists@brimer.org wrote:
Maybe it is time for some kernel networking tuning.
After doing a bit of research:
http://www.acc.umu.se/~maswan/linux-netperf.txt
http://wwwx.cs.unc.edu/~sparkst/howto/network_tuning.php
http://proj.sunet.se/E2E/tcptune.html
http://www.linuxguruz.com/iptables/howto/2.4routing-13.html
http://www-didc.lbl.gov/TCP-tuning/linux.html
I ended up with this:
net.core.rmem_default = 873800
net.core.wmem_default = 655360
net.core.rmem_max = 8738000
net.core.wmem_max = 6553600
net.ipv4.tcp_rmem = 8192 873800 8738000
net.ipv4.tcp_wmem = 4096 655360 6553600
net.ipv4.tcp_mem = 195584 873800 8738000
(The first number in tcp_mem is the original default.)
Plus:
ifconfig eth0 txqueuelen 1000
ifconfig eth1 txqueuelen 1000
Unfortunately so far this doesn't seem to have made any difference. We've had a load peak going since early this morning and the traffic looks exactly like it did last week.
The only way you're going to know for absolute sure where the bottleneck exists is to do a wireshark/tcpdump trace simultaneously on both sides.
Then, with that information, you will know where the bottleneck is, and armed with that you can start exploring why there is a bottleneck there.
Off the top of my head, there could be an IP MTU mismatch somewhere, and with ICMP disabled this would cause a blackhole for some full-packet traffic.
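A quick way to test for that from a machine behind the NAT (a sketch; the remote host is a placeholder) is to send full-size packets with the don't-fragment bit set:

  ping -M do -s 1472 <remote-host>

If ordinary pings get through but these stall or report "Frag needed", something on the path is eating full-size packets (1472 bytes of payload + 28 bytes of headers = a 1500-byte packet).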
-Ross
Speaking of MTU mismatches, don't forget to adjust your MTU if you're using a PPPoE DSL line.
Geoff
Sent from my BlackBerry wireless handheld.
Bart Schaefer wrote:
Or are you saying that LAN-to-LAN traffic maxes out at 10Mbps? It is a little vague.
LAN-to-gateway traffic (e.g., a test FTP of a large file from the gateway to a machine on one of the LANs) begins to degrade as the LAN-to-internet traffic increases. That's not surprising, but it degrades disproportionately: when the FTP begins to show intermittent stalls, the total traffic visible at the router on the internet side of the gateway is only in the just-over-10Mb/s range.
Once we get to this point, no matter how many more LAN-to-internet connections become active, the router on the internet side never sees much over 10Mb/s of traffic. We're not losing data or having an unusual number of connection timeouts; each connection just slows down. We figured on some slowdown for NAT, but not 80%+.
LAN-to-LAN traffic that doesn't involve the gateway behaves more like we'd expect, but I'm not sure that eliminates the switch as the culprit.
How much 'other stuff' is happening on these networks (either side) that might be passed by the switches? It's a long shot, but if you've assigned multiple IP addresses to the interface, the card is probably going into promiscuous mode to accept them all, and then there will be interrupts and a small amount of CPU work to discard the ones you don't need. It might be worth firing up something like ntop for a while to categorize what's really going by - and you might find something like a virus trying to make connections as fast as it can.
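To check whether the card actually went promiscuous, and to point ntop at the inside interface, something along these lines should work (the interface names are guesses for your box):

  ip link show eth0 | grep -i promisc
  ntop -i eth1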
We have a single 3GHz P4 box w/2GB RAM running CentOS 3.8, acting as a gateway, which serves multiple IP addresses, with one virtual interface for each IP, e.g., eth0:1, eth0:2, etc. These interfaces/IPs are on the public internet. Each of these IP addresses is the NAT address for a different small LAN. All of these LANs are connected through a single Linksys 100Mb switch to eth1 on the gateway. Thus, in case it's not obvious from that description, traffic from LAN X travels through the switch to eth1 on the gateway, where iptables translates it to the IP address of eth0:X and thence out to the net.
The gateway is totally idle except for handling these NATs; no other processes except the usual OS bookkeeping. All NIC and switch hardware involved is 100Mb.
This all works, but we're experiencing network congestion somewhere. The LANs appear to become saturated when only about 10Mb/s of total traffic is passing through the public IPs. That is, we seem to be losing almost 90% of our capacity somewhere in the translation.
Before we attempt to sweep this under the rug by using Gb NICs/switches for the LANs, we'd like to understand what's going on. I can't find any recent statistics for Linux NAT performance, but the older stuff I can find (e.g. 50k packets/sec for a P3-450MHz) seems to indicate that the gateway should easily be up to the task of handling the NAT traffic. Am I wrong about this? Is there any way to diagnose whether the NAT is the bottleneck? Would we benefit from upgrading to a newer CentOS (2.6 kernel as opposed to 2.4)? Or is it more likely to be the switch, in which case what would be a recommended replacement for the Linksys?
I can provide more details in private mail if necessary. Thanks in advance for any ideas.
What switch is it?
Evidently, there must be a switch on the virtualized eth0:x side too... are you in control of that?
What kind is it?
Are you aggregating your upstreams on one Ethernet link? Can you separate them out with individual physical Ethernet interfaces?
- rh
On 9/9/07, Robert - elists lists07@abbacomm.net wrote:
What switch is it?
LinkSys Etherfast, a couple of years old now (I'd have to go to our colocation site to look in the cabinet to get the exact model). It's a plain dumb switch, no management interface.
Evidently, there must be a switch on the virtualized eth0:x side too... are you in control of that?
The other side is a high-end Cisco router managed by our ISP. It's their router statistics that tell us we're peaking at just over 10Mb/s coming out of the gateway box. That was where we first assumed the problem must be, so we've been working with them on this problem for some while now and have pretty definitely eliminated their equipment as the bottleneck.
Are you aggregating your upstreams on one Ethernet link?
Yes, but our bandwidth needs are well below the capacity of that link. It's just that at peak times we need more than is making it through.
The other side is a high-end Cisco router managed by our ISP. It's their router statistics that tell us we're peaking at just over 10Mb/s coming out of the gateway box. That was where we first assumed the problem must be, so we've been working with them on this problem for some while now and have pretty definitely eliminated their equipment as the bottleneck.
What is the speed of the link between you and the ISP?
Do they have other customer sites that are set up the same way as yours that get significantly better performance?
On 9/9/07, Barry Brimer lists@brimer.org wrote:
What is the speed of the link between you and the ISP?
100Mb/s.
Do they have other customer sites that are set up the same way as yours that get significantly better performance?
They don't have any other sites set up this way to compare.
Bart Schaefer wrote:
On 9/9/07, Barry Brimer lists@brimer.org wrote:
What is the speed of the link between you and the ISP?
100Mb/s.
Do they have other customer sites that are set up the same way as yours that get significantly better performance?
They don't have any other sites set up this way to compare.
I would then suggest getting a wireshark setup to monitor the traffic as it passes through the gateway.
If the gateway host doesn't have a GUI, then I suggest another host with 2 interfaces, one on each side of the NAT, and compare the timestamps of traffic coming in from the Internet to the NAT, out of the NAT to the client, in to the NAT from the client, and out of the NAT to the Internet.
Looking at how fast traffic goes out to and is ack'd by the remote site can tell you exactly where the bottleneck is.
-Ross
Bart Schaefer wrote:
I can't find any recent statistics for Linux NAT performance, but the older stuff I can find (e.g. 50k packets/sec for a P3-450MHz) seems to indicate that the gateway should easily be up to the task of handling the NAT traffic. Am I wrong about this? Is there any way to diagnose whether the NAT is the bottleneck? Would we benefit from upgrading to a newer CentOS (2.6 kernel as opposed to 2.4)? Or is it more likely to be the switch, in which case what would be a recommended replacement for the Linksys?
Bart, how many connections are on the router (/proc/net/ip_conntrack)? And what's the /proc/sys/net/ipv4/ip_conntrack_max?
David
On 9/9/07, David Hrbáč hrbac.conf@seznam.cz wrote:
how many connections are on the router (/proc/net/ip_conntrack) ?
This is way off-peak time for us (middle of Sunday night PDT) so I suspect looking at this right now is not very useful, but:
# cat /proc/net/ip_conntrack | wc -l
15140
# cat /proc/net/ip_conntrack | fgrep -v UNREPLIED | wc -l
586
what's the /proc/sys/net/ipv4/ip_conntrack_max
# cat /proc/sys/net/ipv4/ip_conntrack_max
65536
Bart Schaefer wrote:
This is way off-peak time for us (middle of Sunday night PDT) so I suspect looking at this right now is not very useful, but:
Well, it's really way off now. I dare say it's conntrack anyway. If there are clients behind the NAT using P2P... then one client can have thousands or even tens of thousands of connections. Please do report during peak and network-issue time.
David
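(A rough way to see which client behind the NAT holds the most entries, assuming your grep supports -o; each connection shows up in both directions in that file, so treat the counts as relative only:

  grep -o 'src=[0-9.]*' /proc/net/ip_conntrack | sort | uniq -c | sort -rn | head
)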
On 9/9/07, David Hrbáč hrbac.conf@seznam.cz wrote:
Bart Schaefer napsal(a):
This is way off-peak time for us (middle of Sunday night PDT) so I suspect looking at this right now is not very useful, but:
Please do report during peak and network-issue time.
We're having a spike right now. Doesn't look much different, though:
# wc -l /proc/net/ip_conntrack
17141 /proc/net/ip_conntrack
# fgrep -cv UNRE /proc/net/ip_conntrack
1310
Bart Schaefer wrote:
We're having a spike right now. Doesn't look much different, though:
# wc -l /proc/net/ip_conntrack
17141 /proc/net/ip_conntrack
# fgrep -cv UNRE /proc/net/ip_conntrack
1310
What are the upstream link parameters (type, up, down, ...)? And what's the ping to the gateway, both normal and with a 1K payload?
David
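(Something like this from a client behind the NAT gives both numbers; the gateway address is a placeholder:

  ping -c 20 <gateway-ip>
  ping -c 20 -s 1024 <gateway-ip>
)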
Bart Schaefer wrote:
On 9/9/07, David Hrbáč hrbac.conf@seznam.cz wrote:
how many connections are on the router (/proc/net/ip_conntrack) ?
This is way off-peak time for us (middle of Sunday night PDT) so I suspect looking at this right now is not very useful, but:
# cat /proc/net/ip_conntrack | wc -l
15140
# cat /proc/net/ip_conntrack | fgrep -v UNREPLIED | wc -l
586
what's the /proc/sys/net/ipv4/ip_conntrack_max
# cat /proc/sys/net/ipv4/ip_conntrack_max
65536
On top of that, I'd say that a PC, with whatever processor you could put in it, is able to service only a certain number of interrupts per second. Sometimes you can also have cards / integrated peripherals that share IRQs and have trouble with it. So in the case of a PC router, I'd go into the BIOS setup and disable all the integrated peripherals you don't use (LPT port, integrated sound card, etc.). Maybe you already did this, I don't know.
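A quick way to see whether the NICs share an interrupt line with anything else (just a sketch):

  grep -E 'eth|usb' /proc/interrupts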
There are some ways to improve performance, like what Cisco does: having line cards do the processing, getting pointers from the main supervisor card and dealing with traffic locally afterward. In our PC case here, this could translate into using at least TCP offloading and flow control (Ethernet level). Also, consider that not all Ethernet cards are equal and that using 802.1Q (trunking) also changes the game. Good cards have features to deal with all this.
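On Linux the per-card offload and pause-frame (flow control) settings can be inspected with ethtool; a sketch, keeping in mind that which options actually exist depends on the driver:

  ethtool -k eth0          (show checksum/segmentation offload state)
  ethtool -a eth0          (show pause / flow-control settings)
  ethtool -K eth0 tso on   (enable TCP segmentation offload, if the driver supports it)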
Somebody mentioned pfSense. I use it, and there is an option that can boost performance: using device polling instead of relying on interrupts generated by the cards. I don't know if CentOS has this kind of option; the Ethernet gurus of this list could provide important information on that.
Hope this helped.
Guy Boisvert, ing. IngTegration inc.
Guy Boisvert wrote:
On top of that, I'd say that a PC, with whatever processor you could put in it, is able to service only a certain number of interrupts per second.
[Snip...]
Somebody mentioned pfSense. I use it, and there is an option that can boost performance: using device polling instead of relying on interrupts generated by the cards. I don't know if CentOS has this kind of option; the Ethernet gurus of this list could provide important information on that.
Hope this helped.
Guy Boisvert, ing.
I don't like replying to myself, but this paper could give you an idea of what's going on inside Linux:
http://www.ist-scampi.org/events/workshop-2004/deri.pdf
Regards,
Guy Boisvert, ing. IngTegration inc.
On 9/10/07, Guy Boisvert boisvert.guy@videotron.ca wrote:
On top of that, I'd say that a PC, with whatever processor you could put in it, is able to service only a certain number of interrupts per second.
# cat /proc/interrupts
           CPU0       CPU1
  0:   35564628 1398173774    IO-APIC-edge   timer
  1:          3          0    IO-APIC-edge   keyboard
  2:          0          0          XT-PIC   cascade
  8:          1          0    IO-APIC-edge   rtc
 14:    9807057   17139257    IO-APIC-edge   ide0
 15:          2          0    IO-APIC-edge   ide1
 16:          0          0   IO-APIC-level   usb-uhci
 18:  820860470          0   IO-APIC-level   eth0
 19:          8 2493787389   IO-APIC-level   usb-uhci, eth1
 23:          0          0   IO-APIC-level   ehci-hcd
NMI:          0          0
LOC: 1433915007 1433915017
ERR:          0
MIS:          0
On 9/10/07, Bart Schaefer barton.schaefer@gmail.com wrote:
On 9/10/07, Guy Boisvert boisvert.guy@videotron.ca wrote:
On top of that, I'd say that a PC, with whatever processor you could put in it, is able to service only a certain number of interrupts per second.
# cat /proc/interrupts
Ok, so obviously just that snapshot wasn't very useful, sorry.
Several minutes of "vmstat 2" output indicates that the number of interrupts per second on the NAT gateway ranges between 600 and 1200, occasionally as low as 540 or as high as 1300.
(Working on getting wireshark in a place where we can monitor both ends.)
Bart Schaefer wrote:
On 9/10/07, Bart Schaefer barton.schaefer@gmail.com wrote:
On 9/10/07, Guy Boisvert boisvert.guy@videotron.ca wrote:
On top of that, I'd say that a PC, with whatever processor you could put in it, is able to service only a certain number of interrupts per second.
# cat /proc/interrupts
Ok, so obviously just that snapshot wasn't very useful, sorry.
Several minutes of "vmstat 2" output indicates that the number of interrupts per second on the NAT gateway ranges between 600 and 1200, occasionally as low as 540 or as high as 1300.
(Working on getting wireshark in a place where we can monitor both ends.)
Wireshark can process and display packet capture files from tcpdump -w filename, or from most any other packet capture utility, as well as capture data itself.
Capture a few megabytes of packets on the appropriate interface of the firewall, then transfer them to a workstation with Wireshark for analysis.
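For example, something like this captures full packets on the inside interface and stops after 20,000 of them (the interface name and count are just placeholders):

  tcpdump -i eth1 -s 0 -w /tmp/eth1.pcap -c 20000

Then copy /tmp/eth1.pcap to the workstation and open it in Wireshark.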
On 9/10/07, John R Pierce pierce@hogranch.com wrote:
wireshark can process and display packet capture files from tcpdump -w
capture a few megabytes of packets on the appropriate interface of the firewall, then transfer them to a workstation with Wireshark for analysis.
OK, I've got some output from "tcpdump -w any" but I don't know precisely what I'm looking for. (I'd be happy to take this off-list.) I notice that just over 1/3 of the packets are TCP out-of-order segments and about 4% are duplicate ACKs.
We also dumped eth0 and eth1 separately. Statistics on the "any" output show 26Mb/s, but eth0 and eth1 independently are only 10Mb/s each.
By the way, those interrupts/sec numbers in my earlier message were off; I chose a bad moment to look at it, when the peak had subsided. At peak it's more like 2500-3000 interrupts/sec, sometimes as high as 3500.
How about putting the file contents on pastebin and posting the link?
Geoff
Sent from my BlackBerry wireless handheld.
On 9/10/07, gjgowey@tmo.blackberry.net gjgowey@tmo.blackberry.net wrote:
How about putting the file contents on pastebin and posting the link?
Unfortunately there's customer data in there that I'm not at liberty to make public.
If you feel like learning sed ;) you can use it to filter out that data using regexps and have it create a new file that can be publicly posted.
Geoff
Sent from my BlackBerry wireless handheld.
On 9/10/07, gjgowey@tmo.blackberry.net gjgowey@tmo.blackberry.net wrote:
If you feel like learning sed ;)
I suspect I've been scripting sed since you were about 7 years old. :-) I don't think even recent GNU sed is going to handle tcpdump output very well.
Scripting in sed for 20+ years? Masochist! :-)
Geoff
Sent from my BlackBerry wireless handheld.
Bart Schaefer wrote:
On 9/10/07, John R Pierce pierce@hogranch.com wrote:
wireshark can process and display packet capture files from tcpdump -w
capture a few megabytes of packets on the appropriate interface of the firewall, then transfer them to a workstation with Wireshark for analysis.
OK, I've got some output from "tcpdump -w any" but I don't know precisely what I'm looking for. (I'd be happy to take this off-list.) I notice that just over 1/3 of the packets are TCP out-of-order segments and about 4% are duplicate ACKs.
We also dumped eth0 and eth1 separately. Statistics on the "any" output show 26Mb/s, but eth0 and eth1 independently are only 10Mb/s each.
By the way, those interrupts/sec numbers in my earlier message were off; I chose a bad moment to look at it, when the peak had subsided. At peak it's more like 2500-3000 interrupts/sec, sometimes as high as 3500.
int/sec is fine for your hardware.
Try a tcpdump of both the external and internal interfaces at the same time. Try to focus on one prototypical stream of traffic from a known host (like your own) to a known destination, from connection open to connection close.
Then open up the dump in Wireshark and look at the timestamps and whether there are any resends with smaller MTUs and such.
You want to see if there is a large delay between sent packets and ACKs.
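A sketch of that, limited to a single test client so the traces stay small (the addresses are placeholders):

  tcpdump -i eth1 -s 0 -w /tmp/inside.pcap host <lan-client-ip> &
  tcpdump -i eth0 -s 0 -w /tmp/outside.pcap host <that-client's-public-nat-ip> &

Run the test transfer from that client, stop both captures, and line the two files up in Wireshark by timestamp.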
-Ross
http://www.vyatta.com/download/ - runs on plain old PC hardware and it's touted as being a Cisco beater.