OT: CentOS server with 2 GbE links to 2 GbE switches


Patrick

26 Aug 2005

1:36 p.m.

Hi all,

I am trying to come up with an architecture that has some redundancy. The idea is to hook up the two GbE LAN interfaces of a CentOS server to two Gigabit Ethernet switches. In case one switch goes down, there is a redundant path (the server is redundant too). Here is the idea:

                            -----------
                           |    GbE    |
PCs            ------------|  switch   |------------
 |            |             -----------             |
 |            |                                     |
 |    ------------------                    -----------------
 |---| Workgroup Switch |                  | CentOS/Asterisk |
 |    ------------------                    -----------------
 |            |                                     |
 |            |             -----------             |
VoIP           ------------|    GbE    |------------
Phones                     |  switch   |
                            -----------

How would I accomplish this? Can I assign IP addresses from one IP network (say 10.0.0.0/24) to the 2 LAN ports on the CentOS server and to a port on each of the GbE switches, and then use something like OSPF on the switches and the CentOS box to do the routing? Any other ideas?

Many thanks for your suggestions.

Regards, Patrick


Chris Mauritz

26 Aug 2005

1:56 p.m.

Why don't you just bond the two interfaces? An alternative would be to use one interface as a primary link and periodically check its "heartbeat". If the interface goes down, you just add a route to your default gateway via the other interface. Bonding the 2 channels is probably the easier method, though.
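The heartbeat alternative can be sketched roughly as follows. This is only a minimal illustration of the decision logic, not a tested failover tool: the interface names `eth0`/`eth1`, the threshold of three missed checks, and the `LinkMonitor` class itself are assumptions made for the example. The actual heartbeat probe (for instance a ping to the gateway) and the actual route change (for instance via `ip route replace default ...`) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class LinkMonitor:
    """Tracks heartbeat results for the primary link and picks the active interface."""

    primary: str = "eth0"   # hypothetical interface names
    backup: str = "eth1"
    max_missed: int = 3     # consecutive failed checks before failing over (assumed value)
    missed: int = 0
    active: str = ""

    def __post_init__(self) -> None:
        self.active = self.primary

    def record(self, heartbeat_ok: bool) -> str:
        """Feed one heartbeat result; return the interface that should carry the default route."""
        if heartbeat_ok:
            self.missed = 0
            self.active = self.primary      # fail back once the primary answers again
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.active = self.backup   # too many misses: move the route to the backup
        return self.active


monitor = LinkMonitor()
history = [monitor.record(ok) for ok in (True, False, False, False, True)]
print(history)
```

Requiring several consecutive misses before switching keeps a single lost probe from flapping the route, at the cost of a longer detection time.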

Cheers,

Bryan J. Smith

3:36 p.m.

Patrick centos@puzzled.xs4all.nl wrote:

> I am trying to come up with an architecture that has some redundancy. The idea is to hook up the two GbE LAN interfaces of a CentOS server to two Gigabit Ethernet switches. In case one switch goes down, there is a redundant path (the server is redundant too). How would I accomplish this?

First off, doing it at the layer-3/IP level with dynamic routes is far more overhead than is required. In your case, you're just looking for layer-2/802 level. So leverage what standard 802 offers if you can.

I'm more of an academic, so the first thing I recommend to people is that they get familiar with the standard capabilities of 802. More explicitly, research 802.1d Spanning Tree Protocol (STP) as well as newer standards like 802.3ad Link Aggregation. In fact, it's this latter addition that really makes things very easy.

In the "good old days," you'd set up a single virtual UNIX interface bridged to two physical ones. Your system only knows about the single virtual interface, but it then leverages the two physical interfaces, only bringing the second one up if the first fails. With support for STP, loops are avoided. The only thing to worry about with STP is the maximum number of hops in a layer-2 network -- 7. This, of course, requires both your host (software) and network stack (firmware) to support STP.

In the "new, better days" we now have 802.3ad Link Aggregation. Now you can get more bandwidth and failover at the same time. Again, both your host (NIC firmware) and network stack (firmware) need to support 802.3ad Link Aggregation. But if they do, it becomes very, very easy to assign a single IP address to a pair of NICs and aggregate both to two different ports in a network stack.
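To make the "single IP address on a pair of NICs" idea concrete, here is a small sketch that renders the kind of files a Red Hat-style system of that era typically uses for a `bond0` device. Treat it as an assumption-laden illustration: the address 10.0.0.10, the interface names, and the helper `render_bonding_config` are made up for the example, and the exact file layout should be checked against the bonding documentation shipped with your kernel. Note also that when the two NICs go to two independent switches, 802.3ad generally needs the switches to present themselves as one aggregation partner; the bonding driver's `active-backup` mode gives failover without any switch-side support.

```python
def render_bonding_config(ip, netmask, slaves, mode="802.3ad", miimon=100):
    """Return {path: file contents} for a bond0 device on a Red Hat-style system."""
    scripts = "/etc/sysconfig/network-scripts"
    files = {
        # Lines to add: load the bonding driver for bond0 and set the mode
        # and the link-check interval in milliseconds.
        "/etc/modprobe.conf": (
            "alias bond0 bonding\n"
            f"options bond0 mode={mode} miimon={miimon}\n"
        ),
        # The bond device carries the single IP address.
        f"{scripts}/ifcfg-bond0": (
            "DEVICE=bond0\n"
            f"IPADDR={ip}\n"
            f"NETMASK={netmask}\n"
            "BOOTPROTO=none\n"
            "ONBOOT=yes\n"
        ),
    }
    # Each physical NIC is enslaved to bond0 and gets no address of its own.
    for nic in slaves:
        files[f"{scripts}/ifcfg-{nic}"] = (
            f"DEVICE={nic}\n"
            "MASTER=bond0\n"
            "SLAVE=yes\n"
            "BOOTPROTO=none\n"
            "ONBOOT=yes\n"
        )
    return files


config = render_bonding_config("10.0.0.10", "255.255.255.0", ["eth0", "eth1"])
for path, body in config.items():
    print(f"# {path}\n{body}")
```

Calling the helper with `mode="active-backup"` produces the failover-only variant.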

Now if you're using cheap network equipment, I don't know what to tell you. Layer-2 is probably out then.

> Can I use IP addresses from one IP network (say 10.0.0.0/24) to assign to the 2 LAN ports on the CentOS server and a port on each of the GbE switches and then use something like OSPF on the switches and the CentOS box to do the routing?

You can maybe use layer-3 hacks and tweaks to deal with things, but it's very difficult to handle the failover without support at the concentrator end. If you go the layer-3 route, your idea to use different IP addresses and dynamic routing tables is probably the best way. But it's definitely not as clean, especially with 802.3ad Link Aggregation being supported more and more.

> Any other ideas?

If you let me know what your networking equipment and/or budgetary constraints are, I can help you further. You'd be surprised how little this actually costs, but if you're using $200 GbE switches, then I can't help you with layer-2.

--
Bryan J. Smith                | Sent from Yahoo Mail
mailto:b.j.smith@ieee.org     | (please excuse any
http://thebs413.blogspot.com/ | missing headers)

Patrick

5:23 p.m.

On Fri, 2005-08-26 at 08:36 -0700, Bryan J. Smith wrote: [snip]

Thanks for the background info. Much appreciated.

> If you let me know what your networking equipment and/or budgetary constraints are, I can help you further. You'd be surprised how little this actually costs, but if you're using $200 GbE switches, then I can't help you with layer-2.

The switches are most likely Cisco kit so I guess it would be a Cisco 3560G-24TS-24 GbE switch which is a relatively new model with current IOS. A quick google shows they support LACP/IEEE 802.3ad so that looks good. The server brand is unknown to me (if that matters) but obviously it needs to have 2x GbE LAN ports. OS is CentOS 4.1. At the end of the day, if Link Aggregation is the way to go, we will make the design use it. It's all IPv4 by the way, no IPv6 support required.

Regards, Patrick

Bryan J. Smith

5:38 p.m.

Patrick centos@puzzled.xs4all.nl wrote:

> A quick google shows they support LACP/IEEE 802.3ad so that looks good.

You're in business out-of-the-box if you're running CentOS4 (kernel 2.6). 802.3ad makes things simple.

> It's all IPv4 by the way, no IPv6 support required.

Shouldn't matter either way. Although IPv6 by default derives the lower 64 bits of its address from the 48-bit layer-2 MAC address (at least for the link-local address, if not the site-local one), that MAC is still provided at layer 2. So the aggregation will take care of that even before the IPv6 address is assigned.
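For readers who want to see how a link-local address follows from a MAC address, here is a minimal sketch of the modified EUI-64 derivation described in RFC 4291, in which the interface identifier fills the lower 64 bits of the address. The MAC address used is an arbitrary example and the helper name `link_local_from_mac` is made up for illustration; with bonding, the MAC in question would be whichever one the bond device presents.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive the fe80::/64 link-local address from a 48-bit MAC (modified EUI-64)."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address such as 00:11:22:33:44:55")
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the MAC to get a 64-bit identifier.
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | int.from_bytes(eui64, "big"))


print(link_local_from_mac("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```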

--
Bryan J. Smith                | Sent from Yahoo Mail
mailto:b.j.smith@ieee.org     | (please excuse any
http://thebs413.blogspot.com/ | missing headers)

rado

4:02 p.m.

Hi Patrick,

I run an HA (High Availability) technique that I developed myself, in which 2 redundant servers sync up about every 15-20 seconds. Basically mine is IP oriented instead of machine oriented: either machine can be the master and will stay the master until the slave deems the master to be having problems and unable to handle the server responsibilities. When the slave decides this, it grabs the roaming IP and turns on the servers, and it is then the master.

If a redundant HA server is machine oriented, that means that, yes, the slave will take over, but as soon as the master comes back online, the master takes back the roaming IP and starts up the servers, and the slave machine assumes slave responsibilities again.
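The difference between the two policies can be shown with a toy simulation. This is not the system described above, just a sketch under simplifying assumptions: two nodes called A and B (hypothetical names), node B always healthy, and a flag named `preempt` (a term borrowed from VRRP/HSRP usage) that selects the machine-oriented behaviour.

```python
def simulate(a_health, preempt):
    """Return which node holds the roaming IP after each health check of node A.

    Node A starts as master; node B is assumed to stay healthy throughout.
    preempt=False models the "IP oriented" policy, preempt=True the "machine oriented" one.
    """
    holder = "A"
    timeline = []
    for a_ok in a_health:
        if holder == "A" and not a_ok:
            holder = "B"   # the standby grabs the roaming IP
        elif holder == "B" and a_ok and preempt:
            holder = "A"   # the preferred machine reclaims the IP once it is back
        timeline.append(holder)
    return timeline


checks = [True, False, False, True, True]   # node A fails, then recovers
ip_oriented = simulate(checks, preempt=False)
machine_oriented = simulate(checks, preempt=True)
print("IP oriented:     ", ip_oriented)
print("machine oriented:", machine_oriented)
```

The IP-oriented run moves the address once and leaves it there, while the machine-oriented run moves it twice, so each recovery of the preferred node costs a second switchover.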

These are just some ideas for what you are looking for. Also, google around using High Availability as a keyword; there is a bunch out there, I think. I am just about at the point of starting to build up my web site, where I will cover my system in depth.

Incidentally, over the last month or so, the main server seems to lose it and decides to reboot, in which case the slave takes over. I have never been around when it actually happened, and sometimes I never even realized it for a day or so. It's kinda seamless and the switch takes about 20 seconds. My point: I do know it works as it should!

John Rose

Patrick

5:27 p.m.

Hi John,

Thanks for your suggestion. The 20 seconds is a bit long for the telco service being out of order but I will further investigate the HA stuff over at linux-ha.org to see if it can be tweaked.

Regards, Patrick

rado

6:20 p.m.

I hear ya Patrick. The only reason it is about 20 secs is because I cannot just flush the cache in this bs Zoom router; I have to reboot it in the code. If I had a router whose cache I could flush, I could do it in about 5-6 secs, which is much better, but, hey, I don't really need it. I don't have bunches goin' on and I just do it to do it.

John

...

CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos

Scot L. Harris

4:04 p.m.

The setup you describe has several single points of failure. Are the GbE switches you are using that fragile and likely to fail?

In the network you describe above the workgroup switch and the Asterisk box are single points of failure. If you want a redundant system then you need to eliminate the single points of failure. You may want to look at using HSRP or VRRP (HSRP is Cisco specific, VRRP is more generic) for HA type network solutions.

For the server you will need to look at cluster solutions.

As others have mentioned you can try bonding the interfaces on the server to provide higher bandwidth but I believe you need to have a switch that understands bonding as well.

When designing for redundancy and high availability, start by identifying the critical parts of your infrastructure and determine the types of disaster you want to protect against as well as the likelihood of each. Many things, while possible, are unlikely or have little or no impact. Concentrate on those things that are likely to happen and have a major impact on your systems.
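One simple way to apply this is to score each failure mode by likelihood times impact and sort. The sketch below shows only the mechanics; every number in it is a made-up placeholder rather than measured data, and the `FailureMode` class and scoring scale are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    likelihood: float  # rough chance per year, 0..1 (placeholder guess)
    impact: int        # 1 (barely noticed) .. 5 (service down)

    @property
    def risk(self) -> float:
        return self.likelihood * self.impact


def rank(modes):
    """Highest risk first: these are the items to design redundancy for first."""
    return sorted(modes, key=lambda m: m.risk, reverse=True)


# Illustrative values only; replace them with your own estimates.
modes = [
    FailureMode("cable unplugged by mistake", 0.10, 2),
    FailureMode("GbE switch failure", 0.05, 5),
    FailureMode("hard disk failure", 0.30, 4),
    FailureMode("power supply failure", 0.20, 5),
]

for mode in rank(modes):
    print(f"{mode.name:28s} risk={mode.risk:.2f}")
```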

And remember that adding more hardware or making your network more complex can sometimes increase the likelihood of a failure causing service interruptions.

Depending on the costs of taking an outage you may be better off having a cold spare handy to replace the switch or device that fails.

Patrick

5:23 p.m.

On Fri, 2005-08-26 at 12:04 -0400, Scot L. Harris wrote:

> The setup you describe has several single points of failure. Are the GbE switches you are using that fragile and likely to fail?
>
> In the network you describe above the workgroup switch and the Asterisk box are single points of failure. If you want a redundant system then you need to eliminate the single points of failure. You may want to look at using HSRP or VRRP (HSRP is Cisco specific, VRRP is more generic) for HA type network solutions.

Yes the workgroup switch is a SPoF but there are cold spares in case the active one blows up. And there is room to add a 2nd workgroup switch and use HSRP to cover that SPoF. Also, there is a second active Asterisk box but for simplicity I left it out of the picture so that's not a SPoF.

> For the server you will need to look at cluster solutions.

Afaik VoIP servers cannot be clustered. The reason being (I think): once a call is active it has a certain path and interaction with opened UDP/RTP ports and with Asterisk on one or more boxes. If that box goes down, the call cannot be rerouted in real time through another Asterisk box in the cluster because the 2nd Asterisk box did not know the call existed in the first place (the SIP call setup part is missing, RTP ports are closed, etc.). I'd love to hear the opposite is true (and some pointers on how to do this :) It might be possible that the Asterisk Realtime Architecture (ARA) can do something to solve this, but I would need to investigate if that's the case.

> As others have mentioned you can try bonding the interfaces on the server to provide higher bandwidth but I believe you need to have a switch that understands bonding as well.

Totally agree. I think they use Cisco kit so I guess it would be a 3560G-24TS which is a relatively new model with current IOS.

> When designing for redundancy and high availability, start by identifying the critical parts of your infrastructure and determine the types of disaster you want to protect against as well as the likelihood of each. Many things, while possible, are unlikely or have little or no impact. Concentrate on those things that are likely to happen and have a major impact on your systems.

Sure. Power supplies, fans and hard disks will all fail at some point and must be available 1+1 and be hot swappable. Then there are cables, Ethernet ports and GBICs in core switches that can fail, so they must also be available in a redundant fashion. Telco interface cards (E1/PRI) can also fail, so they must be redundant too, and there is of course also room in the rack for a few E1/PRI failover switches. On a software level everything is redundant (dns, smtp, www, ntp, syslog, asterisk, postgresql etc.). Afaict these are the failures that are likely to happen, or for which, if they do happen, there had better be redundancy or some critical services go down.

> And remember that adding more hardware or making your network more complex can sometimes increase the likelihood of a failure causing service interruptions.

Agree but sometimes the application requires you to go a long way.

> Depending on the costs of taking an outage you may be better off having a cold spare handy to replace the switch or device that fails.

The organization has simply decided there shall not be an outage of the service (which means individual parts can blow up as long as the service remains up), so the cost of adding redundancy till you drop is not an issue. Obviously, next to the active redundancy, we could always add a few cold spares :)

Thanks for your comments and suggestions.

Regards, Patrick

Scot L. Harris

7:11 p.m.

On Fri, 2005-08-26 at 13:23, Patrick wrote:

> On Fri, 2005-08-26 at 12:04 -0400, Scot L. Harris wrote:
> > Depending on the costs of taking an outage you may be better off having a cold spare handy to replace the switch or device that fails.
>
> The organization has simply decided there shall not be an outage of the service (which means individual parts can blow up as long as the service remains up), so the cost of adding redundancy till you drop is not an issue. Obviously, next to the active redundancy, we could always add a few cold spares :)
>
> Thanks for your comments and suggestions.

That is unusual. :) Most of the time, after designing a gold-plated redundant system with no single points of failure, the customers look at the cost and decide that they don't need things quite that bulletproof. :)

To achieve zero downtime for the service you will need to resolve that clustering issue with the PBX software. As you indicate, that is going to be difficult. The closest I came to something like that was some Checkpoint firewalls I had set up in a VRRP configuration. They shared the tables listing the connections being routed through them, so if one rolled over and the other took over, the connections in theory would not have to be reestablished through the backup firewall.
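The idea of sharing connection tables can be illustrated with a toy model. It is not how Checkpoint or Asterisk actually implement anything; the `Node` and `Pair` classes, the node names and the session IDs are all invented for the example. It only shows why a standby that never saw the session setup has nothing to continue after a failover.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.sessions = {}   # session id -> opaque per-session state


class Pair:
    """An active/standby pair that optionally replicates session state."""

    def __init__(self, replicate):
        self.active = Node("node1")    # hypothetical node names
        self.standby = Node("node2")
        self.replicate = replicate

    def open_session(self, session_id, state):
        self.active.sessions[session_id] = state
        if self.replicate:
            # Copy the state so the standby already knows the session exists.
            self.standby.sessions[session_id] = state

    def fail_over(self):
        """The active node dies; return the sessions the new active node can continue."""
        self.active, self.standby = self.standby, self.active
        return set(self.active.sessions)


for replicate in (True, False):
    pair = Pair(replicate)
    pair.open_session("call-1", {"rtp_port": 10000})
    pair.open_session("call-2", {"rtp_port": 10002})
    print(f"replicate={replicate}: surviving sessions = {sorted(pair.fail_over())}")
```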

Hopefully the Asterisk software has a feature that will handle that for you. The other parts of the network can be built in a redundant mode.

Good luck!

rado

7:27 p.m.

On Fri, 2005-08-26 at 15:11 -0400, Scot L. Harris wrote:

> That is unusual. :) Most of the time, after designing a gold-plated redundant system with no single points of failure, the customers look at the cost and decide that they don't need things quite that bulletproof. :)

Exactly, Scot!!! And then if you want real-time sync?? Ummm, well, you better bring your wallet!

