Joseph L. Casale wrote:
Now it seems there is still one last hurdle: when the connection is yanked to simulate a complete outage, Asterisk still goes down. I can only assume this now happens because there is no default gateway?
Check the logs? Run strace on the process? Run tcpdump on the interface(s) to see what traffic it is trying to transmit?
Would setting up a dummy route for 0.0.0.0/0 to, say, 127.0.0.1 for the internal NIC in /etc/sysconfig/network-scripts/route-eth1, with a metric higher than that of the default gateway handed out by the ISP's DHCP server, possibly cure this? My hope is that when the WAN NIC goes down, a route is still available.
Last I checked, the 'metric' number in the Linux routing table is really only used when you're using a routing daemon. As far as default routes go, it should not have any impact.
I suspect the route is not the issue. I suspect that the app is trying to talk to something external and then fails; it will fail the same way if you point it at a router that goes nowhere.
tcpdump should be able to tell you who the host is trying to talk to. strace might reveal why.
nate
I have almost this same setup running with no problems. Make sure you have only one default gateway on your server, defined on your Internet-facing interface. This should be getting assigned from the DHCP request to your ISP, so make sure you don't have a gateway on your internal interface.
As far as Asterisk crashing, that sounds like an application problem (like Nate said): it is trying to communicate over the connection that was pulled out. However, make sure it's listening on all interfaces (0.0.0.0), or just on the internal static IP, so it's not specifically listening on the DHCP IP that could change or go away when the network cable is yanked.
A local tcpdump or wireshark should shed some light on this problem if the above doesn't change anything.
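To illustrate the binding suggestion above, here is a minimal Python sketch of the difference between a wildcard bind and a bind to one fixed address. It is not Asterisk code: `bind_udp` is a hypothetical helper, 127.0.0.1 stands in for the internal static IP, and port 0 asks the OS for any free port (SIP normally uses 5060) so the sketch runs without conflicts.

```python
import socket


def bind_udp(host: str, port: int = 0) -> socket.socket:
    """Open a UDP socket bound to one address (hypothetical helper).

    Passing "0.0.0.0" binds to every IPv4 interface; passing a specific
    address binds only to that one.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


# Wildcard bind: not tied to any single interface address.
wildcard = bind_udp("0.0.0.0")

# Fixed bind: 127.0.0.1 is an assumption standing in for the internal static IP.
internal = bind_udp("127.0.0.1")

print("wildcard:", wildcard.getsockname())
print("internal:", internal.getsockname())
```

Neither socket is tied to the DHCP-assigned address, which is the point of the suggestion: a socket bound specifically to that address depends on the address continuing to exist.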
That's interesting, binding it to the internal NIC's IP... Nate's post has me resolved to just go in on a Saturday and sit at the console running some tests. But with this in mind, I might be better armed for success.
I also found someone suggesting not to use the FQDN in the SIP peer registration, as he had the same issues I do.
Lots of love for Asterisk, but I may resort to learning FreeSWITCH, as I am thinking the developer has fixed some long-outstanding problems with Asterisk. The Asterisk devs would argue a robust network is required, but core dumping when a DNS query fails? WTF kind of code is that?
Thanks for the good idea!
jlc
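On the DNS point above, a failed lookup is an ordinary error that an application can catch and retry later. A minimal Python sketch of that idea follows. It says nothing about how Asterisk itself is implemented; `resolve_or_none` is a hypothetical helper, and the default port of 5060 is only there because it is the usual SIP port.

```python
import socket
from typing import Optional


def resolve_or_none(hostname: str, port: int = 5060) -> Optional[str]:
    """Return the first IPv4 address for hostname, or None if lookup fails.

    Hypothetical helper: a failed lookup is reported to the caller
    instead of being allowed to take the process down.
    """
    try:
        infos = socket.getaddrinfo(
            hostname, port, socket.AF_INET, socket.SOCK_DGRAM
        )
    except socket.gaierror:
        # Resolver unreachable or name unknown: let the caller decide
        # whether to retry, back off, or keep using a cached address.
        return None
    return infos[0][4][0] if infos else None


# A numeric address needs no DNS server, so this works during an outage.
print(resolve_or_none("127.0.0.1"))

# The .invalid TLD is reserved and should never resolve.
print(resolve_or_none("no-such-host.invalid"))
```

This also shows why using an IP address instead of an FQDN in the peer registration can sidestep the problem: a numeric address is parsed locally and never touches the resolver.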