My apologies if I'm posting in the wrong place or asking a common question. All my searching so far hasn't turned up anything very useful about what to look at or what to modify.
--- CentOS 5, running BIND 9.3.6 i386
Hardware: P4, 2.8 GHz, 1 GB memory, SATA drives (non-mirrored).
Load is light, usually under 0.1
-- This box is running Postfix as our mail server. BIND (9.3.6) [Latest.]
-- Problem: Postfix is doing RBL lookups on zen.spamhaus.org. Everything goes along groovy - but then lookups start failing.
Early in the process, we get stuff like this: [We have a "successful" lookup, and then a failure...]
---
Apr 14 14:25:05 mail postfix/smtpd[22281]: NOQUEUE: reject: RCPT from bzq-79-183-5-119.red.bezeqint.net[79.183.5.119]: 554 5.7.1 Service unavailable; Client host [79.183.5.119] blocked using zen.spamhaus.org; from=contriveclaudia@royalmoore.com to=contriveclaudia@royalmoore.com proto=SMTP helo=<bzq-79-183-5-119.red.bezeqint.net>
Apr 14 14:25:07 mail postfix/smtpd[22804]: warning: 33.229.242.205.zen.spamhaus.org: RBL lookup error: Host or domain name not found. Name service error for name=33.229.242.205.zen.spamhaus.org type=A: Host not found, try again
---
As you can see, we had a lookup succeed and then, right after, one fail - claiming it got no answer from BIND. I get others after this that SUCCEED - so it's not in 100% failure mode yet. After some time [how much, I don't know], all [or nearly all] of the zen queries eventually fail.
A bind restart fixes the problem. [Hmmm...]
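For reference, the name in the failing lookup is just the client's IPv4 address with its octets reversed, prepended to the RBL zone. Here is a minimal sketch of how that name is built; the function name `dnsbl_query_name` is made up for illustration and is not anything Postfix or BIND exposes.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone="zen.spamhaus.org"):
    """Build the DNS name looked up for an IPv4 client in a DNSBL zone."""
    # Validate the address first; a malformed string raises ValueError.
    addr = ipaddress.IPv4Address(client_ip)
    # DNSBLs are queried with the octets in reverse order.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# The failing lookup in the Postfix log corresponds to client 205.242.229.33.
print(dnsbl_query_name("205.242.229.33"))  # 33.229.242.205.zen.spamhaus.org
```

Querying that name by hand against the local resolver while the failures are happening would show whether the problem can be reproduced outside Postfix.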
--- I have some logging enabled in BIND, and I don't see any reason for the lookups to fail. Here's a BIND log at debug level 5 for the failure above.
--- 14-Apr-2010 14:24:57.654 queries: info: client 127.0.0.1#42018: query: 33.229.242.205.zen.spamhaus.org IN A + 14-Apr-2010 14:24:57.654 security: debug 3: client 127.0.0.1#42018: query (cache) '33.229.242.205.zen.spamhaus.org/A/IN' approved 14-Apr-2010 14:24:57.654 resolver: debug 1: createfetch: 33.229.242.205.zen.spamhaus.org A 14-Apr-2010 14:24:57.654 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): create 14-Apr-2010 14:24:57.654 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): join 14-Apr-2010 14:24:57.654 resolver: debug 3: fetch 0x94a0b20 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): created 14-Apr-2010 14:24:57.655 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): start 14-Apr-2010 14:24:57.655 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:24:57.655 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelqueries 14-Apr-2010 14:24:57.655 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): getaddresses 14-Apr-2010 14:24:57.655 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:24:57.655 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:24:57.655 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:24:57.655 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:24:57.655 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:24:59.657 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:24:59.657 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:24:59.657 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 
14-Apr-2010 14:24:59.657 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:24:59.657 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:24:59.657 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:24:59.657 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:24:59.657 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:01.658 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:01.658 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:01.658 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:01.658 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:01.658 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:01.659 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:01.659 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:01.659 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:02.653 queries: info: client 127.0.0.1#42018: query: 33.229.242.205.zen.spamhaus.org IN A + 14-Apr-2010 14:25:02.653 security: debug 3: client 127.0.0.1#42018: query (cache) '33.229.242.205.zen.spamhaus.org/A/IN' approved 14-Apr-2010 14:25:02.654 resolver: debug 1: createfetch: 33.229.242.205.zen.spamhaus.org A 14-Apr-2010 14:25:02.654 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): join 14-Apr-2010 14:25:02.654 resolver: debug 3: fetch 0x94d54c8 
(fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): created 14-Apr-2010 14:25:03.660 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:03.660 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:03.660 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:03.660 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:03.660 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:03.660 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:03.660 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:03.660 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:03.660 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:05.663 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:05.663 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:05.663 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:05.663 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:05.663 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:05.663 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:05.663 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:05.663 resolver: debug 3: resquery 0x940ec38 (fctx 
0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:05.663 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:07.664 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:07.664 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:07.664 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:07.664 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:07.664 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:07.664 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:07.665 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:07.665 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:07.665 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:08.393 queries: info: client 127.0.0.1#50896: query: 33.229.242.205.zen.spamhaus.org IN A + 14-Apr-2010 14:25:08.393 security: debug 3: client 127.0.0.1#50896: query (cache) '33.229.242.205.zen.spamhaus.org/A/IN' approved 14-Apr-2010 14:25:08.393 resolver: debug 1: createfetch: 33.229.242.205.zen.spamhaus.org A 14-Apr-2010 14:25:08.393 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): join 14-Apr-2010 14:25:08.393 resolver: debug 3: fetch 0x94ad9c8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): created 14-Apr-2010 14:25:09.666 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:09.666 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 
14-Apr-2010 14:25:09.666 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:09.666 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:09.666 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:09.666 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:09.666 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:09.666 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:09.666 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:11.668 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:11.668 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:11.668 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:11.668 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:11.668 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:11.668 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:11.668 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:11.668 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:11.668 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:13.669 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:13.669 
resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:13.669 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:13.669 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:13.669 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:13.669 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:13.670 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:13.670 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:13.670 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:15.671 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:15.671 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:15.671 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:15.671 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:15.671 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:15.671 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:15.671 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:15.671 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:15.671 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:17.673 resolver: debug 3: 
fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:17.673 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:17.673 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:17.673 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:17.673 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:17.673 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:17.673 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:17.673 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:17.673 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:19.674 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:19.674 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:19.674 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:19.674 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:19.674 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:19.674 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:19.675 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:19.675 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:19.675 resolver: debug 3: resquery 0x940ec38 (fctx 
0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:23.677 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): 
udpconnected 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:23.912 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): response 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): noanswer_response 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): ncache_message 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): clone_results 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): done 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): stopeverything 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelqueries 14-Apr-2010 14:25:23.913 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): sendevents 14-Apr-2010 14:25:23.913 resolver: debug 3: fetch 0x94a0b20 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): destroyfetch 14-Apr-2010 14:25:23.913 resolver: debug 3: fetch 0x94d54c8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): destroyfetch 14-Apr-2010 14:25:23.913 resolver: debug 3: fetch 0x94ad9c8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): destroyfetch 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): shutdown 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): doshutdown 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): stopeverything 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelqueries 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 
0x932e140(33.229.242.205.zen.spamhaus.org/A'): destroy
---
As far as I can see, the BIND query simply didn't get a response - but no errors?
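Because the debug log above runs many entries together, the retry pattern is hard to see by eye. Below is a minimal sketch for summarizing such a log. It assumes every entry starts with a timestamp in the format shown above and that the event name is whatever follows the final `):` of a resolver entry. `summarize` is a hypothetical helper, not part of BIND.

```python
import re
from collections import Counter
from datetime import datetime

# Each entry starts with a timestamp like "14-Apr-2010 14:24:57.654".
STAMP = re.compile(r"\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}")
# For fctx/resquery/fetch entries, the event name follows "):".
EVENT = re.compile(r"\):\s+(.+?)\s*$")


def summarize(log_text):
    """Return (Counter of resolver events, seconds from first to last entry)."""
    stamps = list(STAMP.finditer(log_text))
    counts = Counter()
    times = []
    for i, match in enumerate(stamps):
        # An entry runs from the end of its timestamp to the next timestamp.
        end = stamps[i + 1].start() if i + 1 < len(stamps) else len(log_text)
        entry = log_text[match.end():end]
        times.append(datetime.strptime(match.group(), "%d-%b-%Y %H:%M:%S.%f"))
        event = EVENT.search(entry)
        if event and " resolver: " in entry:
            counts[event.group(1)] += 1
    elapsed = (times[-1] - times[0]).total_seconds() if times else 0.0
    return counts, elapsed


if __name__ == "__main__":
    # Three entries copied from the log above, run together as in the post.
    sample = (
        "14-Apr-2010 14:24:57.655 resolver: debug 3: resquery 0x940ec38 "
        "(fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone "
        "14-Apr-2010 14:24:59.657 resolver: debug 3: fctx "
        "0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout "
        "14-Apr-2010 14:25:01.658 resolver: debug 3: fctx "
        "0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout"
    )
    counts, elapsed = summarize(sample)
    print(counts["timeout"], "timeouts over", elapsed, "seconds")
```

Run over the full log, a summary like this makes it easier to see that the fetch retried roughly every two seconds and that a `response` entry only arrived at 14:25:23.912, about 26 seconds after the first send.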
--- First, someone's going to ask - perhaps Zen's blocking you. I don't think so. Here's why.
-We're non-commercial, using the definition set by Spamhaus,
-mail connects TOTAL are well under 100K a day (less than 10K in actuality),
-and thus having more than 300K queries is pretty unlikely.
-Also, let me remind you that a restart of the bind service seems to make the failures go away for a while, so if zen were blocking our queries, I'd think that wouldn't make a difference.
--- I certainly suspect a problem with BIND, but I can't find it, and have no idea where to go from here. I simply don't know where to look any more. If BIND were having a problem, say allocating memory, or something, shouldn't it be in a debug level 5 log?
HELP!
-Greg
[Cross posted from elsewhere, but I'm at a dead-end right now, so looking for a steer in where to ask, or direct help! TIA]
On Wed, Apr 14, 2010 at 7:36 PM, listserv.traffic@sloop.net wrote:
First, someone's going to ask - perhaps Zen's blocking you. I don't think so. Here's why.
-We're non-commercial, using the definition set by Spamhaus,
-mail connects TOTAL are well under 100K a day (less than 10K in actuality),
-and thus having more than 300K queries is pretty unlikely.
I'm not privy to spamhaus.org's rate limiting policies, but you show two queries 2 seconds apart, or 86400/2 per day perhaps.
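To spell out that arithmetic, here is a rough sketch. It assumes one query every 2 seconds sustained around the clock, which two log lines alone don't establish. The 300K figure is the one quoted from the original post.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def queries_per_day(interval_seconds):
    """Queries per day if one query is sent every interval_seconds."""
    return SECONDS_PER_DAY / interval_seconds


rate = queries_per_day(2)
print(rate)            # 43200.0
print(rate < 300_000)  # True: still well under the 300K figure quoted above
```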
-Also, let me remind you that a restart of the bind service seems to make the failures go away for a while, so if zen were blocking our queries, I'd think that wouldn't make a difference.
Read and understood.
I certainly suspect a problem with BIND, but I can't find it, and have no idea where to go from here. I simply don't know where to look any more. If BIND were having a problem, say allocating memory, or something, shouldn't it be in a debug level 5 log?
Perhaps using 208.67.220.220 and 208.67.222.222 as resolvers or directly asking spamhaus.org if they are rate limiting you would help.
kind regards/ldv
listserv.traffic@sloop.net wrote:
My apologies if I'm posting in the wrong place or asking a common question. All my searching so far hasn't turned up anything very useful about what to look at or what to modify.
CentOS 5, running BIND 9.3.6 i386
Hardware: P4, 2.8 GHz, 1 GB memory, SATA drives (non-mirrored).
Load is light, usually under 0.1
-- This box is running Postfix as our mail server. BIND (9.3.6) [Latest.]
-- Problem: Postfix is doing RBL lookups on zen.spamhaus.org. Everything goes along groovy - but then lookups start failing.
Early in the process, we get stuff like this: [We have a "successful" lookup, and then a failure...]
Apr 14 14:25:05 mail postfix/smtpd[22281]: NOQUEUE: reject: RCPT from bzq-79-183-5-119.red.bezeqint.net[79.183.5.119]: 554 5.7.1 Service unavailable; Client host [79.183.5.119] blocked using zen.spamhaus.org; from=contriveclaudia@royalmoore.com to=contriveclaudia@royalmoore.com proto=SMTP helo=<bzq-79-183-5-119.red.bezeqint.net>
Apr 14 14:25:07 mail postfix/smtpd[22804]: warning: 33.229.242.205.zen.spamhaus.org: RBL lookup error: Host or domain name not found. Name service error for name=33.229.242.205.zen.spamhaus.org type=A: Host not found, try again
As you can see, we had a lookup succeed and then just right after, one fail - claiming it got no answer from BIND. I get others after this that SUCCEED - so it's not in 100% failure mode yet. After time [how much, I don't know] eventually all the zen queries [or most all] fail.
A bind restart fixes the problem. [Hmmm...]
Check out the following bug report. I would also look at other BIND bug reports. My sense is that Red Hat has deviated quite a bit from the ISC version of BIND. In particular, I believe they disabled or otherwise modified the caching behavior about 6-8 months ago, when there were major security issues with BIND. Since then, I have felt that my Red Hat/CentOS name servers have not worked as well as Fedora or ISC BIND name servers. You might try installing ISC BIND and see if that solves your problem.
https://bugzilla.redhat.com/show_bug.cgi?id=553334
Nataraj
0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:21.676 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): udpconnected 14-Apr-2010 14:25:21.676 resolver: debug 3: resquery 0x940ec38 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:23.677 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): timeout 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): try 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): query 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): send 14-Apr-2010 14:25:23.678 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): too many timeouts, disabling EDNS0 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): sent 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): 
udpconnected 14-Apr-2010 14:25:23.678 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): senddone 14-Apr-2010 14:25:23.912 resolver: debug 3: resquery 0x94ae7b8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): response 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): noanswer_response 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): ncache_message 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): clone_results 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelquery 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): done 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): stopeverything 14-Apr-2010 14:25:23.912 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelqueries 14-Apr-2010 14:25:23.913 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): sendevents 14-Apr-2010 14:25:23.913 resolver: debug 3: fetch 0x94a0b20 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): destroyfetch 14-Apr-2010 14:25:23.913 resolver: debug 3: fetch 0x94d54c8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): destroyfetch 14-Apr-2010 14:25:23.913 resolver: debug 3: fetch 0x94ad9c8 (fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A)): destroyfetch 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): shutdown 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): doshutdown 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): stopeverything 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 0x932e140(33.229.242.205.zen.spamhaus.org/A'): cancelqueries 14-Apr-2010 14:25:23.914 resolver: debug 3: fctx 
0x932e140(33.229.242.205.zen.spamhaus.org/A'): destroy
As far as I can see, the BIND query simply didn't get a response - but no errors?
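For anyone who wants to check a log like this mechanically rather than by eye, here is a rough Python sketch. It assumes the timestamp and "resolver: debug" layout shown above and an English locale for month names; the function names are made up for illustration and are not part of BIND. It counts the timeouts for one lookup and measures how long it took before a response finally arrived.

```python
import re
from datetime import datetime

# Timestamps look like "14-Apr-2010 14:25:07.664" in the log above.
STAMP = re.compile(r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def parse_entries(text):
    """Split raw log text into (datetime, message) pairs.

    Works whether entries are one per line or run together on one line.
    """
    parts = STAMP.split(text)
    # parts is [prefix, stamp1, body1, stamp2, body2, ...]
    entries = []
    for stamp, body in zip(parts[1::2], parts[2::2]):
        when = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S.%f")
        entries.append((when, body.strip()))
    return entries


def summarize_fetch(text, name):
    """Summarize resolver events that mention the given query name."""
    events = []
    for when, body in parse_entries(text):
        if "resolver:" in body and name in body and "): " in body:
            # The event keyword is whatever follows the last "): ".
            events.append((when, body.rsplit("): ", 1)[1]))
    timeouts = sum(1 for _, event in events if event == "timeout")
    answered = [when for when, event in events if event == "response"]
    seconds = None
    if events and answered:
        seconds = (answered[0] - events[0][0]).total_seconds()
    return {
        "timeouts": timeouts,
        "answered": bool(answered),
        "seconds_to_answer": seconds,
    }
```

Feeding it the contents of the debug log file and the name 33.229.242.205.zen.spamhaus.org should show whether the upstream query ever got a reply and how many 2-second timeouts came first.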
First, someone's going to ask - perhaps Zen's blocking you. I don't think so. Here's why. -We're non-commercial, using the definition set by Spamhaus, -mail connects TOTAL are well less than 100K a day. (Less than 10K in actuality) -and thus having more than 300K queries is pretty unlikely.
-Also, let me remind you that a restart of the bind service seems to make the failures go away for a while, so if zen were blocking our queries, I'd think that wouldn't make a difference.
I certainly suspect a problem with BIND, but I can't find it, and have no idea where to go from here. I simply don't know where to look any more. If BIND were having a problem, say allocating memory, or something, shouldn't it be in a debug level 5 log?
HELP!
-Greg
[Cross posted from elsewhere, but I'm at a dead-end right now, so looking for a steer in where to ask, or direct help! TIA]
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Check out the following bug report. I would also look at other bind bug reports. My sense is that Red Hat has deviated quite a bit from the ISC version of bind. In particular I believe that they disabled or otherwise modified the caching behavior back about 6-8 months ago when there were major security issues with bind. I have felt that my Red Hat/CentOS name servers have not worked as well as Fedora or ISC bind name servers since this time. You might try installing ISC bind and see if that solves your problem.
Nataraj
Interesting - though in our case it's failing long before a few million lookups. I don't much relish compiling ISC versions to run on my box - the security implications and other hassles don't seem trivial. [We don't allow external [the world] lookups - just local "trusted" users, but that only mitigates some of the security concerns.]
Perhaps it's possible to use an older version that's security patched. Ugh.
-Greg
Though I have not done it in a while, it's not a big deal to build ISC bind. If you have compilers installed, you untar it and run "./configure", "make" and "make install", maybe setting the installation path (e.g. with --prefix) along the way. With the security issues today, I often run a separate system for name servers (actually I use virtual machines). In fact, mostly I set up both an internal and an external nameserver, where the internal one forwards queries to the external one so it never receives packets from the Internet. So the internal one could be on your mail server and the external one could be a separate box. For test purposes, you could try ISC bind on any old box just to determine if it solves the problem.
Alternatively, if the problem is urgent I guess you could buy a Red Hat license and try to get them to up the priority on resolving this. If you have the time and skills, you could install a debug-compiled version of CentOS bind and try to either debug it or capture a dump of it when it breaks and submit that to the developers.
I don't think running ISC bind for a short time is a major risk. It's quite widely deployed in the field.
Nataraj
What happens if you change your resolv.conf to google's dns ?
I haven't tried this, but from reports, spamhaus.org blocks google's dns. [The traffic limits are too high. If they didn't, no one would buy a commercial zone transfer license...]
So, while it's not likely to fix this problem, even if it were, it seems like your "solution" to the broken DNS server is to use someone else's DNS server.
So, yeah, I could drive the neighbor's car when mine doesn't work. But that doesn't fix my car.
I'm interested in fixing mine, or at least understanding how and why it's broken.
Thanks for your time and thoughts though.
-Greg
I think the point was to test using a different DNS in order to verify that the problem is with your DNS server before you spend a lot of time on it.
Unfortunately, this is not going to work in your case for the stated reasons.
On 4/15/2010 3:00 PM, listserv.traffic@sloop.net wrote:
I'm interested in fixing mine, or at least understanding how and why it's broken.
Did you try reverting to bind-9.3.4-10.P1.el5_3.3 per the RH bug? Or can you temporarily run some other distribution (ubuntu/fedora, etc.) as a resolver or forwarder?
sys Admin wrote:
What happens if you change your resolv.conf to google's dns ?
Changing dns to public services such as google or OpenDNS is not going to help as DNSBLs like Spamhaus will have blocked access by these services. Otherwise it would be simple to avoid paying for (business) access to Spamhaus.
On Thu, Apr 15, 2010 at 3:03 PM, Ned Slider ned@unixmail.co.uk wrote:
Changing dns to public services such as google or OpenDNS is not going to help as DNSBLs like Spamhaus will have blocked access by these services. Otherwise it would be simple to avoid paying for (business) access to Spamhaus.
Au contraire, there are benefits/economies of scale to spamhaus.org from having an aggregator like opendns.
kind regards/ldv
Larry Vaden wrote:
Au contraire, there are benefits/economies of scale to spamhaus.org from having an aggregator like opendns.
Indeed, but not if you are charging for high volume and/or commercial use.
On Thu, Apr 15, 2010 at 3:18 PM, Ned Slider ned@unixmail.co.uk wrote:
Indeed, but not if you are charging for high volume and/or commercial use.
opendns resolves queries to zen.spamhaus.org and AFAIK all the major DNSBLs. Period. End.
kind regards/ldv
on 4-15-2010 1:36 PM Larry Vaden spake the following:
opendns resolves queries to zen.spamhaus.org and AFAIK all the major DNSBLs. Period. End.
kind regards/ldv
Resolves them, or forwards them? Just curious...
On Thu, Apr 15, 2010 at 3:53 PM, Scott Silva ssilva@sgvwater.com wrote:
Resolves them, or forwards them? Just curious...
Avoiding answering your question because of lack of expertise in the difference of resolving vs. forwarding, but (IP taken from a recent (Apr 15 16:01:25 CT) postfix NOQUEUE):
[redacted@catch22 etc]# host 251.54.51.173.zen.spamhaus.org 208.67.222.222
Using domain server:
Name: 208.67.222.222
Address: 208.67.222.222#53
Aliases:

251.54.51.173.zen.spamhaus.org has address 127.0.0.10
251.54.51.173.zen.spamhaus.org has address 127.0.0.4
[redacted@catch22 etc]#
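In case the shape of that query name isn't obvious: a DNSBL lookup reverses the octets of the client IP and appends the zone, and the 127.0.0.x answers are codes rather than real addresses. A minimal Python sketch of both steps follows. The code table is my recollection of the Spamhaus ZEN return codes, so treat it as an assumption and check their current documentation; the function names are made up for illustration.

```python
import ipaddress

# Assumed ZEN return codes (verify against the Spamhaus documentation).
ZEN_CODES = {
    "127.0.0.2": "SBL",
    "127.0.0.3": "SBL CSS",
    "127.0.0.4": "XBL",
    "127.0.0.10": "PBL (ISP maintained)",
    "127.0.0.11": "PBL (Spamhaus maintained)",
}


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the name a mail server looks up for an IPv4 client address."""
    # IPv4Address() validates the input before we reverse the octets.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def explain(addresses):
    """Map returned 127.0.0.x addresses to list names, where known."""
    return [ZEN_CODES.get(addr, "unknown code " + addr) for addr in addresses]
```

So the client 173.51.54.251 from the postfix log becomes the query name used with host above, and the two addresses returned can be passed to explain() to see which lists they correspond to.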
Larry Vaden wrote:
Avoiding answering your question because of lack of expertise in the difference of resolving vs. forwarding, but (IP taken from a recent (Apr 15 16:01:25 CT) postfix NOQUEUE):
[redacted@catch22 etc]# host 251.54.51.173.zen.spamhaus.org 208.67.222.222
Using domain server:
Name: 208.67.222.222
Address: 208.67.222.222#53
Aliases:

251.54.51.173.zen.spamhaus.org has address 127.0.0.10
251.54.51.173.zen.spamhaus.org has address 127.0.0.4
Well adding the -a option to host shows that the answer is not authoritative, so the query is being forwarded.
host -a 251.54.51.173.zen.spamhaus.org 208.67.222.222
Trying "251.54.51.173.zen.spamhaus.org"
Using domain server:
Name: 208.67.222.222
Address: 208.67.222.222#53
Aliases:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23703
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;251.54.51.173.zen.spamhaus.org. IN ANY
;; ANSWER SECTION:
251.54.51.173.zen.spamhaus.org. 893 IN A 127.0.0.10
251.54.51.173.zen.spamhaus.org. 893 IN A 127.0.0.4
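For reference, the "flags: qr rd ra" line comes straight from the 12-byte DNS message header. Here is a small Python sketch that decodes those bits from a raw reply (decode_header is a name made up for illustration). As far as I understand it, a missing aa bit only says the answering server is not authoritative for the zone; on its own it can't tell you whether that server recursed itself or forwarded to someone else.

```python
import struct

# Flag bits in the second 16-bit word of the DNS header (RFC 1035).
FLAG_BITS = (
    (0x8000, "qr"),  # this message is a response
    (0x0400, "aa"),  # authoritative answer
    (0x0200, "tc"),  # truncated
    (0x0100, "rd"),  # recursion desired
    (0x0080, "ra"),  # recursion available
)


def decode_header(packet):
    """Decode the fixed 12-byte header of a raw DNS message."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    return {
        "id": ident,
        "flags": [name for bit, name in FLAG_BITS if flags & bit],
        "rcode": flags & 0x000F,  # 0 is NOERROR
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }
```

A header with id 23703 and flag word 0x8180 decodes to the same qr, rd, ra set that host -a printed above.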
On Wed, 2010-04-14 at 17:36 -0700, listserv.traffic@sloop.net wrote:
-- Problem: Postfix is doing RBL lookups on zen.spamhaus.org. Everything goes along groovy - but then lookups start failing.
Does your network interface show any abnormalities - dropped packets etc? I assume you have no local ratelimiting (via iptables etc)?
John.
No rate limiting, and ifconfig isn't showing any errors/drops/overruns etc.
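If anyone wants to watch those counters over time instead of eyeballing ifconfig, here is a rough Python sketch that reads the same numbers from /proc/net/dev on Linux. The function names are made up for illustration, and the column positions assume the usual 16-column layout of that file.

```python
def parse_net_dev(text):
    """Extract error/drop/fifo counters per interface from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        iface, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        nums = [int(value) for value in fields]
        stats[iface.strip()] = {
            "rx_errs": nums[2],
            "rx_drop": nums[3],
            "rx_fifo": nums[4],
            "tx_errs": nums[10],
            "tx_drop": nums[11],
            "tx_fifo": nums[12],
        }
    return stats


def problem_counters(stats):
    """Keep only the interfaces and counters that are non-zero."""
    return {
        iface: {key: value for key, value in counters.items() if value}
        for iface, counters in stats.items()
        if any(counters.values())
    }


def read_problem_counters(path="/proc/net/dev"):
    """Convenience wrapper for a live Linux box."""
    with open(path) as handle:
        return problem_counters(parse_net_dev(handle.read()))
```

Running read_problem_counters() when lookups are healthy and again once they start failing would show whether any counter moves in between.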
-Greg
listserv.traffic@sloop.net wrote:
Problem: Postfix is doing RBL lookups on zen.spamhaus.org. Everything goes along groovy - but then lookups start failing.
Just some thoughts: you could try to install rbldnsd.i386 from the rpmforge repo for caching RBL lookups
I certainly suspect a problem with BIND, but I can't find it, and have no idea where to go from here.
Or try to use dnsmasq (from base) to see if the problem really is with BIND
Ciao Lorenzo
Recap of config (There's a "New" section below that covers new data...)
--- Current config:
CentOS 5, running BIND 9.3.6
*** (We updated everything to most recent versions when this was initially posted, mid April, and it made no difference in the symptoms.)
i386
Hardware: P4, 2.8Ghz, 1G memory Sata drives - non mirrored etc.
Load is light, usually under 0.1
-- This box is running Postfix as our mail server. BIND (9.3.6) [Latest.]
-- Problem: Postfix is doing RBL lookups on zen.spamhaus.org. Everything goes along groovy - but then lookups start failing.
Early in the process, we get stuff like this: [We have a "successful" lookup, and then a failure...]
---
Apr 14 14:25:05 mail postfix/smtpd[22281]: NOQUEUE: reject: RCPT from bzq-79-183-5-119.red.bezeqint.net[79.183.5.119]: 554 5.7.1 Service unavailable; Client host [79.183.5.119] blocked using zen.spamhaus.org; from=xxx to=yyy proto=SMTP helo=<bzq-79-183-5-119.red.bezeqint.net>
Apr 14 14:25:07 mail postfix/smtpd[22804]: warning: 33.229.242.205.zen.spamhaus.org: RBL lookup error: Host or domain name not found. Name service error for name=33.229.242.205.zen.spamhaus.org type=A: Host not found, try again
---
As you can see, we had a lookup succeed and then just right after, one fail - claiming it got no answer from BIND. I get others after this that SUCCEED - so it's not in 100% failure mode yet.
After time eventually all the zen queries [or most all] fail. [It appears as though after around 4 hours, most all queries to zen are failing.]
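To put numbers on that drift, one option is to count the "RBL lookup error" warnings per hour in the Postfix log. A minimal Python sketch follows, assuming the syslog line format shown above; hourly_rbl_counts is a made-up name. Note that clean lookups (client not listed) leave no log line, so this gives error counts and listed-hit counts per hour, not a true failure percentage.

```python
import re
from collections import Counter

# Captures "Mon DD HH" as the hour bucket and the message after the smtpd tag.
LINE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}):\d{2}:\d{2} \S+ postfix/smtpd\[\d+\]: (.*)$"
)


def hourly_rbl_counts(lines, zone="zen.spamhaus.org"):
    """Return (errors, hits) Counters keyed by hour, e.g. "Apr 14 14"."""
    errors, hits = Counter(), Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        hour, message = match.groups()
        if "RBL lookup error" in message and zone in message:
            errors[hour] += 1  # lookup failed (timeout or similar)
        elif "blocked using " + zone in message:
            hits[hour] += 1  # lookup worked and the client was listed
    return errors, hits
```

Passing it an open maillog file right after a BIND restart, and again a few hours later, should show the error count per hour climbing if the pattern described here holds.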
A bind restart fixes the problem. [Hmmm...] ---
First, someone's going to ask - perhaps Zen's blocking you. I don't think so. Here's why. -We're non-commercial, using the definition set by Spamhaus, -mail connects TOTAL are well less than 100K a day. (Less than 10K in actuality) -and thus having more than 300K queries is pretty unlikely. -Also, let me remind you that a restart of the bind service seems to make the failures go away for a while, so if zen were blocking our queries, I'd think that wouldn't make a difference.
[Also, from the updates below, we can run an alternate distro as a dedicated DNS box, and it queries zen just fine. So, we're NOT being rate limited.]
--- I certainly suspect a problem with BIND, but I can't find it, and have no idea where to go from here. I simply don't know where to look any more. If BIND were having a problem, say allocating memory, or something, shouldn't it be in a debug level 5 log?
===== New information:
Tried running a separate DNS box on Fedora 12 - again with all the current patches.
We then point the DNS server on the postfix box at our stand-alone Fedora 12 box.
The exact same symptoms occur on the FC12 box.
--- Next, tried an Ubuntu box also running the latest patches and pointed the Postfix box there. Problem solved - or at least mostly so. [We still get around a 2% failure rate - timeouts - but it is always quite low, and stays at a constant level.]
So, as was suggested in this thread, it appears to be an RH-specific implementation bug.
I have a WAG that it might be related to UDP fragmentation on DNSSEC packets - but I have no idea if that's realistic or not. [Part of why I lean this way is that this isn't reported widely as a problem, and so I'd assume it's a combination of effects bug - perhaps related to how our firewall passes fragmented UDP replies.]
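One way to poke at that guess is to send the same question with and without an EDNS0 OPT record and see whether only the EDNS0 version goes unanswered. Here is a rough Python sketch using nothing but the standard library. The packet layout follows the DNS wire format as I understand it (OPT is record type 41, the class field carries the advertised UDP payload size, and the DO bit asks for DNSSEC records); build_query and ask are made-up names, not part of any BIND tooling.

```python
import socket
import struct


def build_query(name, qtype=1, edns_payload=None, dnssec_ok=False, ident=0x1234):
    """Build a raw DNS query; add an EDNS0 OPT record if edns_payload is set."""
    arcount = 1 if edns_payload else 0
    # Flag word 0x0100 sets only RD (recursion desired).
    header = struct.pack("!6H", ident, 0x0100, 1, 0, 0, arcount)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!2H", qtype, 1)  # class IN
    if not edns_payload:
        return header + question
    ttl = 0x00008000 if dnssec_ok else 0  # DO bit lives in the OPT "TTL" field
    opt = b"\x00" + struct.pack("!HHIH", 41, edns_payload, ttl, 0)
    return header + question + opt


def ask(server, packet, timeout=3.0):
    """Send one UDP query; return the reply size in bytes, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        try:
            data, _ = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return len(data)
```

Building one packet with edns_payload=4096 and one without, then calling ask() with each against an upstream server from behind the firewall, would show whether only the EDNS0 query times out. That would at least be consistent with the "too many timeouts, disabling EDNS0" lines in the debug log, though it wouldn't prove fragmentation is the cause.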
I obviously have more testing to do, but I welcome any comments...
TIA -Greg
On Mon, May 17, 2010 at 11:40 PM, listserv.traffic@sloop.net wrote:
I obviously have more testing to do, but I welcome any comments...
I don't have any solution to your problem but ... I have seen something similar on a Debian box running a local BIND server. Repo is defined as "ftp.debian.org".
apt-get install <package name> gives the error "unable to resolve ftp.debian.org", but "host ftp.debian.org localhost" gives the IP address of the server. File /etc/resolv.conf lists 127.0.0.1 as the first name server.
-- Arun Khan