Hi
I'm hosting a forum (phpbb) with more than 3500 users on CentOS 4.3. The forum admins would like to send an email to all 3500 users. I configured phpbb to use SMTP (postfix-2.1.5-4.2.RHEL4.mysql.centos4) to send the email. I raised the max destination limit from 1000 to 5000 and ran a test. Here the disaster began: the majority of the users received from 20 to 50 emails. Looking at the logs, postfix divided the email into several messages with 50 recipients each (as documented). I think the problem was that when one of the recipients was unreachable, the mail went into the deferred state, to be retried later. The problem (bug?) is that the mail was then sent again to all 50 recipients, which would explain why so many users received a lot of mails. Has anyone had a similar problem? Any suggestions?
Regards Sophana
sophana wrote:
I think the problem was that when one of the recipients was unreachable, the mail went into the deferred state, to be retried later. The problem (bug?) is that the mail was then sent again to all 50 recipients, which would explain why so many users received a lot of mails.
That looks like a problem on the *recipient* site. Normal processing of the mail should go ahead for all *reachable* addresses during the smtp dialogue, so the mail only gets deferred for the non-reachable recipients. On the other hand, if the recipient decides to *not* take the mail at all because one of the recipients is unreachable (or the limit of errors during the smtp handshake has been reached), you should never be able to reach the DATA phase in the smtp dialogue, so no mail gets sent at all.
If some mail gets sent, postfix *never* resends mails to those recipients who already got the mails.
But without seeing any logs everything's just a cloudy image in my crystal ball. It might also be that your setup is broken.
Ralph
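To illustrate the behaviour Ralph describes (only the recipients that could not be reached are deferred and retried), here is a toy Python model. It is a sketch under stated assumptions, not Postfix's actual queue manager logic: the function names and the per-attempt `reachable` sets are hypothetical and exist only for this example.

```python
def attempt_delivery(recipients, reachable):
    """One delivery attempt: split recipients into delivered and deferred.

    `reachable` is a hypothetical set of addresses whose server accepts
    mail during this attempt.
    """
    delivered = [r for r in recipients if r in reachable]
    deferred = [r for r in recipients if r not in reachable]
    return delivered, deferred


def deliver_with_retries(recipients, reachable_per_attempt):
    """Retry only the deferred recipients on each later attempt.

    Returns (copies_received, still_pending), where copies_received maps
    every recipient to the number of copies delivered to it.
    """
    copies = {r: 0 for r in recipients}
    pending = list(recipients)
    for reachable in reachable_per_attempt:
        delivered, pending = attempt_delivery(pending, reachable)
        for r in delivered:
            copies[r] += 1
        if not pending:
            break
    return copies, pending
```

In this model a recipient that was served on the first attempt is never in `pending` again, so it cannot receive a second copy from a retry. Duplicates therefore have to come from the message being submitted more than once.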
Unfortunately, I did the test some weeks ago, and the logs have been rotated away... I don't think I'd like to bother 3500 users again with my mass emails... If you have a solution to test this, without real email...
Finally, I'm not so sure about the email being resent because of unreachable recipients. One thing I remember: in the postfix spool, in the deferred dir, there were a LOT of emails with different hash names. Looking in some of them, I noticed recipients duplicated across several of them, which should not be normal... In the logs, I could see some of these hashed deferred emails repeated several times, each time with a reason why the mail was being deferred.
There surely is something wrong in my setup. However I have thousands of mails per day being sent (one by one...) correctly from the same forum.
Regards
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
sophana wrote:
Unfortunately, I did the test some weeks ago, and the logs have been rotated away... I don't think I'd like to bother 3500 users again with my mass emails... If you have a solution to test this, without real email...
If you're serious about fixing this, then you need to be able to reproduce the problem at will. How else can you tell whether the problem's fixed?
It's much more difficult to solve this problem if the evidence is lost; if your memory's anything near as bad as mine, you've forgotten important information, confused it and/or never noticed it in the first place :-)
You don't have to use real users, and you probably don't have to have 3500 addresses, but you do need some that work and some that don't.
Assuming you have your own LAN, you could use an alternative domain name and configure two or three hosts to receive email for that domain name.
I use the TLD "lan" for my testing and anything the world at large shouldn't see:

    [summer@ns ~]$ host demo.lan
    demo.lan has address 192.168.9.1
    [summer@ns ~]$ host test.lan
    test.lan has address 192.168.7.254
    [summer@ns ~]$ host office.lan
    office.lan has address 192.168.1.252

You don't need separate hardware; in my case I could overlay 10.1.1.0/24 on 192.168.9.0 by configuring eth0:0 on the various machines:

    sudo ifconfig eth0:0 10.1.0.146 netmask 255.255.255.0

and so on.
You will need to configure the zones you choose in bind. Read the docs if you need help here.
Finally, I'm not so sure about the email being resent because of unreachable recipients. One thing I remember: in the postfix spool, in the deferred dir, there were a LOT of emails with different hash names. Looking in some of them, I noticed recipients duplicated across several of them, which should not be normal... In the logs, I could see some of these hashed deferred emails repeated several times, each time with a reason why the mail was being deferred.
If recipients are duplicated in several different messages, then there were multiple injects to the same recipients. Again, as many have requested, logs are necessary to look into this issue.
Feizhou wrote:
If recipients are duplicated in several different messages, then there were multiple injects to the same recipients.
That's the best explanation I've heard; it also goes some way to explain why others haven't had the problem. Over that number of recipients I guess there could be some redirections (through aliases) and other duplicates (it's possible to send mail to my mailbox via different domain names).
Again, as many have requested, logs are necessary to look into this issue.
And as the OP said they've been rotated into the bit bucket, I guess it's best left here, with a note to the OP to be suspicious next time, and perhaps suspend sending while he checks things out :-)
John Summerfied wrote:
Feizhou wrote:
If recipients are duplicated in several different messages, then there were multiple injects to the same recipients.
That's the best explanation I've heard; it also goes some way to explain why others haven't had the problem. Over that number of recipients I guess there could be some redirections (through aliases) and other duplicates (it's possible to send mail to my mailbox via different domain names).
I have mailing lists with tens of thousands of users. If postfix exhibited the described behaviour, I would have fled from it screaming about shitty software.
Ralph
Ralph Angenendt wrote:
I have mailing lists with tens of thousands of users. If postfix exhibited the described behaviour, I would have fled from it screaming about shitty software.
I hope I got the message through that the problem is with phpbb and not with postfix itself. phpbb must have injected multiple messages with the same recipients.
Feizhou wrote:
I hope I got the message through that the problem is with phpbb and not with postfix itself. phpbb must have injected multiple messages with the same recipients.
Yes. I just wanted to state (again) that this is definitely not the fault of postfix.
Ralph
Thanks for all your responses.
Phpbb could of course be the problem. I was wondering how this kind of basic bug could still exist in postfix...
If it is a phpbb bug, is there an easy way to run a fake SMTP server, so I can capture the buggy phpbb SMTP requests? (I can set up another SMTP port in phpbb...)
Regards
Sophana
sophana wrote:
Thanks for all your responses.
Phpbb could of course be the problem. I was wondering how this kind of basic bug could still exist in postfix...
If it is a phpbb bug, is there an easy way to have a fake smtp server, so I can catch the bugged phpbb smtp request? (I can setup another smtp port in phpbb...)
Before you blame phpbb, check your data for duplicates.
Also, check that you have a recent version; there was a version a while ago that became very well known for a very bad reason: it was vulnerable to external attack, and external attackers were very successful.
How many computers can you use to try to debug your problem? I don't like to test on production systems, do you?
sophana wrote:
Thanks for all your responses.
Phpbb could of course be the problem. I was wondering how this kind of basic bug could still exist in postfix...
No way. I once suspected something like this when I did massive injects in a test to find the best filesystem to use with postfix, but upon close scrutiny there was no such bug. The generated recipients were so similar (postal work :D) that it made me imagine dups.
If it is a phpbb bug, is there an easy way to have a fake smtp server, so I can catch the bugged phpbb smtp request? (I can setup another smtp port in phpbb...)
smtp-sink is a program that comes with postfix that sends all mails it receives to Dave Null. You can tell it to log smtp commands to syslog or display smtp conversations on screen or redirect them to a file. man smtp-sink for more information. You can also run this on any port on localhost as you fancy to test out another instance of your current phpbb.
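For readers without smtp-sink at hand, the same idea (a server that accepts every message, discards it, and records the SMTP commands it saw) can be sketched in Python using only the standard library. This is a minimal stand-in under stated assumptions, not a reimplementation of smtp-sink: `SinkHandler`, `SinkServer`, `start_sink` and `send_probe` are hypothetical names, the server answers "250 Ok" to everything, and it should only ever be bound to localhost for testing.

```python
import smtplib
import socketserver
import threading


class SinkHandler(socketserver.StreamRequestHandler):
    """Speak just enough SMTP to accept and discard any message."""

    def reply(self, text):
        self.wfile.write((text + "\r\n").encode("ascii"))

    def handle(self):
        self.reply("220 sink.invalid ESMTP test sink")
        in_data = False
        while True:
            raw = self.rfile.readline()
            if not raw:  # client closed the connection
                break
            line = raw.decode("utf-8", "replace").rstrip("\r\n")
            if in_data:
                # Message content is thrown away; a lone dot ends it.
                if line == ".":
                    in_data = False
                    self.reply("250 Ok: discarded")
                continue
            self.server.commands.append(line)  # record the SMTP command
            verb = line.upper()
            if verb.startswith("DATA"):
                in_data = True
                self.reply("354 End data with <CR><LF>.<CR><LF>")
            elif verb.startswith("QUIT"):
                self.reply("221 Bye")
                break
            else:
                self.reply("250 Ok")


class SinkServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, SinkHandler)
        self.commands = []  # every SMTP command line received so far


def start_sink(host="127.0.0.1", port=0):
    """Start the sink in a background thread; port 0 picks a free port."""
    server = SinkServer((host, port))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_probe(port, recipients, host="127.0.0.1"):
    """Send one test message to the sink, as a mail client would."""
    with smtplib.SMTP(host, port, local_hostname="localhost", timeout=10) as client:
        client.sendmail("forum@example.invalid", recipients,
                        "Subject: test\r\n\r\nhello\r\n")
```

Pointing the application at the port reported by `server.server_address` and then inspecting `server.commands` shows every `MAIL FROM` and `RCPT TO` it issued, without any real mail leaving the machine.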
Feizhou wrote:
If it is a phpbb bug, is there an easy way to have a fake smtp server, so I can catch the bugged phpbb smtp request? (I can setup another smtp port in phpbb...)
smtp-sink is a program that comes with postfix that sends all mails it receives to Dave Null. You can tell it to log smtp commands to syslog or display smtp conversations on screen or redirect them to a file. man smtp-sink for more information. You can also run this on any port on localhost as you fancy to test out another instance of your current phpbb.
Thanks for your suggestion. I tried smtp-sink on another port and found that phpbb was the cause of the problem. phpbb made several connections to the smtp-sink and sent multiple emails with more than 50000 recipients instead of 3500. When I sorted the smtp-sink output, I could see recipients duplicated multiple times.
Conclusion: postfix works great! phpbb sucks!
Thank you all for your support.
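The check described above (sorting the captured output and looking for repeated recipients) can also be done with a short Python script. This is a sketch that assumes the captured SMTP conversation is available as text lines containing `RCPT TO:<address>` commands; the exact format of smtp-sink's output is not shown in this thread, so the pattern may need adjusting, and the function names are hypothetical. Addresses are compared case-insensitively, which is a simplification.

```python
import re
from collections import Counter

# Assumed shape of a recipient command in the captured transcript.
RCPT_RE = re.compile(r"RCPT\s+TO:\s*<([^>]+)>", re.IGNORECASE)


def recipient_counts(transcript_lines):
    """Count how many times each recipient address was submitted."""
    counts = Counter()
    for line in transcript_lines:
        match = RCPT_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


def summarize(transcript_lines):
    """Return (total RCPT commands, unique recipients, repeated ones)."""
    counts = recipient_counts(transcript_lines)
    repeated = {addr: n for addr, n in counts.items() if n > 1}
    return sum(counts.values()), len(counts), repeated
```

If the total is far larger than the number of unique recipients, as with the 50000 versus 3500 reported here, the sender submitted the same addresses more than once.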
I tried smtp-sink on another port and found that phpbb was the cause of the problem. phpbb did several connections to the smtp-sink and sent multiple emails with more than 50000 recipients instead of 3500. When I sorted the smtp-sink output, I could see recipients duplicated multiple times.
Conclusion postfix works great! phpbb sucks!
Now I wonder what is wrong with phpbb...is yours modified at all?
Feizhou wrote:
Now I wonder what is wrong with phpbb...is yours modified at all?
It might be related to the browser retrying the request several times because it is a long-running task, don't you think? That could explain why there are several SMTP connections. Anyway, I reported the bug to the phpbb bug tracker.
Yes, the phpbb is modded, but not by me. I'm just hosting it on my dedicated server.
sophana wrote:
Hi
I'm hosting a forum (phpbb) with more than 3500 users on CentOS 4.3. The forum admins would like to send an email to all 3500 users. I configured phpbb to use SMTP (postfix-2.1.5-4.2.RHEL4.mysql.centos4) to send the email. I raised the max destination limit from 1000 to 5000 and ran a test. Here the disaster began: the majority of the users received from 20 to 50 emails. Looking at the logs, postfix divided the email into several messages with 50 recipients each (as documented). I think the problem was that when one of the recipients was unreachable, the mail went into the deferred state, to be retried later. The problem (bug?) is that the mail was then sent again to all 50 recipients, which would explain why so many users received a lot of mails. Has anyone had a similar problem? Any suggestions?
That should not happen. Can you put together a test case to prove the problem, so you can file a bug report with a reproducible problem?
I use postfix, but if I had that problem and could nail it onto Postfix, I'd likely switch immediately to something else. Sendmail, probably.
OTOH I'd be using mailman for regular mailings. Whether that would create a problem with postfix I don't know, but surely the combination is fairly common.