On 9/10/21 12:26 pm, Rob Kampen wrote:
So, after many dozens of hours and sending test emails I have found a solution (workaround) that appears to work okay. It is now different to the original two MX servers I cloned from, in that the maillog shows a different cycle of processing, and it now fails a truly unknown mailbox much later in the process - thus higher workload on my MX. But the key thing is that it does now do the virtual_alias checks on incoming emails on port 25 before rejecting.
If your MX does not reject messages to invalid recipients right away but instead bounces them later on, you become a backscatter source (see https://www.backscatterer.org/?target=bounces).
Your server needs a properly configured list of valid recipients so that it knows right away which recipients to accept and which to reject.
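As a rough starting point, and assuming a stock Postfix install, the parameters below are the ones that usually define the list of valid recipients. Which of them matter depends on whether the domain is handled as local, virtual or relay, and I can't tell that from what has been posted so far, so treat this as a sketch of what to look at rather than a fix:

```shell
# Show the current values of the parameters that normally control
# recipient validation at SMTP time.
postconf local_recipient_maps relay_recipient_maps \
         virtual_alias_maps virtual_mailbox_maps \
         smtpd_reject_unlisted_recipient
```

An empty map for the address class your domain falls into would be the first thing I'd look for.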
No idea why this third MX is behaving differently. It has a dual stack IP, so I disabled IPv6 access and tried again, but that certainly wasn't the cause of the difference in processing.
If you can provide the output of the following two commands it would be very helpful in troubleshooting your problem:
postconf -nf
postconf -Mf
Also of great help would be the relevant logs for one message that is giving you issues. These should be in /var/log/maillog and consist of a connection line followed by a number of postfix/smtpd lines; please copy all the logs for *one* message. Please do not enable verbose logging (it only adds a lot of unneeded detail that distracts from finding the real problem), and there is no need to provide log lines from non-postfix processes.
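One way to pull those lines out, as a minimal sketch: the helper name smtpd_session and the PID 12345 are made up for illustration, and the real PID is the number in brackets on the connect line for your test message.

```shell
# Print every log line written by one smtpd process.
# Usage: smtpd_session PID [LOGFILE]
# (smtpd_session is a hypothetical helper, not part of Postfix.)
smtpd_session() {
    grep -F "postfix/smtpd[$1]:" "${2:-/var/log/maillog}"
}
```

For example, "smtpd_session 12345" reads /var/log/maillog by default. A single smtpd process can serve several connections in a row, so trim the output to the one connect/disconnect pair for your test message before posting.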
It should be noted that the two initial MX machines have an extra line in the maillog that is the second logged step in the process, and goes something like:
Oct 8 19:00:58 mx policyd-spf[16055]: prepend Received-SPF: None (mailfrom) identity=mailfrom; client-ip=209.85.210.180; helo=mail-pf1-f180.google.com; envelope-from=rob@example.com; receiver=<UNKNOWN>
This is likely unrelated to your primary issue, but it may point to a separate problem with a possibly incorrect policyd setup. We can cross that bridge after we've fixed the primary issue (one issue at a time).
After that, the processing steps are identical.
There may well be something else subtle in the logs that we can spot and that you are not noticing.
Peter