On 06/28/2012 12:15 PM, Gordon Messmer wrote:
You have two major performance problems in this script. First, UTF-8 processing is slow. Second, wildcards are EXTREMELY SLOW!
Naturally, you should test both on your own data. I'm amused to admit that I tested my own advice against my mail log and got more improvement from the LANG setting than from the string prefix. The combination of the two reduced the time to run your pattern against my mail logs by about 90%.
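For anyone who wants to try the two changes side by side, here is a rough sketch. The original script isn't reproduced here, so the sample log lines, the pattern, and the variable names are all made up for illustration; I'm assuming a grep-style match whose pattern starts with a wildcard. Note that LC_ALL, if set in your environment, overrides LANG, so you may need LC_ALL=C instead.

```shell
# Hypothetical sample data standing in for a real mail log.
log=$(mktemp)
printf '%s\n' \
  'Jun 28 12:15:01 mx postfix/smtpd[101]: connect from unknown[192.0.2.1]' \
  'Jun 28 12:15:02 mx postfix/qmgr[102]: ABC123: removed' \
  'Jun 28 12:15:03 mx postfix/smtpd[103]: disconnect from unknown[192.0.2.1]' > "$log"

# Assumed "before" form: whatever locale is in effect, pattern begins with a wildcard.
slow=$(grep -c '.*smtpd.*: connect from' "$log")

# "After" form: C locale for byte-wise matching, pattern begins with a literal string.
fast=$(LANG=C grep -c 'smtpd.*: connect from' "$log")

# Both forms should select the same lines; only the run time should differ.
echo "slow=$slow fast=$fast"
rm -f "$log"
```

On a three-line file the difference is invisible, of course. Point both commands at a real log and wrap each in `time` to see what you actually gain.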