From: ken <gebser at mousecar.com>
> Some file this script would need to process could very well be
> ridiculously huge, which is why I chose to process line-by-line.
>
> Secondly, yes, I was already using the general strategy of taking out
> the newlines (where they're misplaced) and then putting them back in
> (where they should be). It was probably difficult to discern that just
> from the code.
>
> Thanks for your reply, but it doesn't really address the problem.

Not really an answer but why not use an html beautifier...?
http://www.w3.org/People/Raggett/tidy/

$ cat $FILE | tr "\n" " " | sed 's/ *></>\n</g'
<HTML>
<HEAD>
<TITLE >We've Lied to You…</TITLE>
<META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79">
<LINK REL="HOME" TITLE="Maximum RPM" HREF="index.html">
<LINK REL="UP" TITLE="Using RPM to Verify Installed Packages" HREF="ch-rpm-verify.html">
<LINK...

JD
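
For anyone trying the one-liner above on a non-GNU sed: as far as I know, "\n" in the replacement part of s/// is a GNU extension, and other seds may emit a literal "n" instead. Below is a minimal sketch of the same join-then-split idea with a backslash-newline in the replacement, which should be portable. The function name retag is hypothetical and not from the original post; the file is read by redirection rather than through cat.

```shell
# retag FILE  (hypothetical helper name, for illustration only)
# 1. tr joins the whole file into one line by turning newlines into spaces.
# 2. sed starts a new line wherever a tag close is directly followed by a
#    tag open, dropping any spaces that sat in front of the ">".
# The replacement uses backslash + literal newline instead of "\n".
retag() {
    tr '\n' ' ' < "$1" | sed 's/ *></>\
</g'
}
```

Usage would be `retag "$FILE"`. Note that sed still receives the entire file as a single line here, so this sketch does not help with the very-large-file concern raised in the quoted message.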