Thu Dec 30 16:01:24 UTC 2010
John Doe <jdmls at yahoo.com>

From: ken <gebser at mousecar.com>

> Some files this script would need to process could very well be
> ridiculously huge, which is why I chose to process line-by-line.
> Secondly, yes, I was already using the general strategy of taking out
> the newlines (where they're misplaced) and then putting them back in
> (where they should be). It was probably difficult to discern that just
> from the code.
> Thanks for your reply, but it doesn't really address the problem.

Not really an answer, but why not use an HTML beautifier?

$ tr "\n" " " < "$FILE" | sed 's/ *></>\n</g'
<TITLE >We've Lied to You…</TITLE>
<META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79">
<LINK REL="HOME" TITLE="Maximum RPM" HREF="index.html">
<LINK REL="UP" TITLE="Using RPM to Verify Installed Packages"