Hi,

My grep regex-fu is not very good and googling is getting me nowhere, so hopefully someone is kind enough to give me some pointers.

Goal: grep the (non-.dbg) filenames and versions from an FTP directory listing and a raw HTML file:

$ wget --no-remove-listing -O ftp-index.txt ftp://127.0.0.1/test/
$ wget --no-remove-listing -O index.html http://127.0.0.1/test/

The relevant parts of the files above (the first four lines are from the FTP listing, the last line is from the HTML file; both were copied to test_regex.txt) are:

2011 Jan 28 21:25 File <a href="ftp://127.0.0.1/bar-4.5.6.i686.dbg.tgz">bar-4.5.6.i686.dbg.tgz</a> (5551274 bytes)
2011 Jan 28 21:25 File <a href="ftp://127.0.0.1/bar-4.5.6.i686.tgz">bar-4.5.6.i686.tgz</a> (5551274 bytes)
2011 Jan 28 21:25 File <a href="ftp://127.0.0.1/bar-4.5.6.x86_64.dbg.tgz">bar-4.5.6.x86_64.dbg.tgz</a> (5551274 bytes)
2011 Jan 28 21:25 File <a href="ftp://127.0.0.1/bar-4.5.6.x86_64.tgz">bar-4.5.6.x86_64.tgz</a> (5551274 bytes)
<tr><td><a href="foo-bar-1.2.3+1.2.3.tar.gz">foo-bar-1.2.3+1.2.3.tar.gz</td></tr>

This is what I have now (improvements most welcome):

$ egrep -o ">([A-Za-z_-]+)([[:digit:]]{1,3}(\.[[:digit:]]{1,3})*).+(.|t)gz" ./test_regex.txt | grep -v ".dbg" | tr -d '>'

Output:

foo-bar-1.2.3+1.2.3.tar.gz
bar-4.5.6.i686.tgz
bar-4.5.6.x86_64.tgz

So far so good, but now I also want to get the version numbers, and I can't figure out how. Does anyone have a pointer on how to extract the version number from these filenames (1.2.3+1.2.3 and 4.5.6)?

Thanks!
Patrick
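
For illustration, one possible way to pull the version out of each filename is a sed substitution with a capture group. This is only a sketch: the function name extract_versions is made up for this example, and it assumes every filename has the shape <name>-<version>.<rest>, where the version is dot-separated digits with an optional "+" followed by more dot-separated digits. It also assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post): reads filenames on
# stdin, one per line, and prints "filename version" for each line that
# matches. Lines that do not match the assumed shape are dropped (-n ... p).
extract_versions() {
    # \1 = whole filename, \2 = version, e.g. 4.5.6 or 1.2.3+1.2.3
    sed -E -n 's/^([A-Za-z_-]+-([0-9]+(\.[0-9]+)*(\+[0-9]+(\.[0-9]+)*)?)\..*)$/\1 \2/p'
}

# Sample input: the non-.dbg filenames from the listing.
printf '%s\n' \
    'foo-bar-1.2.3+1.2.3.tar.gz' \
    'bar-4.5.6.i686.tgz' \
    'bar-4.5.6.x86_64.tgz' | extract_versions
```

In practice the function would sit at the end of the existing pipeline, after the tr -d '>' step. The version stops at the first dot that is not followed by a digit, which is why ".i686" and ".x86_64" are left out; an architecture suffix that starts with a digit would need a stricter pattern.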