On Tue, 2006-04-11 at 06:55 -0700, Mike Stankovic wrote:
> I've got about 10,000 docs I'd like to devise a
> search/index for. I found a perl script called
> Perlfect that can do that on an old P3 but at the
> astronomical time of 7 hours. Another script(cgi/perl)
> at hotscripts can do the same but allows the "rm -rf
> /" exploit. DoH!?
>
> Is there anything perl/flatfile that can search/index
> faster? This is a nice job for an aging P3 in the
> corner so php/MySQL is not an option. Don't suggest
> beagle/windows solutions as this is a CentOS 4.3 system.

Well, at work we have an archive of about 12K PDFs that engineering
uses for process documentation, and I use Swish-e
(http://swish-e.org/) to index it so that they can search it. The
server it sits on is a PIII 733 with 512MB RAM, and a full re-index
takes about 90 minutes every night.

It works well for us: it supports AND and OR operators, phrase
searches and other fairly advanced features. The main limitation is
that you need a filter to convert each document, whatever its
format, to text, HTML or XML before it can be indexed.

Regards,
Paul Berger

> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> http://lists.centos.org/mailman/listinfo/centos
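
P.S. In case it helps to see what "index once, then search with AND, OR and phrases" means in practice, here is a rough Python sketch of a toy inverted index. It is only an illustration of the general idea, not how Swish-e is implemented, and every name in it (`tokenize`, `build_index`, `search_and`, `search_or`, `search_phrase`) is made up for the example. It also assumes the documents have already been converted to plain text by some filter.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build {word: {doc_id: [positions]}} from a dict of doc_id -> plain text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def search_and(index, words):
    """Return the ids of documents that contain every word."""
    sets = [set(index.get(w.lower(), {})) for w in words]
    return set.intersection(*sets) if sets else set()


def search_or(index, words):
    """Return the ids of documents that contain at least one word."""
    sets = [set(index.get(w.lower(), {})) for w in words]
    return set.union(*sets) if sets else set()


def search_phrase(index, phrase):
    """Return the ids of documents where the words appear consecutively."""
    words = tokenize(phrase)
    hits = set()
    # Only documents containing all the words can contain the phrase.
    for doc_id in search_and(index, words):
        starts = index[words[0]][doc_id]
        # The phrase matches if word i sits at position start + i.
        if any(
            all(start + i in index[w][doc_id] for i, w in enumerate(words))
            for start in starts
        ):
            hits.add(doc_id)
    return hits


# Hypothetical sample documents, standing in for filtered PDF text.
docs = {
    "weld.txt": "Spot weld process for steel brackets",
    "paint.txt": "Paint process: steel primer then top coat",
    "pack.txt": "Packing list for brackets",
}
index = build_index(docs)
print(sorted(search_and(index, ["steel", "process"])))
print(sorted(search_or(index, ["weld", "packing"])))
print(sorted(search_phrase(index, "weld process")))
```

The point of the design is that the expensive pass over the documents happens once, at index time, and each query is then just a few dictionary lookups and set operations. A real indexer also has to store the index on disk and handle stemming, stop words and so on, none of which this sketch attempts.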