[CentOS] OT: .doc,.xls,.pdf,.ppt (etc.) string parser/indexers
Rajagopal Swaminathan
raju.rajsand at gmail.com
Sun Aug 30 03:36:44 UTC 2009
Greetings,

On Fri, Aug 28, 2009 at 10:50 PM, Les Mikesell <lesmikesell at gmail.com> wrote:
> Does anyone have experience with linux tools to parse the text from
> common non-text file formats for searching? I'm trying to use the
> kinosearch add-on for twiki which is fine as far as the search goes, but
> it takes forever to generate the index.

I am not sure this answers your query to the point, but I have seen the
Lucene.Net SDK (with extensions to scour .doc, .odt, .pdf etc.) used to
very good effect, with pretty decent performance.

HTH

Thanks and Regards

Rajagopal