I'm wondering if there are command-line tools like antiword and docx2txt for Microsoft PowerPoint files (.ppt and .pptx). The idea is to extract text from PowerPoint files. Sorry this isn't exactly about CentOS, but I'd really like it if Yum had something. I tried xlhtml, but it hasn't been updated in a while and doesn't really work on CentOS 5.
JohnStanley Writes:
If you're pretty slick at Python, I know for a fact there is a Python RTF (Rich Text Format) library for extracting text from RTF files. So if you look hard enough, there is probably a similar one on the net that someone has written for PowerPoint. Google even has an RTF library for Python. As a side note, .NET offers Office tools that do the very thing you want.
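For the .pptx half of the question, a library may not even be needed. A .pptx file is a zip archive of XML parts, with each slide stored as ppt/slides/slideN.xml and the visible text held in DrawingML `a:t` elements. Here is a rough sketch using only the standard library. It assumes a reasonably modern Python (3.x, or at least 2.7), which is likely newer than what CentOS 5 ships by default, and the function name `pptx_text` is just made up for illustration. It does not handle the older binary .ppt format, and it skips speaker notes.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs live in <a:t>, paragraphs in <a:p>.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
# Slide parts inside the archive are named ppt/slides/slide1.xml, slide2.xml, ...
SLIDE_RE = re.compile(r"^ppt/slides/slide(\d+)\.xml$")


def pptx_text(path):
    """Return [(slide_number, [paragraph, ...]), ...] sorted by slide number.

    Hypothetical helper for illustration, not part of any existing tool.
    """
    slides = []
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            match = SLIDE_RE.match(name)
            if not match:
                continue  # skip layouts, masters, notes, media, etc.
            root = ET.fromstring(zf.read(name))
            paragraphs = []
            for para in root.iter(A_NS + "p"):
                # Join all text runs that belong to one paragraph.
                text = "".join(run.text or "" for run in para.iter(A_NS + "t"))
                if text.strip():
                    paragraphs.append(text)
            slides.append((int(match.group(1)), paragraphs))
    slides.sort()  # numeric order, so slide10 comes after slide2
    return slides


if __name__ == "__main__" and len(sys.argv) > 1:
    for number, paragraphs in pptx_text(sys.argv[1]):
        print("--- slide %d ---" % number)
        for line in paragraphs:
            print(line)
```

Saved as, say, pptx2txt.py (again, a made-up name), it would be run as `python pptx2txt.py deck.pptx` and print the text of each slide in order. Note that slides are ordered by file number here, which usually matches the presentation order but is not guaranteed to.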