I am developing a full-text search engine for indexing popular binary document formats. I know there are already hundreds of similar questions (and solutions), but I have found it hard to find one that:
- is cross-platform
- supports the DOC, DOCX, and PDF formats at once
- is easy to use with Python
- can be set up on a major shared host