How to configure Apache Tika with Apache Solr 1.4.1

Question

I want to index a large number of PDF documents.

I have found a reference showing that this can be done using Apache Tika, but unfortunately I cannot find any reference that describes how to configure Apache Tika in Solr 1.4.1.

Once I have it configured, how can I send documents to Solr directly without using curl?

I am using SolrNet for indexing.

score 5 · Accepted Answer · answered Oct 05 '10 at 13:12

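See ExtractingRequestHandler (also known as Solr Cell), the Solr request handler that uses Apache Tika to extract text from documents such as PDFs.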

Pascal Dimassimo

score 3 · Answer 2 · answered Oct 05 '10 at 14:08

Support for ExtractingRequestHandler in SolrNet is not yet complete. You can either finish implementing it, or work around it and craft your own HttpWebRequests.
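To illustrate the "craft your own request" route, here is a minimal sketch in Python rather than C#, so treat it as a shape to translate into HttpWebRequest and not as SolrNet code. It assumes the handler is registered at /update/extract (the name used in Solr's example solrconfig.xml; yours may differ) and that the schema's unique key field is called id. The function names and the example base URL are hypothetical.

```python
import urllib.parse
import urllib.request


def build_extract_request(solr_base_url, pdf_bytes, doc_id, commit=True):
    """Build (but do not send) a POST for Solr's ExtractingRequestHandler.

    Assumptions: the handler is mapped to /update/extract and the
    unique key field in schema.xml is named "id".
    """
    # literal.<field> sets a field value directly on the extracted document.
    params = {"literal.id": doc_id}
    if commit:
        # Ask Solr to commit right after indexing this document.
        params["commit"] = "true"
    url = (
        solr_base_url.rstrip("/")
        + "/update/extract?"
        + urllib.parse.urlencode(params)
    )
    # The raw PDF bytes are the request body; Tika parses them server-side.
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )


def post_pdf(solr_base_url, path, doc_id):
    """Read a PDF from disk, send it to Solr, and return the HTTP status."""
    with open(path, "rb") as f:
        request = build_extract_request(solr_base_url, f.read(), doc_id)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call such as post_pdf("http://localhost:8983/solr", "report.pdf", "doc1") would then index a single file. For a large batch it is probably better to pass commit=False per document and commit once at the end.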

Mauricio Scheffer
