I am writing a custom filter in Solr to post a token to Apache Stanbol for enhancement and index the response to a different field in the same document.

In the test code below, I retrieve the Stanbol response and add it to Solr as a new document. What I actually need is to add the stanbolResponse as a field value on the same document that is being indexed. I think this could be done if I could retrieve the document ID from the TokenStream in the filter.

Can anyone point me to sample code or a link that shows how to achieve this?

public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;
    }

    // Copy only the valid part of the term buffer; the backing array may be longer than the term.
    int length = charTermAttr.length();
    char[] buffer = charTermAttr.buffer();
    String content = new String(buffer, 0, length);
    Client client = Client.create();
    WebResource webResource = client.resource(stanbol_endpoint + "enhancer");
    // POST the term text to the Stanbol enhancer and ask for RDF/XML back.
    ClientResponse response = webResource
        .type(MediaType.TEXT_PLAIN)
        .accept(new MediaType("application", "rdf+xml"))
        .entity(content, MediaType.TEXT_PLAIN)
        .post(ClientResponse.class);

    int status = response.getStatus();
    if (status != 200 && status != 201 && status != 202) {
        throw new RuntimeException("Failed : HTTP error code : "
             + response.getStatus());
    }

    String output = response.getEntity(String.class);
    charTermAttr.setEmpty();
    char[] newBuffer = output.toCharArray();
    charTermAttr.copyBuffer(newBuffer, 0, newBuffer.length);

    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField( "id", "id1", 1.0f );
    doc1.addField("stanbolResponse", output);
    try {
        server.add(doc1);
        server.commit();
    } catch (SolrServerException e) {
        System.out.println("error while indexing response to solr");
        e.printStackTrace();
    }
    return true;
}
cheffe
Dileepa Jayakody
  • Perhaps I misunderstand your case, but can't you simply create an UpdateRequestProcessor that runs prior to analysis? You can do whatever you want in your processor, then add the result to the document and pass it through the normal analysis chain. – lexk Nov 13 '13 at 19:42
  • Yep Lexk, this use case was successfully covered by writing an UpdateRequestProcessor. – Dileepa Jayakody Nov 20 '13 at 06:19

1 Answer

This use case was successfully covered by writing a custom UpdateRequestProcessor and configuring the /update request handler to use it through the update.chain parameter.

I was able to process and add new fields to the document prior to indexing. Below is how I configured my /update request handler with my custom processor.

Update request processor chain for the Stanbol processing:

<updateRequestProcessorChain name="stanbolInterceptor">
    <processor class="com.solr.stanbol.processor.StanbolContentProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Configure the request handler to use the above chain as its update.chain:

<requestHandler name="/update" class="solr.UpdateRequestHandler">
       <lst name="defaults">
         <str name="update.chain">stanbolInterceptor</str>
       </lst>
</requestHandler>
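
The processor class itself is not shown above. As a rough sketch of the part it would need, the Stanbol call from the question can be written with only the JDK HTTP client (Java 11+). The class name StanbolEnhancerClient and its method enhance are hypothetical, not part of Solr or Stanbol, and the endpoint is assumed to be a base URL ending in "/" with the enhancer at "enhancer", as in the question's code.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not a Solr or Stanbol class) that posts text to the Stanbol enhancer. */
final class StanbolEnhancerClient {
    // One client instance is reused for all requests instead of creating one per call.
    private final HttpClient http = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_1_1)
            .build();
    private final URI enhancerUri;

    /** Assumes stanbolEndpoint is a base URL ending in "/", like stanbol_endpoint in the question. */
    StanbolEnhancerClient(String stanbolEndpoint) {
        this.enhancerUri = URI.create(stanbolEndpoint + "enhancer");
    }

    /** Sends content as text/plain and returns the response body (RDF/XML is requested). */
    String enhance(String content) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(enhancerUri)
                .header("Content-Type", "text/plain; charset=UTF-8")
                .header("Accept", "application/rdf+xml")
                .POST(HttpRequest.BodyPublishers.ofString(content, StandardCharsets.UTF_8))
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        int status = response.statusCode();
        // Same status check as the question's code: accept 200, 201 and 202 only.
        if (status != 200 && status != 201 && status != 202) {
            throw new IOException("Stanbol enhancer failed: HTTP error code " + status);
        }
        return response.body();
    }
}
```

Inside the custom processor's processAdd(AddUpdateCommand cmd), one could then read the source text from cmd.getSolrInputDocument(), call enhance on it, add the result with addField("stanbolResponse", ...), and finally call super.processAdd(cmd) so the document continues down the chain. Which field holds the source text is an assumption that depends on your schema.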
Dileepa Jayakody