In ColdFusion 2016, I am trying to index a large file that contains more than 100000 characters.

I am getting the error below:

org.apache.tika.sax.WriteOutContentHandler$WriteLimitReachedException: Your document contained more than 100000 characters, and so your requested limit has been reached. To receive the full text of the document, increase your limit. (Text up to the limit is however available).

Where can I increase the limit from 100000 to 10000000?

 <cfindex
        action="update"
        type="file"
        collection="#collection_name#"
        recurse="yes"
        key="#key#"
        urlpath="#urlpath#"
        status="insert" />
 <cfdump var="#insert#">
  • Can you share the code for the Apache Tika implementation? – rrk May 10 '19 at 11:23
  • There are two Tika jar files in the cfusion\lib folder: tika-core.jar and tika-parsers.jar. – Rojin Terrance May 10 '19 at 11:42
  • How is it connected to CF admin collections? I am more familiar with Solr. – rrk May 10 '19 at 11:45
  • @RojinTerrance Have you tried increasing the heap size in the ColdFusion Administrator (CFIDE)? The maximum JVM heap size is under Server Settings -> Java and JVM. – Sathish Chelladurai May 10 '19 at 16:09
  • I'm afraid you are out of luck here. The buffer used to parse documents via Tika (`org.apache.tika.sax.BodyContentHandler`) is hardcoded within `SolrUtils` in ACF. Your only option is to speak to Solr directly (without using `cfindex`). – Alex May 10 '19 at 18:47
  • @Alex - You should post that as an answer. – SOS May 24 '19 at 01:05
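
A rough sketch of what "speaking to Solr directly" might look like, untested and resting on several assumptions. It calls Tika itself with `BodyContentHandler(-1)`, which disables the write limit (the no-argument constructor defaults to 100000 characters), and then posts the extracted text to Solr's JSON update handler. The file path, the Solr URL and port, and the field names `uid` and `contents` are placeholders, not values taken from the question; they would have to match the actual collection and its schema. It also assumes the Tika jars in cfusion\lib can be loaded with `createObject`.

```cfm
<cfscript>
    // Hypothetical input file; replace with the document to index.
    filePath = "C:\docs\large-file.pdf";

    // A write limit of -1 tells Tika's BodyContentHandler not to cap the extracted text.
    handler  = createObject("java", "org.apache.tika.sax.BodyContentHandler").init(javaCast("int", -1));
    parser   = createObject("java", "org.apache.tika.parser.AutoDetectParser").init();
    metadata = createObject("java", "org.apache.tika.metadata.Metadata").init();
    context  = createObject("java", "org.apache.tika.parser.ParseContext").init();
    stream   = createObject("java", "java.io.FileInputStream").init(filePath);

    try {
        // Extract the plain text of the document into the handler.
        parser.parse(stream, handler, metadata, context);
    } finally {
        stream.close();
    }

    // Assumed field names; check the collection's schema.xml for the real ones.
    doc = {};
    doc["uid"]      = key;
    doc["contents"] = handler.toString();
    payload = serializeJSON([doc]);
</cfscript>

<!--- Assumed Solr location; adjust host, port, and collection name. --->
<cfhttp method="post"
        url="http://localhost:8985/solr/#collection_name#/update?commit=true"
        result="solrResponse">
    <cfhttpparam type="header" name="Content-Type" value="application/json">
    <cfhttpparam type="body" value="#payload#">
</cfhttp>
<cfdump var="#solrResponse#">
```

Documents added this way bypass `cfindex` entirely, so fields that ColdFusion normally populates for `cfsearch` (title, URL, and so on) would need to be set by hand in the JSON document. With no write limit, very large files may also need more JVM heap.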

0 Answers