In the newest Tika (2.5), the default OCR timeout is 300 seconds (300000 ms). That is not enough when multiple documents or images are processed in parallel with OCR: the OCR step times out, and Tika then throws an exception for the whole document.

I've tried adding the X-Tika-Timeout-Millis header, but it cannot extend the timeout beyond the value set on the server. How can I increase it?

Kate

1 Answer

Try running Tika with this parameter set in tika-config.xml, like so:

    <properties>
      <server>
        <params>
          <!-- maximum time to allow per parse before shutting down and restarting
               the forked parser. Not allowed if nofork=true. -->
          <taskTimeoutMillis>4000000</taskTimeoutMillis>
        </params>
      </server>
    </properties>
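
Once the server-side limit is raised, a client can ask for a longer per-request timeout through the X-Tika-Timeout-Millis header mentioned in the question, as long as the value stays within taskTimeoutMillis. Below is a minimal Python sketch of such a request, not a definitive client. It assumes the server listens on http://localhost:9998 and that documents are sent with PUT to /tika; the helper name build_tika_request and the file name scan.png are hypothetical.

```python
import urllib.request

# Assumption: tika-server runs locally on its usual port; adjust as needed.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data: bytes, timeout_millis: int,
                       url: str = TIKA_URL) -> urllib.request.Request:
    """Build (but do not send) a PUT request that asks Tika for a
    per-request timeout via the X-Tika-Timeout-Millis header.

    build_tika_request is a hypothetical helper, not part of any Tika client.
    """
    headers = {
        "Accept": "text/plain",
        # Must not exceed the taskTimeoutMillis configured on the server.
        "X-Tika-Timeout-Millis": str(timeout_millis),
    }
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


if __name__ == "__main__":
    timeout_millis = 600_000  # 10 minutes, below the 4000000 ms server limit
    with open("scan.png", "rb") as f:  # hypothetical input file
        req = build_tika_request(f.read(), timeout_millis)
    # Give the HTTP client a little more time than Tika itself gets.
    with urllib.request.urlopen(req, timeout=timeout_millis / 1000 + 30) as resp:
        print(resp.read().decode("utf-8"))
```

The client-side timeout is set slightly above the Tika timeout so that a slow OCR job is reported by the server rather than cut off by the HTTP client first.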