I am trying to use Tika in Python to parse PDF files. I am using Python 2.7 on a Mac, and I cannot get it to work. After installing Tika, I ran:
from tika import parser
raw = parser.from_file('...file')
I get this error (edited for brevity):
Retrieving http://search.maven.org/remotecontent ... to /var/folders/...
[MainThread ] [INFO ] Retrieving http:// ...
[MainThread ] [WARNI] Failed to see startup log message; retrying...
...
2019-04-08 14:53:05,910 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2019-04-08 14:53:05,916 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
My question is very similar to "Use tika with python, runtimeerror: unable to start tika server". The top answer there does not work for me, though. I have installed Java 8, but the error persists. What should I do?
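Since tika-python starts the Tika server by launching a `java` process, one thing worth ruling out is whether the Java install is actually visible to the same Python process. The following is only a minimal diagnostic sketch, not a fix: the helper name `java_available` is hypothetical and not part of Tika, and it assumes the server would be started with the plain `java` command found on `PATH`.

```python
from __future__ import print_function

import subprocess


def java_available(java_cmd="java"):
    """Return True if `java_cmd -version` runs successfully from this process.

    `java_available` is a hypothetical helper for illustration only;
    it is not part of the tika package.
    """
    try:
        # `java -version` writes to stderr, so merge stderr into the output.
        output = subprocess.check_output(
            [java_cmd, "-version"], stderr=subprocess.STDOUT
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError: executable not found; CalledProcessError: non-zero exit.
        return False
    # Show which Java version this Python process actually sees.
    print(output.decode("utf-8", "replace"))
    return True
```

Calling `java_available()` from the same interpreter that runs the Tika code shows whether `java` can be launched at all and which version it reports. If it returns `False`, the problem is likely with the Java setup rather than with Tika itself.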