I'm trying to convert a curl command, which parses a PDF file through a GROBID server, into a requests call in Python.
Basically, if I start the GROBID server as follows:
./gradlew run
I can use the following curl command to get the parsed XML for an academic paper, example.pdf:
curl -v --form input=@example.pdf localhost:8070/api/processHeaderDocument
However, I don't know how to convert this command into Python. Here is my attempt using requests:
import requests

GROBID_URL = 'http://localhost:8070'
url = '%s/processHeaderDocument' % GROBID_URL
pdf = 'example.pdf'
xml = requests.post(url, files=[pdf]).text
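
Comparing my attempt with the curl command, two things stand out: the curl URL includes an /api prefix that my URL lacks, and curl sends the file under the form field name input, while I only pass a bare filename. Below is a minimal sketch of what I would expect a closer translation to look like, assuming that requests builds the same multipart form when given a files dictionary keyed by the field name. The helper names header_endpoint and parse_header are hypothetical, introduced only for illustration, and I have not confirmed this against every GROBID version.

```python
import requests

# Assumption: same host and port as in the curl command above.
GROBID_URL = 'http://localhost:8070'


def header_endpoint(base_url=GROBID_URL):
    """Build the endpoint URL, keeping the /api prefix used by the curl call."""
    return '%s/api/processHeaderDocument' % base_url


def parse_header(pdf_path, base_url=GROBID_URL):
    """POST the PDF as the multipart form field 'input' and return the response body.

    Hypothetical helper mirroring `curl --form input=@example.pdf`.
    """
    # Open in binary mode so the PDF bytes are sent unchanged.
    with open(pdf_path, 'rb') as pdf_file:
        response = requests.post(header_endpoint(base_url),
                                 files={'input': pdf_file})
    # Raise an exception on 4xx/5xx instead of silently returning an error page.
    response.raise_for_status()
    return response.text
```

With the server running, calling parse_header('example.pdf') should return the same XML text that the curl command prints, if the assumptions above hold.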