ML library for extracting scholarly texts
Questions tagged [grobid]
17 questions
3
votes
2 answers
Installing JVM 8 on Mac with M1 chip
I'm trying to install GROBID, and it requires JVM 8. Will installing the JDK work for this on an M1 Mac?

rami_salazar
- 165
- 1
- 7
3
votes
2 answers
Parse Grobid .tei.xml output with Beautiful Soup
I am trying to use Beautiful Soup to extract elements from a .tei.xml file that was generated using Grobid.
I can get title(s) using:
titles = soup.findAll('title')
What is the correct syntax to access the 'lower level' elements? (Author /…

Dave Ja Vu
- 43
- 3
3
votes
3 answers
Maven: Could not resolve dependency
I am a beginner with Maven. I tried to add Grobid (for PDF parsing) to my Maven project. The dependency I gave is:
groupId: org.grobid
artifactId: grobid-core
version: 0.3.4
But on…

Naima
- 101
- 1
- 4
- 9
2
votes
1 answer
Parsing TEI-XML with Beautiful Soup
I am trying to parse metadata from GROBID output (parsing academic papers in PDF format). The references look like this:
The raw TEI-XML file looks like this (read via soup = read_tei('paper1.tei.xml')):

keeran_q789
- 81
- 5
2
votes
2 answers
Running Grobid on 64-bit Windows
I am trying to run GROBID on 64-bit Windows.
There is no 64-bit version of the library for Windows (at least, I could not find one). It runs on 64-bit Linux with a 64-bit JRE and on 32-bit Windows with a 32-bit JRE. So the version of JRE is not the…

JHS
- 7,761
- 2
- 29
- 53
1
vote
1 answer
Tika with Grobid throwing error when parsing PDF document
I am trying to extract both document metadata and journal header metadata from a PDF document. I verified that Tika Server (v1.21 / v1.24) and Grobid (v0.6.0) are each able to extract metadata from the PDF document independently. However, when I run…

Subramanyam Avlur
- 11
- 1
1
vote
1 answer
I want to install Maven to use software called GROBID
I tried to install Maven on Windows 10 following
https://maven.apache.org/install.html
in order to install GROBID_NER
https://grobid-ner.readthedocs.io/en/latest/build-and-install/
Unfortunately, I ran into this error. Can anyone tell me…

DevML
- 320
- 1
- 3
- 15
1
vote
1 answer
Error when using Gradle to build Grobid
I am trying to install Grobid on 64-bit Ubuntu.
I followed the instructions at
https://grobid.readthedocs.io/en/latest/Install-Grobid/
$/GROBID_LATEST_0.5.1/grobid-0.5.1$ ./gradlew clean install
FAILURE: Build failed with an exception.
* What went…

rajeshkumargp
- 105
- 1
- 11
1
vote
2 answers
Convert Grobid curl command to requests in Python
I'm trying to convert a curl command that parses a PDF file via a Grobid server into a call with requests in Python.
Basically, if I run the grobid server as follows,
./gradlew run
I can use the following curl to get the output of parsed XML of an academic paper…

titipata
- 5,321
- 3
- 35
- 59
1
vote
1 answer
Grobid returning 500 error
I am trying to use Grobid, which I built on my local machine, but this script prints a 500 error, whereas it works fine when I do it from the CLI using curl. Help please!
import requests
url =…

fatah
- 573
- 3
- 9
- 16
1
vote
1 answer
Edit python configuration file to contain Grobid path
I have installed RobotReviewer from GitHub, following all the steps. I'm at one of the last steps, where it asks
"Edit the robotreviewer/config.py file to contain the path to the directory where you have installed Grobid. (RobotReviewer will…

GGA
- 385
- 5
- 22
1
vote
1 answer
Integrating grobid with tika and solr
I'm using Solr to index journal articles. Using the out-of-the-box configuration, it indexed the text of the documents, but I'm looking to use Grobid to pull out the authors, title, affiliations, etc. I got grobid up and running as a service.
I…

betseyb
- 1,302
- 2
- 18
- 37
0
votes
1 answer
GROBID Python client in Docker: How do I resolve './config.json' not found error?
I am trying to use GROBID installed via Docker to convert PDFs into XML files. I am using the guide from https://github.com/kermitt2/grobid_client_python.
After writing this code in the WSL command prompt (with my proper directories):
grobid_client…

MDutta
- 1
0
votes
1 answer
Disable access logs for GROBID
I'm using GROBID as a Docker container. The default logging config is way too noisy for large-scale production use, so I built a custom image based on the 0.7.2 version with just the grobid.yaml file replaced. The logging section of that file looks…

Achim
- 15,415
- 15
- 80
- 144
0
votes
0 answers
Reaching a static folder in a web app with maven & tomcat
As a newbie to developing web applications, I am developing an app (Java 8, Maven, Tomcat, Windows 10) using Grobid.
To be able to use the Grobid resources, I need to reach the grobid-home folder (see the screenshot showing that it is under the resources folder) as…

mlee_jordan
- 772
- 4
- 18
- 50