Questions tagged [grobid]

Machine learning library for extracting structured information from scholarly documents

See https://github.com/kermitt2/grobid

17 questions
3
votes
2 answers

Installing JVM 8 on Mac with M1 chip

I'm trying to install GROBID, and it requires JVM 8. Will installing a JDK work for this on an M1 Mac?
rami_salazar
  • 165
  • 1
  • 7
3
votes
2 answers

Parse Grobid .tei.xml output with Beautiful Soup

I am trying to use Beautiful Soup to extract elements from a .tei.xml file that was generated using Grobid. I can get title(s) using: titles = soup.findAll('title') What is the correct syntax to access the 'lower level' elements? (Author /…
Dave Ja Vu
  • 43
  • 3
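
A minimal sketch of reaching the "lower level" elements asked about above. It deliberately swaps in the standard-library `xml.etree.ElementTree` for Beautiful Soup so it runs without extra packages. The sample document is hand-written to mimic a typical GROBID header layout (an assumption, not real GROBID output), and `extract_header` and `SAMPLE_TEI` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# TEI documents use this XML namespace; element lookups must be qualified with it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written sample imitating a GROBID header (assumption: real output is larger).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title level="a" type="main">A Sample Paper</title></titleStmt>
      <sourceDesc>
        <biblStruct>
          <analytic>
            <author><persName><forename type="first">Ada</forename><surname>Lovelace</surname></persName></author>
            <author><persName><forename type="first">Alan</forename><surname>Turing</surname></persName></author>
          </analytic>
        </biblStruct>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def extract_header(tei_xml):
    """Return the main title and author names found in the TEI header."""
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", TEI_NS)
    if header is None:
        return {"title": None, "authors": []}

    title_el = header.find(".//tei:titleStmt/tei:title", TEI_NS)
    title = title_el.text if title_el is not None else None

    authors = []
    # Restrict the search to <analytic> so cited authors in the bibliography are skipped.
    for pers in header.findall(".//tei:analytic/tei:author/tei:persName", TEI_NS):
        parts = [el.text for el in pers.findall("tei:forename", TEI_NS) if el.text]
        surname = pers.find("tei:surname", TEI_NS)
        if surname is not None and surname.text:
            parts.append(surname.text)
        if parts:
            authors.append(" ".join(parts))

    return {"title": title, "authors": authors}


result = extract_header(SAMPLE_TEI)
print(result["title"])
print(result["authors"])
```

The same idea should carry over to Beautiful Soup: narrow the search to the header's `analytic` element first, then look for `author` and `persName` inside it, rather than searching the whole document for every `title` or `author`.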
3
votes
3 answers

Maven: Could not resolve dependency

I am a beginner with Maven. I tried to add Grobid (for PDF parsing) as a Maven dependency. The dependency I gave is: org.grobid grobid-core 0.3.4 But on…
Naima
  • 101
  • 1
  • 4
  • 9
2
votes
1 answer

Parsing TEI-XML with beautiful soup

I am trying to parse metadata from a GROBID output (parsing academic papers in PDF format). The references look like this The raw TEI-XML file looks like this (read via soup = read_tei('paper1.tei.xml'))
2
votes
2 answers

Running Grobid on Windows 64 bit

I am trying to run GROBID on 64-bit Windows. There is no 64-bit version of the library for Windows (at least I could not find one). It runs on 64-bit Linux with a 64-bit JRE and on 32-bit Windows with a 32-bit JRE. So the version of JRE is not the…
JHS
  • 7,761
  • 2
  • 29
  • 53
1
vote
1 answer

Tika with Grobid throwing error when parsing pdf document

I am trying to extract both document metadata and journal header metadata from a pdf document. I verified that Tika Server (v1.21 / v1.24) and Grobid (v0.6.0) are independently able to extract metadata from the pdf document. However, when I run…
1
vote
1 answer

I want to install Maven to use a software called GROBID

I tried to install Maven on Windows 10 following https://maven.apache.org/install.html in order to install the software GROBID_NER https://grobid-ner.readthedocs.io/en/latest/build-and-install/ but unfortunately I ran into this error. Can anyone tell me…
DevML
  • 320
  • 1
  • 3
  • 15
1
vote
1 answer

Getting the error below when using Gradle with Grobid

I am trying to install Grobid on 64-bit Ubuntu, following https://grobid.readthedocs.io/en/latest/Install-Grobid/ $/GROBID_LATEST_0.5.1/grobid-0.5.1$ ./gradlew clean install FAILURE: Build failed with an exception. * What went…
rajeshkumargp
  • 105
  • 1
  • 11
1
vote
2 answers

Convert Grobid curl command to requests in Python

I'm trying to convert a curl command that parses a PDF file with a Grobid server into a requests call in Python. Basically, if I run the Grobid server as follows, ./gradlew run I can use the following curl to get the output of parsed XML of an academic paper…
titipata
  • 5,321
  • 3
  • 35
  • 59
1
vote
1 answer

Grobid returning 500 type error

I am trying to use Grobid, which is built on my local machine, but this script prints a 500 error, whereas it works fine when I do it from the CLI using curl. Help please! import requests url =…
fatah
  • 573
  • 3
  • 9
  • 16
1
vote
1 answer

Edit python configuration file to contain Grobid path

I have installed RobotReviewer from GitHub, following all the steps. I'm at one of the last steps, where it asks "Edit the robotreviewer/config.py file to contain the path to the directory where you have installed Grobid. (RobotReviewer will…
GGA
  • 385
  • 5
  • 22
1
vote
1 answer

Integrating grobid with tika and solr

I'm using Solr to index journal articles. Using the out-of-the-box configuration, it indexed the text of the documents, but I'm looking to use Grobid to pull out the authors, title, affiliations, etc. I got grobid up and running as a service. I…
betseyb
  • 1,302
  • 2
  • 18
  • 37
0
votes
1 answer

GROBID Python client in Docker: How do I resolve './config.json' not found error?

I am trying to use GROBID installed via Docker to convert PDFs into XML files. I am using the guide from https://github.com/kermitt2/grobid_client_python. After writing this code in the WSL command prompt (with my proper directories): grobid_client…
MDutta
  • 1
0
votes
1 answer

Disable access logs for GROBID

I'm using GROBID as a Docker container. The default logging config is way too noisy for large-scale production use, so I built a custom image based on the 0.7.2 version with just the grobid.yaml file replaced. The logging section of that file looks…
Achim
  • 15,415
  • 15
  • 80
  • 144
0
votes
0 answers

Reaching a static folder in a web app with maven & tomcat

As a newbie to developing web applications, I am developing an app (Java 8, Maven, Tomcat, Windows 10) using Grobid. To be able to use the Grobid resources, I need to reach the grobid-home folder (see the capture showing that it is under the resources folder) as…
mlee_jordan
  • 772
  • 4
  • 18
  • 50