Questions tagged [pdf2json]

15 questions
3
votes
2 answers

Why do the PDF parsing libraries pdf2json and pdf-parse seem not to work with the Next.js App Router?

I've been trying to implement PDF parsing logic in my Next.js app. It seems the libraries pdf2json and pdf-parse don't work with the new Next.js App Router. Steps to reproduce: Run npx create-next-app@latest and follow the prompts, and say Yes to…
Andrew Luo
  • 31
  • 1
3
votes
1 answer

Calculating the length of text using fontsize (npm - pdf2json library)

I am using the pdf2json library to parse a PDF. It returns the parsed data as JSON, and I've attached some sample data. The main variables to note are: Height - the height of the PDF in PAGE_UNITS; Width - the width of the PDF in…
AmmarMZ
  • 45
  • 2
  • 5
2
votes
1 answer

SyntaxError: Unexpected token '(' when trying to run pdf2json

When I try to run pdf2json (without any parameters at all) I'm getting this error: /usr/lib/node_modules/pdf2json/lib/p2jcmd.js:63 #continue(callback, err) { ^ SyntaxError: Unexpected token '(' at wrapSafe…
neubert
  • 15,947
  • 24
  • 120
  • 212
1
vote
2 answers

Errors when trying to use pdf2json with TypeScript

When I try to import pdf2json (3.0.1) in my Node project (TypeScript), I get this error: Could not find a declaration file for module 'pdf2json'. I also tried to install @types/pdf2json, but it is not available. How can I solve…
Javier
  • 31
  • 4
1
vote
2 answers

How can I get PDF dimensions in pixels in Node.js?

I tried pdf2json: const PDFParser = require("pdf2json"); let pdfParser = new PDFParser(); pdfParser.loadPDF("./30x40.pdf"); // ex: ./abc.pdf pdfParser.on("pdfParser_dataReady", pdfData => { width = pdfData.formImage.Width; // pdf width height…
DINO
  • 191
  • 1
  • 12
1
vote
1 answer

How to integrate tabula-js in an Angular 9 app? Is there any other way to select specific parts of the rendered PDF and extract the data as JSON?

I tried installing the tabula-js library, but since it's a JS library I don't know how to integrate it into Angular, which uses TypeScript. If that isn't possible, is there any way to select specific parts of a rendered PDF document by coordinates and then use…
1
vote
1 answer

How to parse a PDF in nodejs

I am trying to parse a pdf and categorize information based on text formatting/decoration. How do you suggest I do that? For example, I have a pdf in which the structure is repeated: S.No. BOLD+UNDERLINED TITLE para How do I categorize this data…
Akshay Kumar
  • 875
  • 13
  • 29
1
vote
1 answer

read pdf in azure function using pdf2json

I successfully implemented pdf2json to fetch and read pdf from url using node. However, Azure function is an async function and finishes execution before pdfPipe.on("pdfParser_dataReady", pdf => {}) is executed. My implementation is as follows var…
Mayank Kumar Chaudhari
  • 16,027
  • 10
  • 55
  • 122
0
votes
2 answers

Is there a way to generate a PDF file out of a JSON file?

I'm using pdf2json (https://github.com/modesty/pdf2json) to convert some PDF files to JSON. I am modifying the JSON files (the ones generated by pdf2json) and then I would like to obtain the corresponding PDF files using the modified JSON files. Is…
0xracer
  • 9
  • 3
0
votes
0 answers

Error "pdf2json is not defined" when I add the import in Laravel

How can I extract data from a PDF and show it on a web page? The console already shows the error "pdf2json.js is not defined", and it is not working. What I tried: 1) added import pdf2json from "pdf2json"; to the app.js file; 2) used reader.onload =…
0
votes
0 answers

Extracting text from PDF file in React Native

I am trying to add a component that allows the user to upload a PDF file they have saved on their device, and then extract the text from it into the app. I have found many duplicates of this question, but none seem to have really been…
0
votes
1 answer

nodejs pdf parse getting value after specific string

My goal is to get a certain string after a predefined text. In this case I would like to read the following value: I found out this is possible using regex, so I tried this: const fs = require("fs"); const PDFParser =…
Dominik Hartl
  • 105
  • 1
  • 2
  • 8
0
votes
2 answers

How to synchronously handle files received from the frontend in a Node API

First, I apologize for my terrible English :D Hello, I have the following situation that has me puzzled: I have a frontend made in React and a backend in Node that receives requests via Express. The idea is that from the frontend I send a…
0
votes
0 answers

How to use async/await for events in pdf2json (pdfParser)

I am using the https://www.npmjs.com/package/pdf2json npm package, which picks up the PDF from the given path and, when the parser has finished parsing it, triggers the pdfParser_dataReady event. I want to use this along with async/await. const…
Rajeshwar
  • 2,290
  • 4
  • 31
  • 41
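
For the async/await question above, one common approach is to wrap the parser's events in a Promise. The sketch below is only an illustration, not the accepted answer or the library's own API: `waitForParse` and `demo` are hypothetical names, and a plain Node `EventEmitter` stands in for the pdf2json parser so the snippet runs without the package installed. The event names `pdfParser_dataReady` and `pdfParser_dataError` are the ones pdf2json emits.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical helper: resolves with the payload of the "ready" event,
// or rejects with the payload of the "error" event, whichever fires first.
function waitForParse(
  parser,
  readyEvent = "pdfParser_dataReady",
  errorEvent = "pdfParser_dataError"
) {
  return new Promise((resolve, reject) => {
    const onReady = (data) => {
      parser.removeListener(errorEvent, onError); // avoid leaking the other listener
      resolve(data);
    };
    const onError = (err) => {
      parser.removeListener(readyEvent, onReady);
      reject(err);
    };
    parser.once(readyEvent, onReady);
    parser.once(errorEvent, onError);
  });
}

// Demonstration with a stand-in emitter (assumption: the real parser
// behaves like an EventEmitter that fires one of the two events once).
async function demo() {
  const fakeParser = new EventEmitter();
  const pending = waitForParse(fakeParser); // attach listeners first
  setImmediate(() => fakeParser.emit("pdfParser_dataReady", { Pages: [] }));
  return await pending; // awaits the event like any other Promise
}
```

With the real library, the same pattern would presumably be: create the parser with `new PDFParser()`, call `waitForParse(parser)` before `parser.loadPDF(path)`, then `await` the returned Promise inside the async function. Attaching the listeners before starting the load keeps the event from firing unobserved.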