16

I'm developing an application that needs to copy a sentence or paragraph from a PDF file into my program. I'm writing it in JavaScript, but I haven't found a way to read a PDF file.

How can I copy a sentence or paragraph from a PDF file into my program?

Thanks.

Christian Eric Paran

2 Answers

16

I know the question is old, but if you find PDF.js too complex for the job, try pdfreader (npm install pdfreader). I wrote that module.

It would take 5 lines of code to extract text from your PDF file:

const { PdfReader } = require("pdfreader");
new PdfReader().parseFileItems("sample.pdf", (err, item) => {
  if (err) console.error(err); // parsing failed
  else if (item && item.text) console.log(item.text); // one text fragment per call
});
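
If you need whole lines rather than isolated fragments, one option is to group the items by their vertical position. Here is a minimal sketch, assuming page markers arrive as items with a `page` field and text items carry `text`, `x` and `y` fields, as the pdfreader README describes. `groupIntoLines` is a hypothetical helper, not part of the module, and it assumes fragments on the same line share exactly the same `y` value, which real PDFs may not guarantee.

```javascript
// Hypothetical helper: turns a flat list of parsed items into lines of text.
// Assumed item shapes: { page: n } for a page marker, { text, x, y } for text.
function groupIntoLines(items) {
  const pages = [];
  let rows = null; // Map of y position -> text fragments on the current page
  for (const item of items) {
    if (item.page) {
      // A new page starts: open a fresh set of rows.
      rows = new Map();
      pages.push(rows);
    } else if (item.text) {
      if (!rows) {
        // Text seen before any page marker: start an implicit first page.
        rows = new Map();
        pages.push(rows);
      }
      const row = rows.get(item.y) || [];
      row.push(item);
      rows.set(item.y, row);
    }
  }
  // Sort rows top to bottom, fragments left to right, then join each row.
  return pages.map((pageRows) =>
    [...pageRows.keys()]
      .sort((a, b) => a - b)
      .map((y) =>
        pageRows
          .get(y)
          .sort((a, b) => a.x - b.x)
          .map((fragment) => fragment.text)
          .join(" ")
      )
  );
}

// Made-up sample data standing in for items collected from a real PDF.
const sample = [
  { page: 1 },
  { text: "world", x: 5, y: 1 },
  { text: "Hello", x: 1, y: 1 },
  { text: "Second line", x: 1, y: 2 },
  { page: 2 },
  { text: "Next page", x: 1, y: 1 },
];
console.log(groupIntoLines(sample));
```

To use it with a real file, you could push every item received by the `parseFileItems` callback into an array and call `groupIntoLines` once the callback receives a falsy item, which signals the end of the file.
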
Adrien Joly
  • 'It does not work from a web browser.' Guessing this means I couldn't use browserify with it? – static_null Sep 10 '18 at 03:23
  • I don't know, @static_null. Let us know how it goes if you give it a try! – Adrien Joly Sep 11 '18 at 08:30
  • @AdrienJoly Hi, do you know where or how I can get a compiled version of the pdfreader module? I'm hoping to get a single .js file so I can use it as a library. Thank you! – Daj Jun 17 '20 at 02:45
  • @Daj I am not distributing any bundled version of pdfreader. Feel free to use the bundler of your choice (e.g. webpack or other) to achieve that. – Adrien Joly Jun 18 '20 at 07:42
7

Check out PDF.js, a widely used JavaScript library for parsing and rendering PDF files. Its API also exposes the text content of each page.

Check out this answer to see a demonstration of how to extract text using pdf.js.
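
For reference, text extraction with PDF.js can be sketched roughly as follows. This is a minimal example, not the code from the linked answer. It assumes `pdf` is the document object resolved by `pdfjsLib.getDocument(src).promise`, where `pdfjsLib` is however you load the `pdfjs-dist` package. `extractText` is a hypothetical helper name.

```javascript
// Sketch: collects the text of every page of a PDF.js document.
// `pdf` is assumed to be the object resolved by pdfjsLib.getDocument(src).promise.
async function extractText(pdf) {
  const pages = [];
  // PDF.js page numbers start at 1, not 0.
  for (let n = 1; n <= pdf.numPages; n++) {
    const page = await pdf.getPage(n);
    const content = await page.getTextContent();
    // Each text item exposes its string in `str`; join them into one line per page.
    pages.push(content.items.map((item) => item.str).join(" "));
  }
  return pages.join("\n");
}
```

Joining items with a space is a simplification: PDF.js returns text in fragments, so recovering exact sentences or paragraphs may require looking at each item's position as well.
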

Community
theonlygusti