
I am using the "pdf-text" module for Node.js to convert a PDF into a string array and then get specific elements out of it. But the problem is, I can only access the data, "chunks", only when I am inside the callback. I want to store it in some global variable so that I can use it in different files. I have tried storing the elements of the array inside variables while inside the function, but no luck. Here's the code:

var pdfText = require('pdf-text');

var pathToPdf = "PDF FILE NAME";

var fs = require('fs');
var buffer = fs.readFileSync(pathToPdf);

var output;

pdfText(buffer, function(err, chunks){

    if (err){
      console.dir(err);
      return;
    }
    console.dir(chunks);
    output = chunks;
}

console.dir(output);

P.S. I am fairly new to Node.js and JavaScript, and any help would be greatly appreciated.

Henke
betelguese123
  • pdfText could be some kind of asynchronous method. In that case console.dir(output) will be called before console.dir(chunks). – Jeroen Heier Jun 01 '17 at 03:59
  • Yea. I figured that out. But I want a way to get data chunks out of pdfText, if it is possible. Thank-you though! – betelguese123 Jun 01 '17 at 04:02
  • You actually don't want to get the data _out_ of a callback; you want to get your data-needing action _into_ the callback! It's an adjustment at first, but it works well for complex async. – dandavis Jun 01 '17 at 04:23
  • Does this answer your question? [How do I return the response from an asynchronous call?](https://stackoverflow.com/questions/14220321/how-do-i-return-the-response-from-an-asynchronous-call) – Henke Feb 26 '21 at 10:31

3 Answers


The output variable is only assigned the contents of "chunks" once the callback runs, which happens after the rest of the script, including the final console.log(output), has already executed.

Btw, you need to add ");" after the callback function to close the pdfText call.

var pdfText = require('pdf-text');

var pathToPdf = "PDF FILE NAME";
var fs = require('fs');
var buffer = fs.readFileSync(pathToPdf);

var output;

pdfText(buffer, function(err, chunks){

    if (err){
      console.log(err);
      return;
    }
    otherFunction(); // undefined
    output = chunks;
    otherFunction(); // chunks content

});

function otherFunction() {
  console.log(output);
}

console.log(output); // undefined

About js callbacks: https://www.tutorialspoint.com/nodejs/nodejs_callbacks_concept.htm

gnuns

> But the problem is, I can only access the data, "chunks", only when I am inside the callback.

Yes, that is correct. You can't access the data before it is available, and when it becomes available, your callback gets called with the data.

> I want to store it in some global variable so that I can use it in different files.

Suppose you did this. Now you have a problem. Your code in those different files: how will it know when the data is ready? It won't.

You need some way to tell that code the data is ready. The way you tell that code is by calling a function. And at that point you don't need global variables: when you call the function in that other file, you pass the data to it as a function parameter.

In other words, don't just have global code in some file that expects to be able to use your chunks data by referencing a global variable. Instead, write a function that you can call from your callback, and pass chunks into that function.
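
A minimal sketch of that idea, assuming a stand-in `fakePdfText` in place of the real pdf-text module (so the snippet runs without a PDF) and a hypothetical `handleChunks` function, which could just as well live in another file and be loaded with require:

```javascript
// Stand-in for pdfText (an assumption for illustration): it calls back
// asynchronously with (err, chunks), like the callback in the question.
function fakePdfText(buffer, callback) {
  setImmediate(function () {
    callback(null, buffer.toString().split(' '));
  });
}

// Hypothetical consumer. It receives the data as a parameter,
// so it never needs to read a global variable.
function handleChunks(chunks) {
  console.log('Got ' + chunks.length + ' chunks, first: ' + chunks[0]);
  return chunks.length;
}

fakePdfText(Buffer.from('hello async world'), function (err, chunks) {
  if (err) {
    console.error(err);
    return;
  }
  // The data is ready here, so this is where the consumer gets called.
  handleChunks(chunks);
});
```

The consumer only ever runs once the data exists, which is the guarantee a global variable cannot give.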

Michael Geary
  • Thank you, that helps a lot! Just a quick follow-up: if I want to display the chunks data through an .html file, how would I go about that? Does all of that code go inside the function I'm calling inside the callback, or is there a better way to do it? – betelguese123 Jun 01 '17 at 04:28
  • Yes, exactly right. You can structure the code however you want. At one extreme, _all_ the code that uses your `chunks` data could be right there inside the original callback function. Or you can break any part of that code out into a separate function and call that. And that bit of code could call yet another function, perhaps for some repetitive task. But in all cases, the code needs to be in a function that is called directly or indirectly from the callback. – Michael Geary Jun 01 '17 at 04:30

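If you are using Node 8 or later, I believe you can use the async/await feature. Because pdfText takes a callback, you first need to wrap it in a Promise; then you can refactor your code so that it looks like the following:

var pdfText = require('pdf-text');

var pathToPdf = "PDF FILE NAME";

var fs = require('fs');
var buffer = fs.readFileSync(pathToPdf);

// Wrap the callback-style pdfText call in a Promise so it can be awaited.
function getPDF(buffer) {
  return new Promise(function(resolve, reject) {
    pdfText(buffer, function(err, chunks){
      if (err){
        reject(err);
        return;
      }
      resolve(chunks);
    });
  });
}

// await is only valid inside an async function
async function main() {
  try {
    var chunks = await getPDF(buffer);
    // you can use the chunks given the buffer here!
    console.dir(chunks);
  } catch (err) {
    console.dir(err);
  }
}

main();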

> I want to store it in some global variable so that I can use it in different files. I have tried storing the elements of the array inside variables while inside the function, but no luck.

I don't think you can store the chunks in a global variable, though. To share them you would have to export them (e.g. module.exports = getPDF(buffer);), but module.exports is assigned synchronously while getPDF is asynchronous, so that line would export a pending Promise rather than the chunks themselves. What I would do is export the function instead, import it in each JS file that needs it, and pass it the buffer for whichever PDF that file requires. Hope this helps.

Mμ.