I am building a project that parses PDFs.

I am having trouble with the order in which the functions below return. Currently, the inner function returns its value before the PDF buffer has been completely parsed.

Observed order: the variable is returned with its initial value, the rest of the code in the outer function runs, and only then does the page finish parsing.

Outer function

  // outer function iterates 
  const variable = await pdfPageParser(pdfBuffer, pageKey);

  // rest of the code 

Inner function

const pdfreader = require("pdfreader");

module.exports = async function pdfPageParser(
    pageBuffer, pageKey
) {
    let ENGR = false;
    let keyword = 'xyz';

    await new pdfreader.PdfReader().parseBuffer(pageBuffer,
        async function(err, item) {
            if (err) {
                console.log(err);
                return err;
            } else if (!item) {
                console.log('Finish Parsing');
                return;
            } else if (item.text) {
                // check for keywords
                if (item.text.toLowerCase().includes(keyword)) {
                    ENGR = true;
                }
            }
        });

    return ENGR;
}

I would like the rest of the code to wait for the page to be completely parsed before continuing. How do I go about doing that?

I also tried assigning the result of the awaited `parseBuffer` call to a variable, but the value was `undefined`.

const ENGR = await new pdfreader.PdfReader().parseBuffer(pageBuffer,
        async function(err, item) {
            if (err) {
                console.log(err);
                return err
            } else if (!item) {
                console.log('Finish Parsing');
                return;
            } else if (item.text) {
                // check for keywords
                if (item.text.toLowerCase().includes(keyword)) 
                 return true
            };
        })
[...]
return ENGR

Thanks!

  • Is `parseBuffer` actually expecting an async lambda? My guess is that it isn't: it calls the callback expecting it to block, it doesn't block, and then the whole thing returns before your async lambda is ever run. – nlta Dec 02 '21 at 00:09
  • Apparently `parseBuffer` doesn't return a promise. – Bergi Dec 02 '21 at 01:09
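
Following up on the two comments above: if `parseBuffer` only reports progress through its callback and does not return a promise, then `await` has nothing to wait on. One common workaround is to wrap the call in a `Promise` that resolves when the callback receives a falsy `item` (the end-of-buffer signal the code in the question already checks for). The sketch below is only an illustration under that assumption. The names `pageContainsKeyword` and `fakeReader` are hypothetical, and `fakeReader` is a stand-in so the snippet runs without a real PDF.

```javascript
// Hypothetical helper: resolves with true/false once parsing has finished.
// `reader` is assumed to expose parseBuffer(buffer, callback), where the
// callback receives (err, item) and a falsy item marks the end of the buffer.
function pageContainsKeyword(reader, pageBuffer, keyword) {
  return new Promise((resolve, reject) => {
    let found = false;
    reader.parseBuffer(pageBuffer, (err, item) => {
      if (err) {
        reject(err); // surface parse errors to the awaiting caller
      } else if (!item) {
        resolve(found); // end of buffer: now it is safe to return
      } else if (item.text && item.text.toLowerCase().includes(keyword)) {
        found = true;
      }
    });
  });
}

// Stand-in reader (not part of pdfreader): emits one item per line of the
// buffer asynchronously, then signals the end with a falsy item.
const fakeReader = {
  parseBuffer(buffer, callback) {
    setTimeout(() => {
      for (const text of buffer.toString().split("\n")) {
        callback(null, { text });
      }
      callback(null, null);
    }, 0);
  },
};

async function demo() {
  const found = await pageContainsKeyword(
    fakeReader,
    Buffer.from("first line\nsecond line with XYZ"),
    "xyz"
  );
  console.log("keyword found:", found);
}

demo();
```

In the module from the question, `new pdfreader.PdfReader()` would be passed as `reader`, so the outer `await pdfPageParser(...)` would wait on this promise rather than on the return value of `parseBuffer`.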
