
I'm trying to create a Node app that takes a URL from the user and passes it to scrape.js, which uses puppeteer to scrape certain fields and then passes the data back to app.js as JSON (so that I can then upsert it into a doc). But what I receive is the entire ServerResponse object, not the JSON data I intended.

I was hoping someone with more experience could shed some light. Here is what I have so far:

// app.js

const scrape = require('./scrape');
const router = express.Router();

router.get( '/', ( req, res ) => {
    const url = req.body.url;

    const item = new Promise((resolve, reject) => {
        scrape
            .scrapeData()
            .then((data) => res.json(data))
            .catch(err => reject('Scraping failed...'))
   })
});

// scrape.js

const puppeteer = require('puppeteer');

const scrapeData = async () => {
    const browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    await page.setViewport({ width: 360, height: 640 });
    await page.goto(url);

    let scrapedData = await page.evaluate(() => {
    let scrapedDetails = [];

    let elements = document.querySelectorAll('#a-page');

    elements.forEach(element => {
      let detailsJson = {};

      try {
        detailsJson.title = element.querySelector('h1#title').innerText;
        detailsJson.desc = element.querySelector('#description_box').innerText;
      } catch (exception) {}

    scrapedDetails.push(detailsJson);
  });

  return scrapedDetails;
  }));

  // console.dir(scrapeData) - logs the data successfully.

};

module.exports.scrapeData = scrapeData
John107
  • It's hard to follow due to the indentation. But is it possible that the `scrapeData` function isn't returning anything? – hardkoded Aug 19 '19 at 20:08
  • @hardkoded, thanks for the reply! I can verify that it's returning data by console logging scrapeData in `scrape.js`, which logs the data. Hang on, I'll see if I can codesandbox an example! – John107 Aug 19 '19 at 20:13
  • lmao, never mind - TIL that you can't use puppeteer on codesandbox. – John107 Aug 19 '19 at 21:09
  • Your `Promise` never has `resolve` called, and it's not necessary anyway since `scrapeData()` [already returns a promise](https://stackoverflow.com/questions/23803743/what-is-the-explicit-promise-construction-antipattern-and-how-do-i-avoid-it). – ggorlen Aug 22 '23 at 03:16
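
As a rough sketch of what that comment suggests, the route handler can await `scrapeData()` directly instead of wrapping it in `new Promise`. The `makeScrapeHandler` factory below is a hypothetical helper (not in the original code) that takes the scraper as an argument. The sketch also assumes the URL arrives in the query string (`req.query.url`) rather than `req.body`, and that `scrapeData` accepts the URL as a parameter, which the posted version does not yet do.

```javascript
// Hypothetical factory: the scraper is passed in so the handler
// does not depend on puppeteer being loaded in this file.
const makeScrapeHandler = (scrapeData) => async (req, res) => {
  // Assumption: the URL is sent as a query parameter, e.g. /?url=https://...
  const url = req.query.url;

  try {
    // scrapeData() already returns a promise, so it can be awaited directly.
    const data = await scrapeData(url);
    res.json(data);
  } catch (err) {
    // Report the failure to the client instead of rejecting an unused promise.
    res.status(500).json({ error: 'Scraping failed...' });
  }
};
```

In app.js this would presumably be wired up as `router.get('/', makeScrapeHandler(scrape.scrapeData));`, again assuming `scrapeData` takes the URL as its argument.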

1 Answer

You have a naming problem. scrape.js exports the scrapeData function. Inside that function you declared a scrapedData variable, which is not the same thing, and the function never returns it.

Where you currently have:

// console.dir(scrapeData) - logs the data successfully.

add:

return scrapedData;

That should solve your issue.
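
To illustrate why the missing return matters, here is a minimal sketch that leaves puppeteer out entirely. `fakeEvaluate` is a made-up stand-in for `page.evaluate()`, and the two scrape functions are hypothetical names used only for the comparison. An async function that never returns its result resolves to `undefined`, even if logging the variable inside the function shows the data.

```javascript
// Stand-in for page.evaluate(); the real data would come from puppeteer.
const fakeEvaluate = async () => [
  { title: 'Example title', desc: 'Example description' },
];

// Mirrors the posted scrape.js: the result is stored but never returned.
const scrapeWithoutReturn = async () => {
  const scrapedData = await fakeEvaluate();
};

// Same function with the suggested return statement added.
const scrapeWithReturn = async () => {
  const scrapedData = await fakeEvaluate();
  return scrapedData;
};

// Collects both results so they can be compared side by side.
const compare = async () => {
  const missing = await scrapeWithoutReturn(); // undefined
  const present = await scrapeWithReturn();    // the scraped array
  return { missing, present };
};
```

Calling `compare().then(console.log)` should show `missing` as `undefined` and `present` as the array.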

hardkoded