
I'm using Puppeteer to try to read the 'aria-label' attribute from the elements matched by the query below, but it returns results of type:

    JSHandle@node

This means that when I call .getAttribute('aria-label') on a result, I get undefined.

I am trying to get a list of available dates from the flatpickr calendar.

Can anyone tell me how to do this correctly?

Thanks.

(async () => {

    const arias = await page.evaluate(() => {
        const results = document.querySelectorAll('span.flatpickr-day:not(.prevMonthDay):not(.nextMonthDay):is(.flatpickr-disabled)');
        dates = {};
        if (results.length) {
            for ( var i = 0; i < results.length; i++ ) {
                dates.push(results[i].getAttribute('aria-label'));
            }
        }
        return dates;
    });

})
regan
  • can you confirm that your `dates` is an array in your original code? like this (as an empty object, `{}`) it would throw a TypeError. other than this, could you share the URL of the page with the flatpickr calendar? for me if I fix the `dates` type above the script works. – theDavidBarton Jul 17 '21 at 19:53
  • Please share a URL or sample representative markup for the page you're scraping. `dates` should be an array, not an object. Objects have no attribute `push`, which you'd see if you attached a listener to the browser console to check for errors. – ggorlen Jul 19 '21 at 15:16

1 Answer


When you run into problems with evaluate, the first steps are to attach a listener for the browser's console output, so that console.log calls and errors inside the page become visible in Node, and to try the code in the browser yourself.
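For the first step, a minimal sketch of such a listener might look like the following. The helper name `forwardBrowserLogs` and its `log` parameter are made up for illustration; the `console` and `pageerror` events are the ones Puppeteer emits on a page.

```javascript
// Hypothetical helper: forwards browser-side console messages and uncaught
// page errors to a Node-side logger. `page` is expected to be a Puppeteer Page.
function forwardBrowserLogs(page, log = console.log) {
  // Fires for every console.log/warn/error call made inside the page.
  page.on("console", msg => log(`[browser ${msg.type()}] ${msg.text()}`));
  // Fires for uncaught exceptions thrown by the page's own scripts.
  page.on("pageerror", err => log(`[pageerror] ${err.message}`));
}

// Usage (assumes `page` came from browser.newPage() or browser.pages()):
//   forwardBrowserLogs(page);
//   await page.evaluate(() => console.log("hello from the browser"));
```

An exception thrown inside the evaluate callback itself rejects the promise that page.evaluate returns, so that promise also needs to be awaited and caught for the error to show up.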

Let's try running the code in the browser itself, without Puppeteer, so we can see if there are any errors:

const results = document.querySelectorAll("div");
dates = {};

if (results.length) { // unnecessary; if results.length is 0, the loop won't run
  for (var i = 0; i < results.length; i++) {
    dates.push(results[i].getAttribute('aria-label'));
  }
}

console.log(dates);
<div aria-label="foobar">baz</div>

This gives Uncaught TypeError: dates.push is not a function. You probably meant dates to be an array, not an object:

const results = document.querySelectorAll("div");
const dates = [];

for (var i = 0; i < results.length; i++) {
  dates.push(results[i].getAttribute('aria-label'));
}

console.log(dates);
<div aria-label="foobar">baz</div>

Putting that into Puppeteer, we can shorten it to:

const puppeteer = require("puppeteer");

let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent('<div aria-label="foobar">baz</div>');

  const arias = await page.evaluate(() => Array.from(
    document.querySelectorAll("div"), 
    e => e.getAttribute("aria-label")
  ));
  
  // or:
  //const arias = await page.$$eval("div", els => 
  //  els.map(e => e.getAttribute("aria-label"))
  //);

  console.log(arias); // => [ 'foobar' ]
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close()); // browser is undefined if launch failed

I don't have a flatpickr page handy, so I'm assuming your selector is valid. Substitute it for "div" in the code above, and load the actual site you're scraping instead of calling page.setContent.
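As a rough sketch of that substitution, untested against a real flatpickr page: the helper name `getAriaLabels` and the URL in the usage comment are placeholders. The selector is also an assumption. The question's selector ends in `:is(.flatpickr-disabled)`, which matches days that carry that class; if "available" means days without it, `:not(.flatpickr-disabled)` is probably what's wanted.

```javascript
// Assumption: "available" days are the ones flatpickr has NOT marked with
// the flatpickr-disabled class, so this uses :not() where the question has :is().
const AVAILABLE_DAYS =
  "span.flatpickr-day:not(.prevMonthDay):not(.nextMonthDay):not(.flatpickr-disabled)";

// Hypothetical helper: waits until at least one element matches, then
// returns the aria-label of every matching element as an array of strings.
async function getAriaLabels(page, selector = AVAILABLE_DAYS) {
  await page.waitForSelector(selector);
  return page.$$eval(selector, els =>
    els.map(e => e.getAttribute("aria-label"))
  );
}

// Usage, with a placeholder URL:
//   await page.goto("https://example.com/booking");
//   console.log(await getAriaLabels(page));
```

Waiting for the selector first is a guard in case the calendar is rendered by script after the page loads; if the calendar only appears after clicking an input, that click has to happen before the wait.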

ggorlen