
I'm new to Puppeteer and I'm trying to understand how it actually works through some examples.

In this example I'm trying to extract the number of views of a YouTube video. I wrote a line of JavaScript in the Chrome console that extracts this information:

document.querySelector('#count > yt-view-count-renderer > span.view-count.style-scope.yt-view-count-renderer').innerText

This worked well. However, when I did the same in my Puppeteer code, it didn't find the element I queried.

const puppeteer = require('puppeteer')

const getData = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()

  await page.goto('https://www.youtube.com/watch?v=T5GSLc-i5Xo')
  
  await page.waitFor(1000)

  const result = await page.evaluate(() => {
    let views = document.querySelector('#count > yt-view-count-renderer > span.view-count.style-scope.yt-view-count-renderer').innerText
    return {views}
  })

  browser.close()
  return result
}

getData().then(value => {
  console.log(value)
})

I finally did it using the ytInitialData object. However, I'd like to understand why my first code didn't work.

Thanks

Milan Hlinák
elkolotfi
  • seems the wait time wasn't enough, how about waiting till all requests are completed `page.goto( 'https://www.youtube.com/watch?v=T5GSLc-i5Xo', { waitUntil: 'networkidle2', timeout: 0 });`, then remove the `page.waitFor` – pariola Dec 23 '18 at 23:25
  • I'm trying to fully understand this Puppeteer code. Not sure if it's related, but what is the reason for using `let views = ....`? I mean why `let`. And what is the reason for returning `{views}` instead of just `views` ? Thanks in advance! – RocketNuts Jan 20 '21 at 16:19

2 Answers


It seems that waiting 1000 ms is not enough: the view-count element has not been rendered yet when the query runs.

Try your solution with https://try-puppeteer.appspot.com/ and you will see.

However, if you try the following solution, you will get the correct result:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://www.youtube.com/watch?v=T5GSLc-i5Xo');

  // Wait until the view-count element exists instead of sleeping for a fixed time.
  await page.waitForSelector('span.view-count');
  const views = await page.evaluate(() => document.querySelector('span.view-count').textContent);
  console.log('Number of views: ' + views);

  await browser.close();
})();
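
If the element can exist before its text is filled in, one possible variant is to wait on a condition instead of on the selector alone. The following is only a sketch under that assumption: `waitForNonEmptyText` is a hypothetical helper name, and the 10 second default timeout is an arbitrary choice. It relies on `page.waitForFunction`, which polls a function inside the page until it returns a truthy value.

```javascript
// Hypothetical helper (not part of Puppeteer): resolves with the trimmed text
// of the first element matching `selector` once that text is non-empty.
async function waitForNonEmptyText(page, selector, timeoutMs = 10000) {
  // The callback runs in the browser context; `selector` is passed in as `sel`.
  const handle = await page.waitForFunction(
    sel => {
      const el = document.querySelector(sel);
      const text = el ? el.textContent.trim() : '';
      // Returning false keeps polling; a non-empty string ends the wait.
      return text.length > 0 ? text : false;
    },
    { timeout: timeoutMs },
    selector
  );
  // waitForFunction resolves with a handle to the truthy value; unwrap it.
  return handle.jsonValue();
}
```

With the `page` from the answer above, it would be called as `const views = await waitForNonEmptyText(page, 'span.view-count');`.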
Milan Hlinák

Do not use a hand-made timeout to wait for a page to load, unless you are testing whether the page can load within that amount of time. Unlike Selenium, where you sometimes have no choice other than a timeout, with Puppeteer you should always be able to find some function to await instead of guessing a "good" timeout. As answered by Milan Hlinák, look into the page's HTML and find an element you can wait on instead of using a timeout. Usually, you wait for the HTML element(s) your test requires in order to work properly. In your case that is span.view-count, as already answered by Milan Hlinák:

await page.waitForSelector('span.view-count');
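
If you also want the script to keep going when the element never shows up, a rough sketch could wrap the wait in a small helper. `readTextWhenReady` and its 10 second default are my own assumptions, not something from Puppeteer itself. It only uses `page.waitForSelector` with its `timeout` option and `page.$eval`.

```javascript
// Hypothetical helper: waits for `selector`, then returns its trimmed text.
// Returns null when the wait fails, e.g. the element did not appear in time.
async function readTextWhenReady(page, selector, timeoutMs = 10000) {
  try {
    // waitForSelector rejects if nothing matches within `timeoutMs`.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Note: this treats every failure of the wait as "element not available".
    return null;
  }
  // $eval runs the callback in the page with the first matching element.
  return page.$eval(selector, el => el.textContent.trim());
}
```

Usage would look like `const views = await readTextWhenReady(page, 'span.view-count');` followed by a check for `null`.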
Evandro Coan