63

I am using Puppeteer to try to take a screenshot of a website after all images have loaded but can't get it to work.

Here is the code I have so far, using https://www.digg.com as the example website:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://www.digg.com/');

    await page.setViewport({width: 1640, height: 800});

    await page.evaluate(() => {
        return Promise.resolve(window.scrollTo(0,document.body.scrollHeight));
    });

    await page.waitFor(1000);

    await page.evaluate(() => {
        var images = document.querySelectorAll('img');

        function preLoad() {

            var promises = [];

            function loadImage(img) {
                return new Promise(function(resolve,reject) {
                    if (img.complete) {
                        resolve(img)
                    }
                    img.onload = function() {
                        resolve(img);
                    };
                    img.onerror = function(e) {
                        resolve(img);
                    };
                })
            }

            for (var i = 0; i < images.length; i++)
            {
                promises.push(loadImage(images[i]));
            }

            return Promise.all(promises);
        }

        return preLoad();
    });

    await page.screenshot({path: 'digg.png', fullPage: true});

    browser.close();
})();
Petar Vasilev
  • I've tried basically all of the solutions and they work for img elements, but they don't work for CSS background-image: https://stackoverflow.com/questions/74296599/puppeteer-screenshot-full-page-omitting-background-images – fotoflo Nov 02 '22 at 23:21

5 Answers

100

There is a built-in option for that:

await page.goto('https://www.digg.com/', {"waitUntil" : "networkidle0"});

networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms

networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.

Of course it won't work if you're working with endless-scrolling-single-page-applications like Twitter.

Puppeteer GitHub issue #1552 explains the motivation behind networkidle2.

Abdull
Vaviloff
  • In case of digg.com some of the images are loaded only when you scroll down, do you know of a way to wait for the images to load after scrolling? – Petar Vasilev Sep 15 '17 at 07:59
  • 2
    I guess your solution will work, but - after studying how digg's home page works - I'll say you have to scroll little by little, whereas in your code you jump by almost a full page. Look in the source - there are lots of lazy-loading images that will only load if in the viewport. – Vaviloff Sep 15 '17 at 09:49
  • 3
    I think it should be: { waitUntil: "networkidle" } instead of {"waitUntil" : "networkidle"} – standac Sep 30 '17 at 00:47
  • 1
    In the latest puppeteer builds `networkidle` is deprecated and replaced with `networkidle0` & `networkidle2` https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagegobackoptions – Levi Dec 16 '17 at 01:17
  • Hi, every time I click something it loads more content. How can I wait for the next network idle when there isn't any goto, because it's a button click? – CodeGuru Feb 16 '19 at 03:25
  • It's loading image tags but not CSS background-images. Any solution there? – fotoflo Nov 02 '22 at 23:38
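For the button-click case raised in the comments above, one possibility is to start waiting for the network to go idle before triggering the click. This is a minimal sketch, assuming a Puppeteer version that provides page.waitForNetworkIdle; the helper name clickAndWaitForIdle and its parameters are hypothetical.

```javascript
// Hypothetical helper: click an element and wait until the network is quiet.
// Assumes `page` is a Puppeteer Page that provides waitForNetworkIdle().
async function clickAndWaitForIdle(page, selector, idleTime = 500) {
  await Promise.all([
    // Listed first so the idle watcher is running before the click fires.
    page.waitForNetworkIdle({ idleTime }),
    page.click(selector),
  ]);
}
```

It could then be called as await clickAndWaitForIdle(page, 'button.load-more'), where the selector is only a placeholder.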
30

Another option is to evaluate a function in the page that resolves once all images have loaded.

This option also works with setContent, which did not support the waitUntil: networkidle0 option when this answer was written.

await page.evaluate(async () => {
  const selectors = Array.from(document.querySelectorAll("img"));
  await Promise.all(selectors.map(img => {
    if (img.complete) return;
    return new Promise((resolve, reject) => {
      img.addEventListener('load', resolve);
      img.addEventListener('error', reject);
    });
  }));
});
Ben
Daniel Krom
  • 1
    Note https://stackoverflow.com/questions/23803743/what-is-the-explicit-promise-construction-antipattern-and-how-do-i-avoid-it – Benjamin Gruenbaum Apr 21 '18 at 07:25
  • @BenjaminGruenbaum Yes, but it's an event emitter. Wouldn't an npm package that promisifies it do exactly the same thing? Also, thanks for the good edit. – Daniel Krom Apr 21 '18 at 09:09
  • You can't promisify `EventTarget`s automatically yet as far as I know - but the rest doesn't need `new Promise` :) – Benjamin Gruenbaum Apr 21 '18 at 09:28
  • Note that unlike `networkidle` this will wait for all the images based on tags present in the DOM when the `evaluate` is called. So if scripts add more images asynchronously this won't work (you can in theory call it recursively but... meh). – Benjamin Gruenbaum Apr 21 '18 at 09:30
  • Thanks for the clarification, notice that on `setContent` for example, you can't use `networkidle`, at least not in a way I found – Daniel Krom Apr 21 '18 at 09:39
  • 1
    FYI this answer is out of date. `setContent` supports `waitUntil` now, which is very helpful. – brainbag Mar 22 '19 at 21:07
  • @brainbag While experimenting, I found that `setContent` with `waitUntil: 'networkidle2'` takes twice as long as `goto` with the same option. This is odd, because without the option `setContent` is faster than `goto`. – Bhoomtawath Plinsut Apr 18 '19 at 04:36
  • this waits for img query selectors, what about block elements with images that load via css or elements that load images with javascript? – fotoflo Oct 31 '22 at 07:11
  • https://stackoverflow.com/questions/74296599/puppeteer-screenshot-full-page-omitting-background-images – fotoflo Nov 02 '22 at 23:38
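For the CSS background-image case raised in the comments above, one approach that might help is to collect the url(...) values from computed styles and load each one through a temporary Image object. This is only a sketch: the function name waitForBackgroundImages is hypothetical, it only sees elements present when it runs, and it confirms that the files were fetched (or failed), not that they were painted.

```javascript
// Hypothetical in-page helper, meant to run in the browser context,
// e.g. await page.evaluate(waitForBackgroundImages);
async function waitForBackgroundImages() {
  const urls = new Set();

  // Collect every url(...) found in a computed background-image value.
  for (const el of document.querySelectorAll('*')) {
    const value = getComputedStyle(el).backgroundImage;
    for (const match of value.matchAll(/url\((['"]?)(.*?)\1\)/g)) {
      urls.add(match[2]);
    }
  }

  // Load each URL once; a failed image resolves too, so one broken
  // file does not block the screenshot.
  await Promise.all([...urls].map(url => new Promise(resolve => {
    const img = new Image();
    img.onload = img.onerror = resolve;
    img.src = url;
  })));

  return urls.size; // number of distinct background image URLs found
}
```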
12

Wait for Lazy Loading Images

You may want to consider scrolling down first using a method such as Element.scrollIntoView() to account for lazy loading images:

await page.goto('https://www.digg.com/', {
  waitUntil: 'networkidle0', // Wait for all non-lazy loaded images to load
});

await page.evaluate(async () => {
  // Scroll down to bottom of page to activate lazy loading images
  document.body.scrollIntoView(false);

  // Wait for all remaining lazy loading images to load
  await Promise.all(Array.from(document.getElementsByTagName('img'), image => {
    if (image.complete) {
      return;
    }

    return new Promise((resolve, reject) => {
      image.addEventListener('load', resolve);
      image.addEventListener('error', reject);
    });
  }));
});
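If a page loads images only as they approach the viewport, jumping straight to the bottom can skip some of them, as noted in the comments on the networkidle0 answer. A minimal sketch of scrolling in small steps follows; the function name scrollPageInSteps, the step size, the delay and the step cap are all hypothetical values to tune per site.

```javascript
// Hypothetical in-page helper: scroll down in small steps so that images
// which load only near the viewport get a chance to start loading.
// Runs in the browser context, e.g. page.evaluate(scrollPageInSteps, 400, 100).
async function scrollPageInSteps(step = 400, delayMs = 100, maxSteps = 200) {
  let scrolled = 0;
  // scrollHeight is re-read on every pass because lazy content can grow the
  // page; maxSteps caps the loop on endlessly scrolling pages.
  for (let i = 0; i < maxSteps && scrolled < document.body.scrollHeight; i++) {
    window.scrollBy(0, step);
    scrolled += step;
    // Pause so the page can react to the new scroll position.
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  return scrolled; // total distance requested, in pixels
}
```

Running this before the image wait shown above gives the lazy images a chance to be requested before their load events are awaited.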
Grant Miller
  • Hi Grant, This doesn't fetch all the images. For example, try for the URL given https://www.insight.com/en_US/search.html?qtype=all&q=HP%20Printers – Amitesh Rai Jul 31 '20 at 12:55
-1

I'm facing the exact same issue. I have a feeling the solution will involve using:

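// Named setRequestInterceptionEnabled in early Puppeteer releases
await page.setRequestInterception(true);

page.on('request', interceptedRequest => {
    //some code here that adds this request to ...
    //a list and checks whether all list items have ...
    //been successfully completed!

    // With interception enabled, each request must be continued or it stalls
    interceptedRequest.continue();
});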

https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagesetrequestinterceptionenabledvalue
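
The bookkeeping described in the code comments above can be sketched by counting requests that have started but not yet finished or failed. Listening for these events does not require interception to be enabled. The helper name waitForRequestsToSettle and its options are hypothetical, and the sketch assumes page exposes on/off for the request, requestfinished and requestfailed events. Requests that began before the listeners were attached are not counted.

```javascript
// Hypothetical helper: resolve once no more than `maxInflight` requests have
// been pending for `idleMs` milliseconds; reject after `timeoutMs`.
function waitForRequestsToSettle(page, { idleMs = 500, maxInflight = 0, timeoutMs = 30000 } = {}) {
  return new Promise((resolve, reject) => {
    let inflight = 0;
    let idleTimer = null;

    const cleanup = () => {
      clearTimeout(idleTimer);
      clearTimeout(deadline);
      page.off('request', onStart);
      page.off('requestfinished', onEnd);
      page.off('requestfailed', onEnd);
    };

    // Restart the quiet-period timer whenever the pending count changes.
    const check = () => {
      clearTimeout(idleTimer);
      if (inflight <= maxInflight) {
        idleTimer = setTimeout(() => { cleanup(); resolve(); }, idleMs);
      }
    };

    const onStart = () => { inflight += 1; check(); };
    const onEnd = () => { inflight = Math.max(0, inflight - 1); check(); };

    const deadline = setTimeout(() => {
      cleanup();
      reject(new Error('Timed out waiting for requests to settle'));
    }, timeoutMs);

    page.on('request', onStart);
    page.on('requestfinished', onEnd);
    page.on('requestfailed', onEnd);
    check(); // the page may already be quiet
  });
}
```

With the defaults this mirrors the networkidle0 definition quoted in the earlier answer, and maxInflight: 2 mirrors networkidle2.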

Wissa
-1

I found a solution that works for multiple sites, using the page.setViewport(...) method as shown below:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: true, // Set to false during development
        defaultViewport: null,
        args: [
            '--no-sandbox',
            '--start-maximized', // Start in maximized state
        ],
    });

    const page = await browser.newPage();
    await page.goto('https://www.digg.com/', {
        waitUntil: 'networkidle0', timeout: 0
    });

    // Get scroll width and height of the rendered page and set viewport
    const bodyWidth = await page.evaluate(() => document.body.scrollWidth);
    const bodyHeight = await page.evaluate(() => document.body.scrollHeight);
    await page.setViewport({ width: bodyWidth, height: bodyHeight });

    await page.waitFor(1000);
    await page.screenshot({path: 'digg-example.png' });
    await browser.close();
})();
Amitesh Rai
  • `waitFor` is deprecated and will be removed in a future release: see https://github.com/puppeteer/puppeteer/issues/6214 for details. – Mooncake Dec 17 '21 at 15:27