I am trying to get the naturalWidth and naturalHeight of an image without loading it, in order to speed things up. Is there a way to do that? Thank you!

EDIT: I was told to share some code but I don't really know what to share.

Here is what I use to get all the sizes of images:

const images_datas = await this.page.$$eval('img', imgs => {
    var images_data = []
    var empty_images = 0
    imgs.forEach(img => {
        // Count images with zero area or a 1x1 size as empty
        if (img.naturalWidth * img.naturalHeight == 0 || (img.naturalHeight == 1 && img.naturalWidth == 1)) {
            empty_images++
        } else {
            images_data.push({'url': img.src, 'width': img.naturalWidth, 'height': img.naturalHeight, 'alt': img.alt})
        }
    });
    return {'images_data': images_data, 'nb_empty_images': empty_images}
});

And here is the code I use to prevent the images from loading:

await page.setRequestInterception(true);
page.on('request', request => {
    if (request.resourceType() === 'image')
      request.abort();
    else
      request.continue();
});

But the two snippets do not work together...

  • Could you please share some code – Abderrahim Soubai-Elidrisi May 06 '19 at 15:41
  • Possible duplicate of [Get height/width of image in Javascript (ideally without loading the image at all)](https://stackoverflow.com/questions/1692500/get-height-width-of-image-in-javascript-ideally-without-loading-the-image-at-al) – Chris W. May 06 '19 at 15:43
  • This is not a duplicate in my opinion as the post is tagged "puppeteer". Therefore, this is not just about the client-side of JavaScript. – Thomas Dondorf May 06 '19 at 15:55
  • Ideally, a backend service would calculate the values for you and give you those values before loading; otherwise I don't think there's a way to know them without loading the file. – Crisoforo Gaspar May 06 '19 at 16:06

1 Answer

If you have control over the server, you can pass the dimensions of the image as HTTP headers. Otherwise, the only thing you can read without loading the image is its file size in bytes, not its width or height.

Reading the size of the image

The following code is a minimal example of how to read the size (in bytes) of an image without downloading it. It aborts every image request and instead sends a HEAD request, which returns only the headers of the file, to read the content-length header. Be aware that this only returns the total file size, not the width or height.

const puppeteer = require('puppeteer');
const fetch = require('node-fetch');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.setRequestInterception(true);

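    // The handler must be async because it awaits the HEAD request
    page.on('request', async interceptedRequest => {
        if (interceptedRequest.resourceType() === 'image') {
            // Block the image download, then request only the headers
            interceptedRequest.abort();
            try {
                const response = await fetch(interceptedRequest.url(), {
                    method: 'HEAD'
                });
                if (response.ok) {
                    // File size in bytes as a string, or null if the header is missing
                    const sizeOfImage = response.headers.get('content-length');
                    // handle image size
                } else {
                    // something went wrong...
                }
            } catch (error) {
                // the HEAD request itself failed (e.g. network error)
            }
        } else {
            interceptedRequest.continue();
        }
    });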
    await page.goto('...');
    await browser.close();
})();

Passing the size of the image as header

If you have control of the backend/server, you can pass the dimensions of the image as headers and then read them with the same code as given before. Just change the header from content-length to the headers that carry the width and height.

As you did not say anything about the backend, I'm assuming this is not possible. In case you do have control over the backend and you are using Node.js there, you might want to read this question on how to read the image size with Node.js.
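As a rough sketch of the reading side, assuming the server has been configured to send the dimensions in two custom headers: the header names `x-image-width` and `x-image-height` and the helper `readImageDimensions` are made up for illustration and are not part of any standard or library.

```javascript
// Hypothetical header names; the server would have to be set up to send them.
const WIDTH_HEADER = 'x-image-width';
const HEIGHT_HEADER = 'x-image-height';

// `headers` is any object with a get(name) method, for example the
// `response.headers` object of the HEAD response in the code above.
function readImageDimensions(headers) {
    const width = Number.parseInt(headers.get(WIDTH_HEADER), 10);
    const height = Number.parseInt(headers.get(HEIGHT_HEADER), 10);
    if (!Number.isInteger(width) || !Number.isInteger(height)) {
        return null; // headers missing or not numeric
    }
    return { width, height };
}
```

Inside the request handler, `readImageDimensions(response.headers)` would then take the place of the content-length lookup. It returns null when the server does not send both headers.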


More is not possible without loading the image. If you do not control the server but need to know the naturalHeight and naturalWidth properties of the image, you have to load the image.

Thomas Dondorf
  • It's now response.headers()['content-length'] in my case. The get function wasn't working here... – Tobi Jan 28 '20 at 15:48