
To give you some background: many (if not all) websites load their images one by one, so if there are a lot of images, and/or you have a slow computer, most of the images won't show up. This is avoidable for the most part; however, if you're running a script to extract image URLs, you don't need to see the images at all, you just want their URLs. My question is as follows:

Is it possible to trick a webpage into thinking an image is done loading so that it will start loading the next one?

Badasahog

6 Answers


Typically a browser will not wait for one image to be downloaded before requesting the next image. It will request all the images simultaneously, as soon as it gets the srcs of those images.

Are you sure that the images are indeed waiting for the previous image to download, and not for a specific time interval?

If you are sure that it depends on the download of the previous image, then what you can do is route all your requests through a proxy server / firewall and configure it to return an empty file with HTTP status 200 whenever an image is requested from that site.

That way the browser (or actually the website code) will assume that it has downloaded the image successfully.
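
For plain HTTP traffic, a minimal sketch of that idea in Node.js might look like this (the port and the extension list are illustrative; intercepting HTTPS requires a dedicated tool such as Fiddler or mitmproxy):

const http = require('http');

// Forward proxy that short-circuits image requests with an empty 200 reply.
http.createServer((clientReq, clientRes) => {
  // Anything that looks like an image is answered immediately, so the page
  // believes the image downloaded successfully.
  if (/\.(jpe?g|gif|png|svg|webp)(\?|$)/i.test(clientReq.url)) {
    clientRes.writeHead(200, { 'Content-Type': 'image/gif', 'Content-Length': 0 });
    clientRes.end();
    return;
  }

  // Every other request is forwarded to its real destination unchanged.
  const url = new URL(clientReq.url);
  const proxyReq = http.request({
    hostname: url.hostname,
    port: url.port || 80,
    path: url.pathname + url.search,
    method: clientReq.method,
    headers: clientReq.headers,
  }, (proxyRes) => {
    clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
    proxyRes.pipe(clientRes);
  });
  clientReq.pipe(proxyReq);
}).listen(8080);

Point the browser's HTTP proxy setting at localhost:8080 to route requests through it.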

  • how do I do that? – Jack Kasbrack

That's actually a very open-ended / opinion-based question. It will also depend on your OS, browser, system permissions, etc. Assuming you are using Windows and have sufficient permissions, you can try Fiddler. It has an AutoResponder feature that you can use for this.

(I have no affiliation with Fiddler / Telerik. I'm suggesting it only as an example, because I've used it in the past and know it can be used for the aforementioned purpose. There are many other products that provide similar functionality; use the product of your choice.)

Vivek Athalye

Use a plugin called Lazy Load. It loads the whole webpage first and defers the images, loading each one only when the user scrolls to it. (A minimal setup sketch follows the comments below.)

kapitan
  • do you have a link? – Badasahog Nov 17 '18 at 22:41
  • check this link, I've used it before, very simple to implement: http://jquery.eisbehr.de/lazy/ – kapitan Nov 18 '18 at 11:41
  • is this a chrome extension? how do I set it up? – Badasahog Nov 18 '18 at 17:57
  • you just have to download and include the js file (jquery.lazy.min.js) on your webpage, then give all of your images a class (example: <img class="lazyimg" data-src="image.jpg">), then put this inside the script tag: $(function() { $('.lazyimg').lazy(); }); - you also need jQuery, of course =) – kapitan Nov 19 '18 at 01:03
  • this isn't for a website I'm making, this is for websites I visit. Does LazyLoad come as an extension as well? – Badasahog Nov 20 '18 at 20:17
  • oh, apologies, now I understand. Then you should use the Chrome extension I am using for that exact purpose: search for "Text Mode", a Google Chrome extension. What it does is simply not load any images on any webpage. I use it at work so that I can browse any website more safely. – kapitan Nov 21 '18 at 00:17
  • Awesome, I'll check it out first thing tomorrow. Thanks – Badasahog Nov 21 '18 at 01:19
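
Putting the pieces from these comments together, a minimal page using the plugin might look like this (the class name and file paths are illustrative):

<!-- jQuery first, then the plugin file mentioned above. -->
<script src="jquery.min.js"></script>
<script src="jquery.lazy.min.js"></script>

<!-- The real URL goes in data-src; the plugin fills in src on scroll. -->
<img class="lazyimg" data-src="photo1.jpg">
<img class="lazyimg" data-src="photo2.jpg">

<script>
  // Activate lazy loading for all images with the chosen class.
  $(function() {
    $('.lazyimg').lazy();
  });
</script>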

To extract all image URLs to a text file, maybe you could use something like this. If you execute this script on any website, it will list the URLs of the images:

// Log the src of every element whose src attribute looks like an image URL.
document.querySelectorAll('*[src]').forEach((item) => {
    const isImage = item.src.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);

    if (isImage) console.log(item.src);
});

You could also use the same idea to read the computed style of each element and pick up images referenced from a background url, like this:

document.querySelectorAll('*').forEach((item) => {
    const computedItem = getComputedStyle(item);

    // Walk every computed CSS property and test its value for an image URL
    // (e.g. background-image: url(...)).
    Array.from(computedItem).forEach((attr) => {
        const style = computedItem.getPropertyValue(attr);
        const image = style.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);
        if (image) console.log(image[0]);
    });
});

So, at the end of the day, you could write a function like this, which will return an array of all the image URLs on the site:

function getImageURLS() {
  let images = [];

  // Images referenced from CSS (background images and the like).
  document.querySelectorAll('*').forEach((item) => {
    const computedItem = getComputedStyle(item);

    Array.from(computedItem).forEach((attr) => {
        const style = computedItem.getPropertyValue(attr);
        const image = style.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);
        if (image) images.push(image[0]);
    });
  });

  // Images referenced from src attributes.
  document.querySelectorAll('*[src]').forEach((item) => {
    const isImage = item.src.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);

    if (isImage) images.push(item.src);
  });
  return images;
}

It can probably be optimized, but you get the idea.
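
As a usage example, assuming you run it from the browser's DevTools console (where the copy() utility is available):

// Copies one URL per line to the clipboard, ready to paste into a text file.
copy(getImageURLS().join('\n'));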

Elias Dal Ben
  • The issue is that the site I'm trying to get the images from doesn't load an image or its URL until the previous one is done loading. I'm looking for a way to make the site think its images are done loading, so that I can keep all the image URLs. – Badasahog Nov 11 '18 at 21:22
  • yes, I get the idea, but if I try to run this code, or similar code, before the page is done loading, I get a file called blank.gif. – Badasahog Nov 13 '18 at 17:04

If you just want to extract images once, you can use tools like:

1) a Chrome extension

2) desktop software

3) an online website

If you want to run it multiple times, you can use the code from the answer above (https://stackoverflow.com/a/53245330/4674358), wrapped in a readiness check:

if (document.readyState === "complete") {
  extractURL();
}
else {
  // Add load or DOMContentLoaded event listeners here: for example,
  window.addEventListener("load", function () {
    extractURL();
  }, false);
  //or
  /*document.addEventListener("DOMContentLoaded", function () {
    extractURL();
  }, false);*/
}

function extractURL() {
  //code mentioned above
}
Shubham
  • I created my own Chrome extension to do it. The problem is that you can't extract images that aren't loaded. – Badasahog Nov 15 '18 at 23:45
  • You can very well get the full DOM in a Chrome extension, then parse it and load the URLs using the above code. There is no need to wait for the images to download. Check out https://stackoverflow.com/a/7641233/4674358 – Shubham Nov 16 '18 at 00:08
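
To illustrate that comment, here is a minimal, hypothetical sketch of an extension content script that reads image URLs from the parsed DOM before any image has loaded (file names and manifest are illustrative):

// manifest.json (abridged, illustrative):
// {
//   "manifest_version": 3,
//   "name": "Image URL extractor (sketch)",
//   "version": "1.0",
//   "content_scripts": [{
//     "matches": ["<all_urls>"],
//     "js": ["extract.js"],
//     "run_at": "document_start"
//   }]
// }

// extract.js: with run_at document_start the script is injected before the
// page starts loading resources, so URLs can be read from the DOM without
// waiting for a single image to download.
document.addEventListener('DOMContentLoaded', () => {
  const urls = [];
  document.querySelectorAll('img[src], img[data-src]').forEach((img) => {
    urls.push(img.getAttribute('data-src') || img.getAttribute('src'));
  });
  console.log(urls);
});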

You want the "DOMContentLoaded" event (see the docs). It fires as soon as the document has been fully parsed, but before everything has been loaded.

let addIfImage = (list, image) => image.src.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g) ?
    [image.src, ...list] :
    list;

let getSrcFromTags = (tag = 'img') => Array.from(document.getElementsByTagName(tag))
    .reduce(addIfImage, []);

// Collect and log the image URLs once the DOM is ready.
let doSomething = () => console.log(getSrcFromTags());

if (document.readyState === "loading") {
    document.addEventListener("DOMContentLoaded", doSomething);
} else {  // `DOMContentLoaded` already fired
    doSomething();
}
Malisbad

I am using this; it works as expected:

// Loads document.images[n], then moves on to image n+1 once it has finished.
var imageLoading = function(n) {
    var image = document.images[n];
    var downloadingImage = new Image();

    downloadingImage.onload = function() {
        // Copy the now-cached URL into the real element, then load the next one.
        image.src = this.src;
        console.log('Image ' + n + ' loaded');

        if (document.images[++n]) {
            imageLoading(n);
        }
    };

    // Kick off the download from the data-src attribute.
    downloadingImage.src = image.getAttribute("data-src");
};

document.addEventListener("DOMContentLoaded", function(event) {
    setTimeout(function() {
        imageLoading(0);
    }, 0);
});

And change the src attribute of every image element to data-src.
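
For example, markup like this (file names illustrative) will be picked up by the script above and loaded one image at a time:

<!-- The script copies data-src into src once the previous image has loaded. -->
<img data-src="photo1.jpg">
<img data-src="photo2.jpg">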

ribrow