I'm using the CefSharp browser to grab images from an HTML page and convert them to System.Drawing.Bitmap objects. To do so, I've implemented a BrowserRequestHandler and an ImageResourceRequestHandler. ImageResourceRequestHandler.OnResourceLoadComplete() publishes an event which pushes the web page images to my WinForms app. This works very well, but I'm missing two pieces of information about the images.

First, the WinForms app doesn't know when it has received all the images for the page. LoadingStateChanged.IsLoading transitions to false well before all images have been pushed. Is there any way to know when the last image resource has been pushed?

Second, the WinForms app gets the images in a random order. I'd like to know their position in document order (e.g. 'image 3, image 4, image 1 ...', regardless of the order in which they are received). Is there any way to fetch this info?

d ei
  • As for order, the images will always arrive in a random order. You'd have to get the order from the HTML via JavaScript. If the website dynamically loads images through JavaScript, then what you are seeing is expected. There's no built-in event. You can use a timer and reset it upon every completed response; eventually the responses will stop and the timer event will fire. – amaitland Jan 16 '22 at 04:08
  • I'm getting the image elements in order via JavaScript, but the only way I've found to correlate them to the resource images is by comparing the href to the resource Url. They don't always match up, I suspect due to redirects and JavaScript interactions. I've been avoiding a timer because it's not deterministic, but I suppose I'll have to rely on it. Thanks for the quick response. – d ei Jan 16 '22 at 16:13
  • That's the nature of web browsers. Some sites, especially those with a lot of advertising, may never stop loading resources. Only if JavaScript were disabled could you tell with certainty that the page has finished loading. – amaitland Jan 16 '22 at 21:09

1 Answer

Take a look at my answer to this question: https://stackoverflow.com/a/72011902/18452174

Once you can communicate from JavaScript to your C# application, you can do the following:

  • In JavaScript, find all img tags in the DOM and read the src attribute of each.
  • Put all the Urls in an array.
  • Send that array (JSON encoded) to your C# application.
  • In your C# application, decode the JSON to get all the Urls (in order).
  • Download each image, for example with WebRequest.
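
The JavaScript side of these steps could look roughly like the sketch below. The function names collectImageUrls and collectImageUrlsAsJson are made up for illustration, and the `doc` parameter is just the page's `document` (passing it in keeps the function easy to try outside a browser). How the resulting JSON string reaches C# depends on the bridge you set up, for instance as the result of a script evaluation call such as CefSharp's EvaluateScriptAsync.

```javascript
// Hypothetical helper: collect image URLs in document order.
// `doc` is anything exposing querySelectorAll, normally the page's `document`.
function collectImageUrls(doc) {
  // querySelectorAll returns matching elements in document order.
  const images = Array.from(doc.querySelectorAll('img'));
  return images
    .map((img, index) => ({
      // Position among all img elements on the page.
      index,
      // currentSrc is the URL the browser actually chose (e.g. from srcset);
      // fall back to src when it is empty or missing.
      url: img.currentSrc || img.src || ''
    }))
    // Skip img elements that have no URL at all.
    .filter((entry) => entry.url !== '');
}

// Hypothetical helper: JSON-encode the ordered list so it can be sent to C#.
function collectImageUrlsAsJson(doc) {
  return JSON.stringify(collectImageUrls(doc));
}
```

In the page you would call collectImageUrlsAsJson(document). Keeping the index next to each Url means the C# side can restore document order even if it downloads the images in parallel. Note that this only captures img elements present when the script runs; images added later by the page's own scripts would need a second pass.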

If you want to optimize the process a bit, you can use the OffScreen browser and/or configure it not to download images (since you are going to download them from C#).

Victor