3

I'm trying to capture an image of a web page, but some pages come out as a blank white page.

In the Registry Editor, I browsed to \HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION\ and added these values:

  • WindowsFormsApp1.exe with decimal value 11000

  • WindowsFormsApp1.vshost.exe with decimal value 11000

Here is my code:

using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

namespace WindowsFormsApp1
{
public partial class Form1 : Form
{
    Dictionary<Uri, Bitmap> browserShots = new Dictionary<Uri, Bitmap>();
    WebBrowser browser = new WebBrowser();
    public Form1()
    {
        InitializeComponent();
        browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
    }
    //=========================================MADE BY JIMY====================================
    private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        var browser = sender as WebBrowser;
        if (browser.ReadyState != WebBrowserReadyState.Complete) return;

        var bitmap = WebBrowserExtender.DrawContent(browser);
        if (bitmap != null)
        {
            if (!browserShots.ContainsKey(browser.Url))
                browserShots.Add(browser.Url, bitmap);
            else
            {
                browserShots[browser.Url]?.Dispose();
                browserShots[browser.Url] = bitmap;
            }
            // Show the Bitmap in a  PictureBox control, eventually
            pictureBox1.Image = browserShots[browser.Url];
        }
    }
    public class WebBrowserExtender
    {
        public static Bitmap DrawContent(WebBrowser browser)
        {
            if (browser.Document == null) return null;
            Size docSize = Size.Empty;
            Graphics g = null;
            var hDc = IntPtr.Zero;

            try
            {
                docSize.Height = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollHeight;
                docSize.Width = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollWidth;
                docSize.Height = Math.Max(Math.Min(docSize.Height, 32750), 1);
                docSize.Width = Math.Max(Math.Min(docSize.Width, 32750), 1);

                var previousSize = browser.ClientSize;
                browser.ClientSize = new Size(docSize.Width, docSize.Height);

                var bitmap = new Bitmap(docSize.Width, docSize.Height, PixelFormat.Format32bppArgb);
                g = Graphics.FromImage(bitmap);
                var rect = new RECT(0, 0, bitmap.Width, bitmap.Height);
                hDc = g.GetHdc();
                var view = browser.ActiveXInstance as IViewObject;
                view.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, hDc, ref rect, IntPtr.Zero, IntPtr.Zero, 0);
                browser.ClientSize = previousSize;
                return bitmap;
            }
            catch
            {
                // This catch block is like this on purpose: nothing to do here
                return null;
            }
            finally
            {
                // IntPtr is a value type, so it must be compared with IntPtr.Zero, not null
                if (hDc != IntPtr.Zero) g?.ReleaseHdc(hDc);
                g?.Dispose();
            }
        }

        [ComImport]
        [Guid("0000010D-0000-0000-C000-000000000046")]
        [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
        interface IViewObject
        {
            void Draw(uint dwAspect, int lindex, IntPtr pvAspect, [In] IntPtr ptd,
                      IntPtr hdcTargetDev, IntPtr hdcDraw, ref RECT lprcBounds,
                      [In] IntPtr lprcWBounds, IntPtr pfnContinue, uint dwContinue);
        }

        [StructLayout(LayoutKind.Sequential, Pack = 4)]
        struct RECT
        {
            public int Left;
            public int Top;
            public int Right;
            public int Bottom;
            public RECT(int left, int top, int width, int height)
            {
                Left = left; Top = top; Right = width; Bottom = height;
            }
        }
    }
    //=========================================MADE BY JIMY====================================}

    private void button1_Click(object sender, EventArgs e)
    {
        browser.Navigate(textBox1.Text, null, null, "User-Agent: User agent");
    }
}
}

[Screenshot: Windows Forms Designer]

Reza Aghaei
  • Read the notes here: [How to get an HtmlElement value inside Frames/IFrames?](https://stackoverflow.com/a/53218064/7444103) – Jimi Mar 17 '20 at 14:06
  • Thanks, but that didn't solve my problem – Jiří Poštulka Mar 17 '20 at 16:09
  • You neither read nor applied what is described in those notes. Frames/IFrames are not taken into consideration and this: `Thread thread = new Thread(...) ... while (browser.ReadyState != WebBrowserReadyState.Complete) { Application.DoEvents(); }` is built to fail in a number of different ways. – Jimi Mar 17 '20 at 20:40
  • I'm sorry, I guess I misunderstood your advice. I tried `if (browser.ReadyState != WebBrowserReadyState.Complete) return;` but nothing has changed. Now I tried to remove the `Thread`, but also no result, and if I remove `while (browser.ReadyState != WebBrowserReadyState.Complete) { Application.DoEvents(); }` it crashes. The code changes are above. – Jiří Poštulka Mar 18 '20 at 07:28

2 Answers

3

In order to print the Html content of a WebBrowser Control, there are a few points that need to be considered:

  1. We need to use the WebBrowser's DocumentCompleted event to determine when the current Document is loaded and rendered
  2. A single Document may (will) contain more than one sub-Document, usually contained inside Frames/IFrames. Each IFrame contains its own Document: when a Document contained in an IFrame is loaded, the DocumentCompleted event is raised. This means that the event can and will be raised multiple times when the WebBrowser navigates to a URL.

    The notes here explain more: [How to get an HtmlElement value inside Frames/IFrames?](https://stackoverflow.com/a/53218064/7444103)

  3. The managed properties of the WebBrowser don't always reflect the DOM's real values. For example, the actual dimensions of the Html Document, when the rendering is completed, are not reflected anywhere, so we need to get those measures from the DOM ourselves. The current DOM rendered dimensions are referenced by:

    [WebBrowser].Document.DomDocument.documentElement.scrollHeight;
    [WebBrowser].Document.DomDocument.documentElement.scrollWidth;
    

    See: Measuring Element Dimension and Location with CSSOM in Windows Internet Explorer

  4. The WebBrowser Control DrawToBitmap() method is inherited from Control, but it's not actually implemented as we might expect. The same applies to other Controls: the RichTextBox is known to print blank content when this method is used.

  5. An Html Document may be larger than the maximum Size supported by a Bitmap. There is also a more subtle memory limit: the Bitmap object needs to store its content in a contiguous memory space, so the actual Size limit of a Bitmap is hard to pre-determine and may cause exceptions when we don't expect them.
  6. The WebBrowser control's Emulation Feature must be set to Internet Explorer 11. See:
    How can I get the WebBrowser control to show modern contents?
    Web browser control emulation issue (FEATURE_BROWSER_EMULATION)
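
Point 6 can also be applied from code instead of editing the Registry by hand. What follows is a minimal sketch, assuming the value 11000 (IE11 emulation, the same value used in the question) and a hypothetical helper class named `BrowserEmulation` that is not part of the original code. It should run before the first WebBrowser control is created:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Microsoft.Win32;

// Hypothetical helper (an assumption, not part of the original answer):
// registers the current executable under FEATURE_BROWSER_EMULATION
// for the current user.
public static class BrowserEmulation
{
    public const string FeatureKeyPath =
        @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";

    // 11000 = IE11 emulation (the value used in the question).
    public const int IE11 = 11000;

    public static void Enable(int value = IE11)
    {
        // e.g. "WindowsFormsApp1.exe"; when the Visual Studio hosting process
        // is active this resolves to "WindowsFormsApp1.vshost.exe".
        string exeName = Path.GetFileName(
            Process.GetCurrentProcess().MainModule.FileName);

        // CreateSubKey opens the key for writing, creating it if it is missing.
        using (RegistryKey key = Registry.CurrentUser.CreateSubKey(FeatureKeyPath))
        {
            key.SetValue(exeName, value, RegistryValueKind.DWord);
        }
    }
}
```

One way to use it is to call `BrowserEmulation.Enable();` in `Main()` before `Application.Run(...)`, so the setting is in place when the Form creates its WebBrowser control.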

To proceed, first subscribe to the DocumentCompleted event of the WebBrowser Control.

A Dictionary<Uri, Bitmap> is used here to store the Bitmap representing the Html content of URLs visited in a session.
When the DocumentCompleted event is raised, we add a new element to the Dictionary when the current URL has never been visited before.
If the Uri is already stored, we update the related Bitmap object, so only the most recent snapshot of an Html Document is present in the collection.

I'm using a support class to handle the creation of the Bitmaps and to declare the native COM Interface used to generate the Bitmap from the current ISurfacePresenter.
Since the WebBrowser control is forced to use VIEW_OBJECT_COMPOSITION_MODE_LEGACY as the CompositionMode for all sites, the internal GetPrintBitmap method calls the IViewObject Interface Draw() method in this situation, and so do we.

To print the content (all the content) of the current Html Document, call the DrawContent(WebBrowser browser) static method of the WebBrowserExtender class:

Dictionary<Uri, Bitmap> browserShots = new Dictionary<Uri, Bitmap>();

private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var browser = sender as WebBrowser;
    if (browser.ReadyState != WebBrowserReadyState.Complete) return;

    var bitmap = WebBrowserExtender.DrawContent(browser);
    if (bitmap != null) {
        if (!browserShots.ContainsKey(browser.Url)) {
            browserShots.Add(browser.Url, bitmap);
        }
        else {
            browserShots[browser.Url]?.Dispose();
            browserShots[browser.Url] = bitmap;
        }
        // Show the Bitmap in a  PictureBox control, eventually
        [PictureBox].Image = browserShots[browser.Url];
    }
}

The WebBrowserExtender support class:

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using System.Windows.Forms;

public class WebBrowserExtender
{
    public static Bitmap DrawContent(WebBrowser browser)
    {
        if (browser.Document == null) return null;
        Size docSize = Size.Empty;
        Graphics g = null;
        var hDc = IntPtr.Zero;

        try {
            docSize.Height = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollHeight;
            docSize.Width = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollWidth;

            var screenWidth = Screen.FromHandle(browser.Handle).Bounds.Width;
            docSize.Width = Math.Max(Math.Min(docSize.Width, screenWidth), 1);
            docSize.Height = Math.Max(Math.Min(docSize.Height, 32750), 1);

            var previousSize = browser.ClientSize;
            browser.ClientSize = new Size(docSize.Width, docSize.Height);

            var bitmap = new Bitmap(docSize.Width, docSize.Height, PixelFormat.Format32bppArgb);
            g = Graphics.FromImage(bitmap);
            var rect = new RECT(0, 0, bitmap.Width, bitmap.Height);
            hDc = g.GetHdc();
            var view = browser.ActiveXInstance as IViewObject;
            view.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, hDc, ref rect, IntPtr.Zero, IntPtr.Zero, 0);
            browser.ClientSize = previousSize;
            return bitmap;
        }
        catch {
            // This catch block is like this on purpose: nothing to do here
            return null;
        }
        finally {
            // IntPtr is a value type, so it must be compared with IntPtr.Zero, not null
            if (hDc != IntPtr.Zero) g?.ReleaseHdc(hDc);
            g?.Dispose();
        }
    }

    [ComImport]
    [Guid("0000010D-0000-0000-C000-000000000046")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    interface IViewObject
    {
        void Draw(uint dwAspect, int lindex, IntPtr pvAspect, [In] IntPtr ptd, 
                  IntPtr hdcTargetDev, IntPtr hdcDraw, ref RECT lprcBounds, 
                  [In] IntPtr lprcWBounds, IntPtr pfnContinue, uint dwContinue);
    }

    [StructLayout(LayoutKind.Sequential, Pack = 4)]
    struct RECT
    {
        public int Left;
        public int Top;
        public int Right;
        public int Bottom;

        // Right and Bottom are edge coordinates, so they are offset by Left and Top
        public RECT(int left, int top, int width, int height)
        {
            Left = left; Top = top; Right = left + width; Bottom = top + height;
        }
    }
}

This is how it works:

The full Document is captured. Of course, the Bitmap can also be limited to a specific maximum/minimum size, to capture just a section of the Html Document.
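
A minimal sketch of such a size limit, assuming a hypothetical helper named `CaptureSize` and caller-chosen bounds (neither is part of the original code). The resulting Size could stand in for the `docSize` clamping done inside `DrawContent`:

```csharp
using System;
using System.Drawing;

// Hypothetical helper (an assumption, not part of the original answer):
// clamps the measured document size to caller-supplied bounds
// before the Bitmap is created.
public static class CaptureSize
{
    public static Size Clamp(Size document, Size min, Size max)
    {
        // Cap each dimension at max first, then raise it to at least min.
        int width = Math.Max(Math.Min(document.Width, max.Width), min.Width);
        int height = Math.Max(Math.Min(document.Height, max.Height), min.Height);
        return new Size(width, height);
    }
}
```

For example, `CaptureSize.Clamp(new Size(1280, 40000), new Size(1, 1), new Size(1920, 32750))` keeps the width at 1280 and caps the height at 32750.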

WebBrowser ScreenShots:

Sample WinForms Project on Google Drive.

Jimi
  • Wow, many thanks to you. That is awesome, but this still didn't wait for my `on.load function()`... I have found [this](https://stackoverflow.com/a/18370524/12999914) solution, which works for me, but without `MessageBox.Show();` it doesn't render properly. I have tried Task and Thread delays, but nothing works. Do you know how to solve this problem? Sorry for such stupid questions, I'm a newbie at C# programming – Jiří Poštulka Mar 19 '20 at 10:04
  • See the animation. It works like that (same exact code you see here). – Jimi Mar 19 '20 at 10:12
  • in my case it probably won't be a size problem, but it will be something else, because when I try it using the method I sent, everything works as it should only until I remove the line `MessageBox.Show("window.onload was fired");`. Which doesn't make sense to me – Jiří Poštulka Mar 19 '20 at 13:13
  • So, don't use that method and instead use what I posted here, since it works. Most important, don't use `Application.DoEvents()`. Ever. – Jimi Mar 19 '20 at 13:17
  • We misunderstood each other :) Your method doesn't work for me: I still see only the white page when I use it, or maybe I used it wrong. What I sent works, but only with the MessageBox, and when I want to capture images of multiple pages, that MessageBox is annoying. – Jiří Poštulka Mar 19 '20 at 13:26
  • I used it exactly as you described. I just added one function, which is at the very bottom of the code in my question, and it still doesn't work. I mentioned what I added to the Registry, but no result – Jiří Poštulka Mar 19 '20 at 13:49
  • So, yes, you modified the code (not the **lines** of code, maybe, but its functionality), that's why it doesn't work for you anymore. Follow the instructions and apply the code exactly as shown here. If you don't know exactly how, then ask about that, eventually. BTW, in your new edit, you create (`using (WebBrowser browser = new WebBrowser())`) a new WebBrowser Control with a `using` statement. This means that this WebBrowser control will be disposed of as soon as the last instruction in the `using` block is completed. The event handler is never removed. – Jimi Mar 19 '20 at 13:59
  • Do yourself a favor: add a WebBrowser Control to a Form, make it subscribe to the `DocumentCompleted` event shown in the code here (and nothing else, most of all no `DoEvents()` anywhere), use a button to make it navigate to an Address and see what happens. Add a PictureBox and use it to show the newly created Bitmaps. – Jimi Mar 19 '20 at 14:02
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/209983/discussion-between-jiri-postulka-and-jimi). – Jiří Poštulka Mar 20 '20 at 07:56
0

Try setting the User-Agent like this:

browser.Navigate(url, null, null, "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0");
  • You should change it to `"User-Agent: User agent"`, otherwise the whole UI gets disturbed – Lucifer Mar 17 '20 at 13:48
  • Thanks guys, your solution works for [link](www.google.com), but that was only a public example of a page I can't get to work. My real problem is an internal page that is built entirely from JavaScript and VBScript, which are called from `window.onload=function()`. I was hoping that the problem was the same as on Google, but it didn't work for that... Also, when I tried `browser.Document.Body.ScrollRectangle.Height` on my pages, it returned 0 – Jiří Poštulka Mar 17 '20 at 15:36