In my quest to create the perfect `string result = browser.Browse(url)` method, I have created a simple class library to demonstrate CefSharp. The code is:

public class CefSharpHeadlessBrowser
{
    public CefSharpHeadlessBrowser()
    {
        Cef.Initialize(new CefSettings { CachePath = "cache" }, false, true);
    }

    public string Browse(string url)
    {
        Task<string> result;

        var browserSettings = new BrowserSettings { WindowlessFrameRate = 1 };

        using (var browser = new ChromiumWebBrowser(url, browserSettings))
        {
            browser.WaitForBrowserToInitialize();

            browser.LoadPageAsync();

            // Wait awhile for Javascript to finish executing.
            Thread.Sleep(2000);

            result = browser.GetSourceAsync();
            Thread.Sleep(100);
        }
        return result.Result;
    }
}

public static class CefExtensions
{
    public static void WaitForBrowserToInitialize(this ChromiumWebBrowser browser)
    {
        while (!browser.IsBrowserInitialized)
        {
            Task.Delay(100);
        }
    }

    public static Task LoadPageAsync(this IWebBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();

        EventHandler<LoadingStateChangedEventArgs> handler = null;
        handler = (sender, args) =>
        {
            if (!args.IsLoading)
            {
                browser.LoadingStateChanged -= handler;
                tcs.TrySetResult(true);
            }
        };

        browser.LoadingStateChanged += handler;
        return tcs.Task;
    }
}
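
One detail in `WaitForBrowserToInitialize` is worth illustrating: `Task.Delay(100)` only creates a task that completes later and returns immediately, so unless that task is awaited or waited on, the loop spins without pausing. The sketch below is not CefSharp code; `DelayDemo` and its methods are hypothetical names that simply time three variants with a `Stopwatch`.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class (not part of CefSharp): times three ways of "pausing".
public static class DelayDemo
{
    // Runs the action and returns the elapsed wall-clock time in milliseconds.
    public static long Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    // Task.Delay only starts a timer; the returned task is discarded, so nothing blocks.
    public static long UnobservedDelay() => Measure(() => { Task.Delay(300); });

    // Waiting on the task blocks the calling thread until the delay has elapsed.
    public static long WaitedDelay() => Measure(() => Task.Delay(300).Wait());

    // Thread.Sleep blocks the calling thread directly.
    public static long Sleep() => Measure(() => Thread.Sleep(300));
}
```

Calling `DelayDemo.UnobservedDelay()` should return almost at once, while the other two should each take roughly the full 300 ms.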

This is the test harness, in a separate console project that references the CefSharpHeadlessBrowser project:

class Program
{
    static void Main(string[] args)
    {
        const string searchUrl = "https://www.google.com";

        var browser = new CefSharpHeadlessBrowser();

        var result = browser.Browse(searchUrl);
        Console.Write(result);
    }
}

This actually works: it gets the HTML page source and displays it in the console window, just as it should. But here's the problem: the console program hangs after displaying the page source, when it should exit immediately. That suggests I'm doing something wrong with the asynchronous operations and causing a deadlock.

What could be the issue?

Robert Harvey
  • Are you saying that the `Main` method returns but the application remains open? Could there be another foreground thread running? – Yacoub Massad Apr 28 '16 at 18:58
  • @YacoubMassad: That's a good clue. I think I've solved it; answer forthcoming shortly. – Robert Harvey Apr 28 '16 at 18:59
  • As you've already posted, you're missing a call to `Cef.Shutdown()`. A more complete example is https://github.com/cefsharp/CefSharp/blob/cefsharp/49/CefSharp.OffScreen.Example/Program.cs – amaitland Apr 28 '16 at 21:10
  • @amaitland: Thanks. You already provided a better example [here](http://stackoverflow.com/questions/35471261/using-cefsharp-offscreen-to-retrieve-a-web-page-that-requires-javascript-to-rend#comment58655270_35471261). The tricky part was converting all of those async calls to something that would run synchronously and return the resulting HTML string. To shield my users from all the intricacies, I'm wrapping the whole thing in my own DLL and putting it and all of the CefSharp files into their own folder. – Robert Harvey Apr 28 '16 at 21:20
  • The `Task.Delay` in the while loop is pretty nasty. You should `await` `GetSourceAsync()`. Is there a reason you don't just mark your `Browse` method as `async` and then `await` the calls? You can provide nice encapsulation without sacrificing code quality. – amaitland Apr 28 '16 at 21:33
  • @amaitland: I don't want the `async` to bleed out of the library I'm writing. In the context where it will be used, we will always wait for the result, so the call to `Browse()` will always be blocking. – Robert Harvey Apr 28 '16 at 22:33
  • I honestly think you need to reconsider, even if you provide an async version that returns a `Task` and a sync version that executes the `async` version and simply waits on that `Task`. You only need a minimum requirement of `.Net 4.0`. Anyway, we've gone off topic; do as you choose. – amaitland Apr 29 '16 at 03:56
  • @amaitland: I don't feel like I have a good enough understanding of the underlying asynchrony to make that work. For example, `Thread.Sleep(2000);` worked, but `Task.Delay(2000)` did not, and I don't really understand why that matters. Recall that part of this exercise was to wait long enough to allow Javascript to finish executing. Changing `browser.LoadPageAsync()` to `browser.LoadPageAsync().Wait(30000)` worked very well, allowing me to throw an exception if a timeout occurs loading the page. – Robert Harvey Apr 29 '16 at 04:30
  • @amaitland: In any case, I've heard rumors of a fully `async` version of CefSharp emerging some time in the future, and if that occurs, I would be more comfortable taking "async all the way." – Robert Harvey Apr 29 '16 at 04:33
  • Rule of thumb: avoid blocking thread pool threads; otherwise new threads will be spawned when starvation is detected (one thread every 0.5 seconds), and as the code becomes more complicated there is a non-zero chance of a deadlock. Under pressure it is possible to end up with a completely blocked thread pool, even though it is well guarded against this. – Dmitry Azaraev Apr 29 '16 at 06:08
  • @fddima: I don't see how threadpool threads applies here. Regarding deadlock, that's why I intend to keep this code as simple as possible. – Robert Harvey Apr 29 '16 at 06:10
  • It is not exactly about this code; in any case, it depends entirely on the client/library implementation. – Dmitry Azaraev Apr 29 '16 at 06:27
  • @RobertHarvey I've heard rumors of a more `async` friendly `CefGlue` version, I don't currently have any plans to rewrite that particular area of `CefSharp`. – amaitland Apr 29 '16 at 07:34
  • @amaitland Yep, I'm working on the next generation of CefGlue. The current version is OK, but it has some by-design defects that I should fix. The new views API will not be available in current versions, and there are some other issues, like the extended lifetime of CEF objects. These are not critical issues, so the next generation will not be available soon. Maybe I'll finish it closer to the Chrome 53, 55 or 57 release. – Dmitry Azaraev May 02 '16 at 09:51
  • @fddima Sounds interesting, I'll have a look at the code when it's released :) – amaitland May 02 '16 at 11:26
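
The suggestion in the comments (an async version that returns a `Task`, plus a sync version that waits on it with a timeout) could look roughly like the sketch below. It is only an illustration under stated assumptions: `FakeBrowser` is a hypothetical stand-in that uses `Task.Delay` to simulate page loading, and none of it is CefSharp API.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a browser wrapper; Task.Delay simulates page load time.
public class FakeBrowser
{
    private readonly int loadMilliseconds;

    public FakeBrowser(int loadMilliseconds)
    {
        this.loadMilliseconds = loadMilliseconds;
    }

    // Async version: callers that can await get a Task<string>.
    public async Task<string> BrowseAsync(string url)
    {
        // ConfigureAwait(false) avoids resuming on a captured context,
        // which matters when a caller blocks on the returned task.
        await Task.Delay(loadMilliseconds).ConfigureAwait(false);
        return "<html><!-- source of " + url + " --></html>";
    }

    // Sync version: blocks on the async version and throws if it takes too long.
    public string Browse(string url, int timeoutMilliseconds)
    {
        Task<string> task = BrowseAsync(url);
        if (!task.Wait(timeoutMilliseconds))
        {
            throw new TimeoutException("Timed out loading " + url);
        }
        return task.Result;
    }
}
```

With this shape the `async` does not have to bleed out of the library: callers who always block use `Browse(url, 30000)`, and the `Task`-returning method stays available for anyone who wants it.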

1 Answer

CefSharp has a `Cef.Shutdown()` method; I was able to solve the problem by adding the following method to the `CefSharpHeadlessBrowser` class:

    public void Shutdown()
    {
        Cef.Shutdown();
    }

And then changing the test harness to:

class Program
{
    static void Main(string[] args)
    {
        const string searchUrl = "https://www.google.com";

        var browser = new CefSharpHeadlessBrowser();

        var result = browser.Browse(searchUrl);
        Console.WriteLine(result);
        browser.Shutdown();  // Added 
    }
}

This presumably stops the threads that CEF leaves running, which were keeping the process alive after `Main` returned.
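
As a general illustration of that mechanism (not CefSharp's actual internals, which are not shown here), a .NET process stays alive as long as any foreground thread is running. In the sketch below, `FakeEngine` is a hypothetical stand-in for a library that starts its own worker thread and needs an explicit shutdown call.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a library that starts its own worker thread.
public class FakeEngine
{
    private readonly ManualResetEventSlim stopSignal = new ManualResetEventSlim(false);
    private readonly Thread worker;

    public FakeEngine()
    {
        // Threads created with new Thread(...) are foreground threads by default
        // (IsBackground == false), so the process cannot exit while one is running.
        worker = new Thread(() => stopSignal.Wait());
        worker.Start();
    }

    public bool IsWorkerAlive => worker.IsAlive;

    // Signals the worker to finish and waits for it, allowing the process to exit.
    public void Shutdown()
    {
        stopSignal.Set();
        worker.Join();
    }
}
```

If a `Main` method creates a `FakeEngine` and returns without calling `Shutdown()`, the console program keeps running, which matches the symptom described in the question.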

I'll probably make the class IDisposable and wrap the calling code in a using statement.
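
A minimal sketch of what that might look like, with loud assumptions: `DisposableBrowser` is a hypothetical name, and the two delegates are stand-ins for the `Cef.Initialize(...)` and `Cef.Shutdown()` calls shown above, so that the example compiles without CefSharp.

```csharp
using System;

// Sketch of a disposable wrapper. The two delegates are hypothetical stand-ins
// for the Cef.Initialize(...) and Cef.Shutdown() calls shown in the answer.
public sealed class DisposableBrowser : IDisposable
{
    private readonly Action shutdown;
    private bool disposed;

    public DisposableBrowser(Action initialize, Action shutdown)
    {
        this.shutdown = shutdown;
        initialize();
    }

    // Safe to call more than once; the shutdown delegate runs only the first time.
    public void Dispose()
    {
        if (disposed) return;
        disposed = true;
        shutdown();
    }
}
```

The test harness would then wrap its calls in `using (var browser = new DisposableBrowser(...)) { ... }`. One caveat, as far as I know: CEF can only be initialized and shut down once per process, so a wrapper like this would have to be created once for the lifetime of the application rather than once per request.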

Robert Harvey