
I am using PuppeteerSharp for some scraping. I want to manipulate the Page from a different thread, for example to get the HTML of the page at periodic intervals (the exact logic is not important).

Every time I call Puppeteer inside a thread, execution gets stuck on that line.

In this example code:

Browser _puppeteerBrowser = await PuppeteerSharp.Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = false,
    ExecutablePath = _chromePath
});

Page Page = (await _puppeteerBrowser.PagesAsync()).FirstOrDefault();

var task = Task.Run(async () =>
{
    var content = await Page.GetContentAsync(); // it never returns!!
    System.Console.WriteLine(content.Length);
});

await Page.GoBackAsync(); // it works fine

task.Wait(); // never ends because the call inside the thread is stuck

I tried different variations with Timer, Task, and Thread, but every time I do something with Puppeteer inside another thread, it hangs. How can I fix this? And in general, if we want to use the same Puppeteer instance from a different thread (for example, to observe whether something has changed in the browser outside of the 'main flow'), how can this be done?

dkokkinos
  • Why are you getting a page 10 times? – Jeroen van Langen May 22 '21 at 18:43
  • This is not the point. This is an example. The thing is that it hangs when called inside the thread. Also, I removed the for loop because it was confusing. – dkokkinos May 22 '21 at 19:02
  • Your task wraps an async call that wants to run a continuation in the same context where you've called `Task.Wait()`, with the latter call blocking the context and preventing the continuation from running. See duplicate. A quick fix might be to use `ConfigureAwait(false)` on the `Page.GetContentAsync()` call, but it would be better to just never call `Wait()` in the first place. There's not enough context in the question to be able to say exactly how you should make that sort of change though. – Peter Duniho May 22 '21 at 19:07
  • @PeterDuniho thank you for your response. The context is this: I want to periodically check the HTML of a site, but I would also like to execute other functions on this site with Puppeteer. So normally I would have a thread/timer that makes the periodic check, while the rest of the program (outside the thread/timer) executes some clicks or other stuff. But inside the thread whatever call in puppeteer hangs. If I add `await` to `Task.Run()` it works, but the rest of the program does not execute until the thread is finished, and this is not 'parallel' work. – dkokkinos May 22 '21 at 19:19
  • _"inside the thread whatever call in puppeteer hangs"_ -- nothing in the code you posted suggests to me that there's anything in Puppeteer that hangs. Rather, the async call you made cannot return to the await continuation while you've got the sync context blocked with the call to `task.Wait()`. That's exactly what the duplicate explains and shows how to fix. – Peter Duniho May 22 '21 at 19:25
  • @PeterDuniho so, trying to understand: is it that I cannot use the same Puppeteer instance inside another thread (e.g. just checking for HTML changes), even if nowhere else in the code I call anything on this instance? Or can the same instance of Puppeteer indeed be used inside another thread, and my threading implementation is wrong? (Keep in mind I tried with Task, Timer, and other things that wouldn't be a problem with other objects.) – dkokkinos May 22 '21 at 19:41
  • _"I cannot use the same Puppeteer instance inside another thread"_ -- I cannot speak for the thread-safety of Puppeteer objects. You can consult their documentation for that. But the thread-safety doesn't have anything to do with the deadlock you've caused. This issue would happen with _any_ Task-async API on _any_ object. That you're using Puppeteer in this case is only coincidental. The problem is that the sync context for the `await` that you say never completes depends on the thread you've blocked with the call to `task.Wait()`. The `await` _can't_ complete, _because_ of the `task.Wait()`. – Peter Duniho May 22 '21 at 20:03
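
The advice in the comments (start the background work, keep running the main flow, and `await` the task instead of calling `Wait()`) might be sketched as below. This is only an illustration under stated assumptions, not a confirmed fix for the code in the question. `PollingSketch`, `PollAsync`, and the `getContent` delegate are hypothetical names, and `getContent` is a stand-in for `() => Page.GetContentAsync()` so the sketch runs without a browser. Whether PuppeteerSharp objects are safe to use concurrently is not established here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PollingSketch
{
    // Hypothetical helper: calls getContent repeatedly, pausing for
    // `interval` between calls, until cancellation is requested.
    // Returns the number of completed polls.
    public static async Task<int> PollAsync(
        Func<Task<string>> getContent,
        TimeSpan interval,
        CancellationToken token)
    {
        int polls = 0;
        while (!token.IsCancellationRequested)
        {
            // ConfigureAwait(false): the continuation does not need to
            // resume on the caller's synchronization context.
            string content = await getContent().ConfigureAwait(false);
            Console.WriteLine(content.Length);
            polls++;

            try
            {
                await Task.Delay(interval, token).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation was requested during the delay
            }
        }
        return polls;
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource();

        // Assumption: stand-in for () => Page.GetContentAsync().
        Func<Task<string>> getContent = async () =>
        {
            await Task.Delay(10);
            return "<html></html>";
        };

        // Start the loop without blocking on it; it runs alongside the main flow.
        Task<int> poller = PollAsync(
            getContent, TimeSpan.FromMilliseconds(50), cts.Token);

        // Main flow continues here (stand-in for calls such as Page.GoBackAsync()).
        await Task.Delay(200);

        // Stop the loop and await its completion instead of calling Wait().
        cts.Cancel();
        int polls = await poller;
        Console.WriteLine($"polled {polls} time(s)");
    }
}
```

The point of the design is that nothing ever blocks a thread while waiting on the task: the polling loop is awaited only once the main flow is ready to stop it, so the main flow and the periodic check can interleave.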
