3

I'm slightly confused about the need (or otherwise) to add page listeners in puppeteer, in order to catch all errors.

In the following snippet, I've added handlers for the 'error' and 'pageerror' events. But won't these be caught by the try/catch around await page.goto()? What does the try/catch catch that isn't caught by the handlers, and vice versa? If the handlers are reacting to errors that aren't caught by try/catch, how do I handle those properly without getting an uncaughtException, since these errors happen outside the normal async/await flow? I really want to use the same error handling logic for all errors... i.e. await all errors, and handle any error the same way, not just some of them.

(async () => {
    try {
        const browser = await puppeteer.launch({ headless: true, args: ['--no-sandbox'] });
        const page = await browser.newPage();
        page.on('error', error => {
            console.error('Page on error:', error);
        });
        page.on('pageerror', error => {
            console.error('Page on pageerror:', error);
        });
        await page.goto('https://www.example.com');
        await page.screenshot({ path: './tmp/example.png' });
        await browser.close();
    } catch (err) {
        console.error('Try/catch error:', err);
    }
})();

I have tried wrapping the whole thing in a new Promise((resolve, reject) => { ... }) and then calling reject(error) in the page error handlers:

try {
    await new Promise((resolve, reject) => {
        (async () => {
            try {
                const browser = await puppeteer.launch({ headless: true, args: ['--no-sandbox'] });
                const page = await browser.newPage();
                page.on('error', error => {
                    console.error('Page on error:', error);
                    reject(error);
                });
                page.on('pageerror', error => {
                    console.error('Page on pageerror:', error);
                    reject(error);
                });
                await page.goto('https://www.example.com');
                await page.screenshot({ path: './tmp/example.png' });
                await browser.close();
                resolve();

            } catch (err) {
                reject(err);
            }
        })();
    });
} catch (err) {
    console.error('Try/catch error:', err);
}

But I'm still getting some odd uncaughtException errors (in my real code, which I can't yet reproduce in a simple example)... I can't figure out if the new Promise() approach is completely wrong in this context... or what the correct approach is.

drmrbrewer
  • 11,491
  • 21
  • 85
  • 181
  • 1
    Are you trying to catch Node (Puppeteer) errors or errors that might occur in the browser? These are two totally different processes. If the former, all of the normal error handling tools apply (`try`/`catch`/`.catch`). If the latter, then you can log them with `page.on` listeners (there's nothing to catch though because it's just a notice that an error occurred in another process). Regardless, [you don't need `new Promise` anywhere](https://stackoverflow.com/questions/23803743/what-is-the-explicit-promise-construction-antipattern-and-how-do-i-avoid-it). What's your actual use case/goal here? – ggorlen May 01 '23 at 14:10
  • This script looks fine--there is no chance of uncaught errors, so voting to close as missing a [mcve]. – ggorlen May 01 '23 at 14:17
  • @ggorlen the question makes it clear that I'm trying to catch *all* errors and handle them in a consistent way, regardless of *where* the error happens (node, puppeteer, browser, page). For instance in the example I give, if I'm responding to an http query by returning the generated `example.png`, if there is a problem in rendering the `.png`, I want to send an appropriate error response. I don't understand how I'm meant to `await` an error (so that I can return an appropriate response) when that error only appears asynchronously in a handler, and not in the `catch` block. – drmrbrewer May 01 '23 at 15:00
  • Is this code in an Express route? You are already handling all errors appropriately and so your use case/problem/goal is unclear to me. As mentioned, browser errors can't really be caught because they're in another process, so all you can do is log them in Node. I suppose you could promisify the `on` handler and reject it, but what's the point of that? The browser code handles those, and many of them are spurious anyway. To be clear, there are only two environments: Node and the browser. – ggorlen May 01 '23 at 15:56
  • 1
    "*I can't figure out if the new Promise() approach is completely wrong*" - yes, [don't put `async` code in the executor of a `new Promise`](https://stackoverflow.com/q/43036229/1048572)! "*I'm still getting some odd `uncaughtException` errors in my real code*" - what errors are these and where do they come from? – Bergi May 01 '23 at 18:37
  • @Bergi "you don't put async code in a new Promise()"... I was following [this answer](https://stackoverflow.com/a/61518855/4070848) from the same question you linked to. Maybe not recommended but still not completely illegal? – drmrbrewer May 01 '23 at 18:44
  • @drmrbrewer No, those are horrible answers. I have no idea how they have gotten to ~10 and ~60 upvotes respectively. It really is a pointless and error-prone pattern that should never be used. – Bergi May 01 '23 at 18:56
  • OK thanks @Bergi I'll adapt my thinking accordingly. Re "what errors are these and where do they come from?" ... they don't happen often and I haven't got to the bottom of where they're coming from... I thought I'd first take a step back and see whether the approach I'm taking is fundamentally wrong, and if so then correct it before doing any more evaluation. – drmrbrewer May 01 '23 at 19:03

1 Answers1

2

This request is a bit unusual, suggesting that with more information a better solution can be raised, but if you want to throw an error in Node based on a page.on handler, you could promisify those handlers and use Promise.race to stop the main line Puppeteer automation if the handler is triggered. Here's an example:

const puppeteer = require("puppeteer"); // ^19.11.1

const html = `<!DOCTYPE html><script>throw Error("hello world");</script>`;

(async () => {
  let browser;
  try {
    browser = await puppeteer.launch();
    const [page] = await browser.pages();
    const error = new Promise((resolve, reject) => {
      page.once("error", reject);
      page.once("pageerror", reject);
    });
    await Promise.race([
      error,
      (async () => {
        await page.setContent(html);
        await page.screenshot({path: "should_not_be_created.png"});
      })(),
    ]);
  }
  catch (err) {
    console.error(err.message); // => Error: hello world
  }
  finally {
    await browser?.close();
  }
})();
ggorlen
  • 44,755
  • 7
  • 76
  • 106
  • 1
    You may want to put these `page.once(…, reject)` *inside* the `new Promise` constructor executor callback so you won't need the weird assignment – Bergi May 01 '23 at 16:26
  • Or just import `once` from the *events* module and use `const error1 = once(page, 'error'), error2 = once(page, 'pageerror');` – Bergi May 01 '23 at 16:27
  • Thanks, this is exactly the sort of insight I was after. Maybe this is an X-Y problem situation, but I'm not yet convinced it is. Maybe on reflection I will agree that I don't need to be catching the browser errors (because they're in a separate process) but in my situation the user would be requesting a chart (yes, to a node Express server) and should be receiving an error response whether the chart generation failed in the node process or in the browser process... this distinction doesn't matter to them? Hence promisifying the handlers as per this answer isn't an unreasonable approach? – drmrbrewer May 01 '23 at 17:35
  • @drmrbrewer OK, that use case makes more sense, thanks for clarifying. On the other hand, you could potentially call `res.send` from all of the error handlers and achieve more or less the same result without having to unify all of the error handlng into a single `catch`. But maybe the unified approach makes sense if you want to be sure you're not callng `res.send` again and don't want to deal with a variable and an `if` to handle that. – ggorlen May 01 '23 at 17:49
  • Actually I'm wondering if my current approach, as mentioned in the OP where I said that "I have tried wrapping the whole thing in a `new Promise(...)`, is effectively the same as what you're suggesting. I've updated my question with the approach I've been using to try and unify the error handling... is that effectively the same as what you're suggesting? Is it a reasonable approach or is yours better? – drmrbrewer May 01 '23 at 18:38
  • @Bergi out of interest, where does `resolve/reject` fit into the `const error1 = once(page, 'error'), error2 = once(page, 'pageerror');` you mention? – drmrbrewer May 01 '23 at 18:46
  • @drmrbrewer I'm suggestng not promisifying anything. You can potentially call `res.send()` from the error handlers for page errors, and call `res.send()` as well for Node errors in the `catch` block. No extra promises needed. If you do choose to follow the promisification approach from my answer, I don't see benefit in promisifying anything more than the error handlers. Puppeteer's API is [already promise-based, so there's no point in adding another promise wrapper to it](https://stackoverflow.com/questions/23803743/what-is-the-explicit-promise-construction-antipattern-and-how-do-i-avoid-it). – ggorlen May 01 '23 at 18:52
  • @drmrbrewer Only certain `page.on` handlers aren't promise-based in Puppeteer, including the ones you're using. But 95% of the time you don't need to promisify them anyway, which is why I suggested that it might be an XY problem. You may have an exception here. – ggorlen May 01 '23 at 18:56
  • 1
    @drmrbrewer There would be no `reject`/`resolve` calls and no `new Promise` in that approach, just the `Promise.race(…)` call – Bergi May 01 '23 at 18:59
  • @drmrbrewer Looking again at your original promisified code, I don't see anything obviously wrong with it, but it's sort of hard to understand because of the layered promises so maybe I'm missing something. I can't explain why you'd see uncaught errors with that. Even if it does work, I prefer my own because the promisification is scoped tightly and there are fewer `try`/`catch` layers. One thing to deal with either way is making sure to run `browser.close()` regardless of an error, otherwise your route will leak memory. – ggorlen May 01 '23 at 21:16
  • Thanks, @ggorlen... also my approach uses `async` inside a `new Promise()` which @Bergi has made clear is a "bad thing". I agree that your approach is also easier to understand because it's more clear that you're just promisifying the listeners and putting those into a `Promise.race()` along with the node method itself, to catch whichever errors out first. – drmrbrewer May 02 '23 at 07:48