2

We have a fairly complex code base in NodeJS that runs a lot of Promises synchronously. Some of them come from Firebase (firebase-admin), some from other Google Cloud libraries, some are local MongoDB requests. This code works mostly fine, millions of promises being fulfilled over the course of 5-8 hours.

But sometimes we get promises rejected due to external reasons like network timeouts. For this reason, we have try-catch blocks around all of the Firebase or Google Cloud or MongoDB calls (the calls are awaited, so a rejected promise should be caught be the catch blocks). If a network timeout occurs, we just try it again after a while. This works great most of the time. Sometimes, the whole thing runs through without any real problems.

However, sometimes we still get unhandled promises being rejected, which then appear in the process.on('unhandledRejection', ...). The stack traces of these rejections look like this, for example:

Warn: Unhandled Rejection at: Promise [object Promise] reason: Error stack: Error: 
    at new ApiError ([repo-path]\node_modules\@google-cloud\common\build\src\util.js:59:15)
    at Util.parseHttpRespBody ([repo-path]\node_modules\@google-cloud\common\build\src\util.js:194:38)
    at Util.handleResp ([repo-path]\node_modules\@google-cloud\common\build\src\util.js:135:117)
    at [repo-path]\node_modules\@google-cloud\common\build\src\util.js:434:22
    at onResponse ([repo-path]\node_modules\retry-request\index.js:214:7)
    at [repo-path]\node_modules\teeny-request\src\index.ts:325:11
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)

This is a stacktrace which is completely detached from my own code, so I have absolutely no idea where I could improve my code to make it more robust against errors (error message seems to be very helpful too).

Another example:

Warn: Unhandled Rejection at: Promise [object Promise] reason: MongoError: server instance pool was destroyed stack: MongoError: server instance pool was destroyed
    at basicWriteValidations ([repo-path]\node_modules\mongodb\lib\core\topologies\server.js:574:41)
    at Server.insert ([repo-path]\node_modules\mongodb\lib\core\topologies\server.js:688:16)
    at Server.insert ([repo-path]\node_modules\mongodb\lib\topologies\topology_base.js:301:25)
    at OrderedBulkOperation.finalOptionsHandler ([repo-path]\node_modules\mongodb\lib\bulk\common.js:1210:25)
    at executeCommands ([repo-path]\node_modules\mongodb\lib\bulk\common.js:527:17)
    at executeLegacyOperation ([repo-path]\node_modules\mongodb\lib\utils.js:390:24)
    at OrderedBulkOperation.execute ([repo-path]\node_modules\mongodb\lib\bulk\common.js:1146:12)
    at BulkWriteOperation.execute ([repo-path]\node_modules\mongodb\lib\operations\bulk_write.js:67:10)
    at InsertManyOperation.execute ([repo-path]\node_modules\mongodb\lib\operations\insert_many.js:41:24)
    at executeOperation ([repo-path]\node_modules\mongodb\lib\operations\execute_operation.js:77:17)

At least this error message says something.

All my Google Cloud or MongoDB calls have await and try-catch blocks around them (and the MongoDB reference is recreated in the catch block), so if the promise were rejected inside those calls, the error would be caught in the catch block.

A similar problem sometimes happens in the Firebase library. Some of the rejected promises (e.g. because of network errors) get caught by our try-catch blocks, but some don't, and I have no possibility to improve my code, because there is no stack trace in that case.

Now, regardless of the specific causes of these problems: I find it very frustrating that the errors just happen on a global scale (process.on('unhandledRejection', ...), instead of at a location in my code where I can handle them with a try-catch. This makes us lose so much time, because we have to restart the whole process when we get into such a state.

How can I improve my code such that these global exceptions do not happen again? Why are these errors global unhandled rejections when I have try-catch blocks around all the promises?

It might be the case that these are the problems of the MongoDB / Firebase clients: however, more than one library is affected by this behavior, so I'm not sure.

Jonas Sourlier
  • 13,684
  • 16
  • 77
  • 148

2 Answers2

2

a stacktrace which is completely detached from my own code

Yes, but does the function you call have proper error handling for what IT does?
Below I show a simple example of why your outside code with try/catch can simply not prevent promise rejections

//if a function you don't control causes an error with the language itself, yikes

//and for rejections, the same(amount of YIKES) can happen if an asynchronous function you call doesn't send up its rejection properly
//the example below is if the function is returning a custom promise that faces a problem, then does `throw err` instead of `reject(err)`)

//however, there usually is some thiAPI.on('error',callback) but try/catch doesn't solve everything
async function someFireBaseThing(){
  //a promise is always returned from an async function(on error it does the equivalent of `Promise.reject(error)`)
  //yet if you return a promise, THAT would be the promise returned and catch will only catch a `Promise.reject(theError)`
  
  return await new Promise((r,j)=>{
    fetch('x').then(r).catch(e=>{throw e})
    //unhandled rejection occurs even though e gets thrown
    //ironically, this could be simply solved with `.catch(j)`
    //check inspect element console since stackoverflow console doesn't show the error
  })
}
async function yourCode(){
  try{console.log(await someFireBaseThing())}
  catch(e){console.warn("successful handle:",e)}
}
yourCode()

Upon reading your question once more, it looks like you can just set a time limit for a task and then manually throw to your waiting catch if it takes too long(because if the error stack doesn't include your code, the promise that gets shown to unhandledRejection would probably be unseen by your code in the first place)

function handler(promise,time){ //automatically rejects if it takes too long
  return new Promise(async(r,j)=>{
    setTimeout(()=>j('promise did not resolve in given time'),time)
    try{r(await promise)} catch(err){j(err)}
  })
}
async function yourCode(){
  while(true){ //will break when promise is successful(and returns)
    try{return await handler(someFireBaseThing(...someArguments),1e4)}
    catch(err){yourHandlingOn(err)}
  }
}
The Bomb Squad
  • 4,192
  • 1
  • 9
  • 17
  • Thanks, but if that's the problem I'm facing, there's not much I can do, right? (apart from submitting a pull request to the lib maintainers). So this is similar to a function not bubbling up its internal rejections to the caller. – Jonas Sourlier Mar 20 '22 at 21:07
  • well, you can use your `process.on('unhandledRejection',smartCallBack)` to work with stuff, and you can have it determine which handler to call using the promise itself as a weakmap since the callback gets the promise rejected it seems like, according to [this](https://nodejs.org/api/process.html#event-unhandledrejection) – The Bomb Squad Mar 20 '22 at 23:47
  • Sorry, don't understand. I know what a weakmap is, but I have absolutely no clue how to use them to solve my scenario. Can you please provide a bit more detail? – Jonas Sourlier Mar 21 '22 at 07:56
  • I'm having a hard time replicating the rejection as I did on browser, doing the same thing on nodejs is only throwing `uncaughtException` instead of `unhandledRejection`, but the plan would be(in theory) to make a `TRY` function which takes a `Promise` and an `onFail` callback to then use the promise caught in `unhandledRejection` to pass it back to the given `onFail` function.. in short, a custom try/catch system – The Bomb Squad Mar 21 '22 at 11:35
  • Sorry, but I still don't understand. Can you give me an example of such a custom try/catch system? – Jonas Sourlier Mar 21 '22 at 12:25
  • well, I gave something else, and if the unhandled rejection's stack trace doesn't include your code, using a weakmap to see where the promise "belonged to" probably won't make sense :{ however you can implement a time limit to determine if the promise is simply not returning – The Bomb Squad Mar 21 '22 at 12:30
  • Good idea! Gonna try that. – Jonas Sourlier Mar 21 '22 at 13:15
  • I think you should put the `setTimeout` call before the try-catch clause, because in the try-catch clause, you have an `await`. Agree? – Jonas Sourlier Jul 18 '22 at 09:58
  • @cheesus yea you're correct.. sorry about that.. just fixed it ;-; – The Bomb Squad Jul 18 '22 at 13:17
2

Elaborating on my comment, here's what I would bet is going on: You set up some sort base instance to interact with the API, then use that instance moving forward in your calls. That base instance is likely an event emitter that itself can emit an 'error' event, which is a fatal unhandled error with no 'error' listener setup.

I'll use postgres for an example since I'm unfamiliar with firebase or mongo.


// Pool is a pool of connections to the DB
const pool = new (require('pg')).Pool(...);

// Using pool we call an async function in a try catch
try {
  await pool.query('select foo from bar where id = $1', [92]);
}
catch(err) {
  // A SQL error like no table named bar would be caught here.
  // However a connection error would be emitted as an 'error'
  // event from pool itself, which would be unhandled
}

The solution in the example would be to start with

const pool = new (require('pg')).Pool(...);
pool.on('error', (err) => { /* do whatever with error */ })
leitning
  • 1,081
  • 1
  • 6
  • 10
  • Gonna check that, thanks. However, I'm afraid such an error would still be "global", meaning I cannot relate it to a specific task I'm running with the DB. For example, let's say I have 100 database requests running at the same time - if such an error event appears, how do I know which database request needs to be restarted? The relevant promise is neither rejected nor resolved, so my `Promise.all(...)` promise won't resolve. – Jonas Sourlier Mar 21 '22 at 09:11
  • You could make it singular by using a helper function that always adds and removes the listener, but the nature of that sort of error would presumably be global. If you can't connect to the database everything probably fails anyway right? – leitning Mar 21 '22 at 09:12
  • Not really. Let's say I have 50'000 database queries (or HTTP fetches) to do (yes the project has a lot of data). I bundle those 50'000 requests into one big promise which executes them, 100 at a time. This works most of the time, but sometimes, one of the promises runs into a "global error", for example a problem in the database client. In this case, I just need to re-create the client database connection pool and run the request again. Or, in case of a network error, just run the HTTP request again. But I cannot do that if the error is "global", because I don't know which request to re-run. – Jonas Sourlier Mar 21 '22 at 09:53