1

My question is about the Cosmos DB trigger for Azure Functions. We are exploring the best way to trigger our functions. Our original idea was to push messages into a Service Bus queue and have the functions start from a Service Bus trigger. We know that when a function is triggered via Service Bus or a queue and the execution fails for any reason, the message goes back into the queue after the lock period expires. This suits our use case, but a premium Service Bus is fairly expensive ($600 per month).

I was wondering what happens when we use a Cosmos DB trigger instead. In this case, if the function fails (let's say because of an unhandled exception), is the trigger lost, or is there some way to manage a re-trigger? How can we manage retries and failure scenarios?

Anupam Chand

3 Answers

2

The Azure Functions trigger for Cosmos DB uses the Change Feed Processor Library internally. The docs about its error handling state:

To prevent your change feed processor from getting "stuck" continuously retrying the same batch of changes, you should add logic in your delegate code to write documents, upon exception, to a dead-letter queue. This design ensures that you can keep track of unprocessed changes while still being able to continue to process future changes. The dead-letter queue might be another Cosmos container. The exact data store does not matter, simply that the unprocessed changes are persisted.

The takeaway is that the Change Feed itself doesn't keep any state about processing errors -- that's up to you as the consumer. A better pattern is to pair this with a queue like Service Bus which provides dead-lettering for reliable error handling. One pattern is to have a dedicated function to read the Cosmos change feed (using the Cosmos trigger) which simply inserts interesting events into a queue to trigger other functions to process reliably.
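The forwarding pattern described above can be sketched roughly as follows. This is a minimal illustration, not Azure Functions code: `forward_changes` and `is_interesting` are hypothetical names, the `"type"` field is an assumed document property, and a `deque` stands in for the real queue so the example runs on its own.

```python
from collections import deque


def is_interesting(doc):
    # Assumed filter: replace with whatever marks a change as worth processing.
    return doc.get("type") == "order"


def forward_changes(docs, queue):
    """Change feed handler that does no business logic; it only enqueues events."""
    forwarded = 0
    for doc in docs:
        if is_interesting(doc):
            # A real function would send to Service Bus or a storage queue here.
            queue.append({"id": doc["id"], "body": doc})
            forwarded += 1
    return forwarded


# In-memory stand-in for the queue that triggers the downstream functions.
queue = deque()
batch = [{"id": "1", "type": "order"}, {"id": "2", "type": "audit"}]
forward_changes(batch, queue)
```

Keeping the change feed function this thin means the only thing that can fail in it is the enqueue itself; the actual processing then happens behind the queue, where redelivery and dead-lettering are available.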

Kashyap
Noah Stahl
  • "Cosmos DB trigger in Azure Functions uses the Change Feed Processor" can you quote the source? – Kashyap Feb 21 '21 at 04:15
  • 1
    "The Azure Functions trigger for Cosmos DB uses the Change Feed Processor Library internally" - https://learn.microsoft.com/en-us/azure/cosmos-db/how-to-configure-cosmos-db-trigger – Noah Stahl Feb 21 '21 at 13:48
  • The documentation says the error handling is different though. See Matias' answer: https://stackoverflow.com/a/66320240/151025 – Zenuka Mar 15 '23 at 14:05
2

While the Change Feed Trigger does use the Change Feed Processor underneath, the error handling is different.

Change Feed Processor, on an unhandled exception, would retry the same batch of documents again.

The Change Feed Trigger, on the other hand (similar to other triggers such as Event Hubs), continues on unhandled exceptions (reference: https://learn.microsoft.com/en-us/azure/cosmos-db/troubleshoot-changefeed-functions#some-changes-are-missing-in-my-trigger). From the code perspective, make sure to wrap your processing in try/catch blocks, as the reference recommends, to handle any failure scenario.

Normally, failed documents can be sent to a queue or a logging sink to analyze later. If they are sent to a dead-letter queue, you could have a QueueTrigger that retries, or any other mechanism depending on the nature of the failure.
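As a rough sketch of that try/catch idea (plain Python rather than a real trigger binding): each document is handled on its own, and a failure is recorded in a dead-letter store instead of escaping the function. `process_batch`, `handle`, and the `amount` field are hypothetical, and a list stands in for the dead-letter queue.

```python
def process_batch(docs, handle, dead_letters):
    """Process each changed document; park failures instead of raising."""
    succeeded = 0
    for doc in docs:
        try:
            handle(doc)
            succeeded += 1
        except Exception as exc:
            # Broad on purpose: one bad document must not cause the rest to be skipped.
            dead_letters.append({"document": doc, "error": repr(exc)})
    return succeeded


def handle(doc):
    # Hypothetical business logic that rejects documents without an amount.
    if "amount" not in doc:
        raise ValueError("missing amount")


# A list stands in for the dead-letter queue or container.
dead_letters = []
process_batch([{"id": "1", "amount": 5}, {"id": "2"}], handle, dead_letters)
```

Storing the whole document together with the error keeps enough context for a later retry or manual inspection.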

Matias Quaranta
  • Updated link: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/troubleshoot-changefeed-functions#some-changes-are-missing-in-your-trigger – Zenuka Mar 15 '23 at 14:06
0

I would think that by default it won't retry, but there is currently a preview feature for retry policies that can be used with all triggers and languages.

E.g., for all languages except C#, add this to function.json:

    "retry": {
        "strategy": "fixedDelay",
        "maxRetryCount": 3,
        "delayInterval": "00:00:05"
    }

And for C#, add the following attribute:

    [FixedDelayRetry(3, "00:00:05")]

For more information on the configuration options: https://learn.microsoft.com/sv-se/azure/azure-functions/functions-bindings-error-pages?tabs=csharp#retry-policies-preview
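If a runtime retry policy is not an option for your trigger, a similar fixed-delay behaviour can be approximated inside the function body. The sketch below is only an illustration of the idea: `retry_fixed_delay` is a hypothetical helper, not part of Azure Functions, and the way it counts attempts (one initial call plus up to `max_retries` retries) is this helper's own choice, not a statement about how the runtime counts them.

```python
import time


def retry_fixed_delay(func, max_retries=3, delay_seconds=5.0, sleep=time.sleep):
    """Call func; on exception wait delay_seconds and retry, up to max_retries times."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error to the caller
            attempt += 1
            sleep(delay_seconds)
```

The `sleep` parameter exists only so the delay can be swapped out in tests. Once the retries are exhausted the exception propagates, so it still makes sense to catch it and dead-letter the document.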

Bjorne
  • The retry policy support in the runtime for triggers other than Timer, Kafka, and Event Hubs is being removed after this feature becomes generally available (GA). Preview retry policy support for all triggers other than Timer and Event Hubs will be removed in December 2022. For more information, see the [Retries section](https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-error-pages?tabs=fixed-delay%2Cin-process&pivots=programming-language-csharp#retries) below. – Jonas Nov 23 '22 at 16:28