12

I'm using the Change Feed Processor library (or the Azure Functions Cosmos DB trigger) to subscribe to collection updates. How do I set up multiple independent (non-competing) consumers of the change feed of the same collection?

One way is to use multiple lease collections, e.g. leases1, leases2 etc. But that is a bit wasteful.

Is there a way to do that with just one lease collection? (e.g. by specifying a consumer group name somewhere, similar to Event Hubs Processor)

Mikhail Shilkov

2 Answers

8

You can define a leaseCollectionPrefix for an Azure Function Cosmos DB Trigger. In the Azure portal just click on your function, then on Integrate, then on Advanced editor, which will open your function.json. There you can define the property on your trigger, e.g.

{
  "bindings": [
    {
      "type": "cosmosDBTrigger",
      "name": "documents",
      "direction": "in",
      "leaseCollectionName": "leases",
      "connectionStringSetting": "myDatabase_DOCUMENTDB",
      "databaseName": "myDbName",
      "collectionName": "myCollectionName",
      "createLeaseCollectionIfNotExists": false,
      "leaseCollectionPrefix": "myFunctionSpecificValue"
    }
  ]
}
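To get a second, independent consumer, another Function could point at the same lease collection and differ only in its prefix. A minimal sketch of that second Function's function.json, assuming the same database and collection as above; the prefix value `myOtherFunctionSpecificValue` is a hypothetical placeholder:

```json
{
  "bindings": [
    {
      "type": "cosmosDBTrigger",
      "name": "documents",
      "direction": "in",
      "leaseCollectionName": "leases",
      "connectionStringSetting": "myDatabase_DOCUMENTDB",
      "databaseName": "myDbName",
      "collectionName": "myCollectionName",
      "createLeaseCollectionIfNotExists": false,
      "leaseCollectionPrefix": "myOtherFunctionSpecificValue"
    }
  ]
}
```

With distinct prefixes, each Function should keep its own set of lease documents in `leases` and track its progress through the feed separately.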

Additional settings are described in the official documentation:

The following settings customize the internal Change Feed mechanism and Lease collection usage, and can be set in the function.json in the Advanced Editor with the corresponding property names:

  • leaseCollectionPrefix : When set, it adds a prefix to the leases created in the Lease collection for this Function, effectively allowing two separate Azure Functions to share the same Lease collection by using different prefixes.
  • feedPollDelay : When set, it defines, in milliseconds, the delay in between polling a partition for new changes on the feed, after all current changes are drained. Default is 5000 (5 seconds).
  • leaseAcquireInterval : When set, it defines, in milliseconds, the interval to kick off a task to compute if partitions are distributed evenly among known host instances. Default is 13000 (13 seconds).
  • leaseExpirationInterval : When set, it defines, in milliseconds, the interval for which the lease is taken on a lease representing a partition. If the lease is not renewed within this interval, it will cause it to expire and ownership of the partition will move to another instance. Default is 60000 (60 seconds).
  • leaseRenewInterval : When set, it defines, in milliseconds, the renew interval for all leases for partitions currently held by an instance. Default is 17000 (17 seconds).
  • checkpointFrequency : When set, it defines, in milliseconds, the interval between lease checkpoints. Default is always after a successful Function call.
  • maxItemsPerInvocation : When set, it customizes the maximum amount of items received per Function call.
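As an illustration, the settings listed above could be combined in one trigger binding roughly like this. It is only a sketch: the interval values are the documented defaults quoted above, `checkpointFrequency` is left out so the default behaviour applies, and the `maxItemsPerInvocation` value of 100 is an arbitrary example, not a recommendation:

```json
{
  "bindings": [
    {
      "type": "cosmosDBTrigger",
      "name": "documents",
      "direction": "in",
      "leaseCollectionName": "leases",
      "connectionStringSetting": "myDatabase_DOCUMENTDB",
      "databaseName": "myDbName",
      "collectionName": "myCollectionName",
      "leaseCollectionPrefix": "myFunctionSpecificValue",
      "feedPollDelay": 5000,
      "leaseAcquireInterval": 13000,
      "leaseExpirationInterval": 60000,
      "leaseRenewInterval": 17000,
      "maxItemsPerInvocation": 100
    }
  ]
}
```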
Chief Wiggum
  • Ah, cool, so they've made it. Could you extend your answer for direct usage of `ChangeFeedProcessor` (without Functions)? – Mikhail Shilkov Mar 27 '18 at 12:25
  • I'm sorry, I have no experience with the ChangeFeedProcessor. What I found was this [link](https://github.com/Azure/azure-documentdb-dotnet/blob/f5ae62aabf9cbe891cfc2f7d5da2f7941463a224/samples/ChangeFeedProcessor/DocumentDB.ChangeFeedProcessor/ChangeFeedEventHost.cs), which might help you. It's an example where they use a ChangeFeedProcessor and a LeasePrefix. – Chief Wiggum Mar 27 '18 at 12:37
  • 1
    How can I achieve this if I have set up continuous integration for my Functions? There is no leaseCollectionPrefix property available on the cosmosDBTrigger attribute. – marek_lani Aug 15 '18 at 15:48
  • @ChiefWiggum Questions: 1. `leaseRenewInterval`: Suppose an instance cannot renew its lease within 17s. Will the lease be taken away from that instance, or will the feed wait until `leaseExpirationInterval` elapses, giving the instance a chance to reacquire the lease within 60s? 2. Does `leaseRenew` happen after `checkpoint` by default, or are the two independent, i.e. can `leaseRenew` happen on a separate thread while another thread works on a batch? 3. I have seen the error `failed to checkpoint for owner 'null' with continuation token`. How can this happen? Why can the owner become null? – Vivek Vardhan Jul 05 '21 at 06:15
  • @Vivek Sorry, I don't know. Please ask a new question. – Chief Wiggum Jul 05 '21 at 06:20
2

I've noticed some inconsistencies between consuming the change feed through the Change Feed Processor Library directly vs through the Functions integration.

When using the Change Feed Processor Library, documents like this are generated:

{
    "id": "somegraph.documents.azure.com_obtRAA==_obtRAJvr8AU=..0",
    "_etag": "\"47006e54-0000-0000-0000-59d4fdf20000\"",
    "state": 2,
    "PartitionId": "0",
    "Owner": "CosmosChangeIngestionServiceType",
    "ContinuationToken": "\"143641\"",
    "SequenceNumber": 3322,
    "_rid": "obtRAIhO1RIFAAAAAAAAAA==",
    "_self": "dbs/obtRAA==/colls/obtRAIhO1RI=/docs/obtRAIhO1RIFAAAAAAAAAA==/",
    "_attachments": "attachments/",
    "_ts": 1507130866
}

Lease documents generated from Functions, suspiciously, have the Owner property set to null. My understanding was that the Owner field differentiates change feed consumers and would allow multiple consumers to track their progress in the same lease collection (which would obviously be ideal). I'm not sure whether this is a bug or something I missed when setting up the Function binding, but it seems that currently you can only have one Function consumer per lease collection.

UPDATE:

Just had a weekly call with the Cosmos team and asked them this specific question as well as what the status of other lease storage providers such as Table Storage was. They're supposed to be getting back to us by end of day with some clarifications. I will update further when we get back the official information.

SliverNinja - MSFT
Jesse Carter
  • 2
    Jesse, that's correct, you can only have 1 Function per lease collection. You can scale the number of Function instances, though, as each Function instance equals 1 Change Feed Processor Host. Having 2 Functions bound to the same lease collection (each Function with different code) won't work; only 1 Function will trigger. This is the same as creating 2 Hosts with different code tied to the same lease collection. – Matias Quaranta Oct 04 '17 at 17:21
  • Additionally, the Function Host will claim ownership of the leases and update the `Owner` property accordingly. Is your Function starting? Do you see any errors in the Functions UI logs? – Matias Quaranta Oct 04 '17 at 17:24
  • @MS Allowing only one Function consumer per lease collection makes it hard to scale and keep costs down. You end up paying per lease collection (*which only contains 25 documents for partitioned collections*), so not being able to handle multiple Functions in one place is a pain. Scale settings and cost are separate for each lease collection/Function you add (*min 400RU/s per function*). This needs to be fixed, or costs will get ridiculous when using this design with multiple change feeds/Functions. – SliverNinja - MSFT Dec 29 '17 at 15:07
  • [`ChangeFeedHostOptions` doesn't appear to be available yet](https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.documents.changefeedprocessor.changefeedhostoptions?view=azure-dotnet). We [should be able to tie into the `LeasePrefix` in `function.json`](https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.documents.changefeedprocessor.changefeedhostoptions.leaseprefix). [MS Documentation mentions this configuration](https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-documentdb#trigger---configuration), but there is no associated way to assign it. – SliverNinja - MSFT Dec 29 '17 at 15:09
  • 2
    @DrewMarsh LeasePrefix support was released in Azure Functions to be able to share the lease collection among multiple Functions, it is described [here](https://medium.com/@Ealsur/azure-cosmos-db-functions-cookbook-multi-trigger-f8938673de57). – Matias Quaranta May 30 '18 at 09:48