The Change Feed Processor options are well described here.
I have a few questions about them:
1. leaseRenewInterval: Suppose an instance cannot renew its lease within 17s (the default lease renew interval). Is the lease removed from that instance right away, or does the processor wait until leaseExpirationInterval has elapsed before removing it, giving the instance a chance to reacquire the lease within 60s?
2. Does leaseRenew by default happen only after a checkpoint, or are the two independent? That is, can leaseRenew run on a separate thread once leaseRenewInterval has elapsed, while another thread is still working on a batch?
3. We have seen the error:
failed to checkpoint for owner 'null' with continuation token.
How can this happen? Why can the owner become null?
4. We have also seen the exception LeaseLostException. Can this happen even if the pod/instance is not down? We are not expecting any load balancing, since there is only 1 physical partition, but we want our system to be fault tolerant, so we run multiple instances, of which all but one are always waiting to acquire the lease.
5. There are a few cases where we can see 3 pods/instances holding the lease for the same physical partition at the same time, i.e. they all acquired the same lease. (We can have at most 1 physical partition: the document TTL is 3 days and storage is small, so we do not expect more than 1.) How can this happen?
EDITS:
Current Settings:
leaseRenewInterval: 17s
leaseAcquireInterval: 13s
leaseExpirationInterval: 60s
feedPollDelay: 2s [only this is not the default]
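To make the timing in the first question concrete, here is a minimal sketch of the settings above using only java.time.Duration, so it runs without the SDK. The class LeaseTimings and its helper are hypothetical names, not part of azure-cosmos. The arithmetic assumes that renew attempts fire at fixed multiples of leaseRenewInterval after the last successful renewal and that the lease is only lost once leaseExpirationInterval has passed. That is exactly what the first question asks, so treat it as unconfirmed.

```java
import java.time.Duration;

// Hypothetical holder for the settings listed above; not part of azure-cosmos.
public class LeaseTimings {
    static final Duration LEASE_RENEW_INTERVAL = Duration.ofSeconds(17);
    static final Duration LEASE_ACQUIRE_INTERVAL = Duration.ofSeconds(13);
    static final Duration LEASE_EXPIRATION_INTERVAL = Duration.ofSeconds(60);
    static final Duration FEED_POLL_DELAY = Duration.ofSeconds(2); // only non-default value

    // With the SDK on the classpath, these values would be passed to
    // com.azure.cosmos.models.ChangeFeedProcessorOptions, e.g.:
    //   new ChangeFeedProcessorOptions()
    //       .setLeaseRenewInterval(LEASE_RENEW_INTERVAL)
    //       .setLeaseAcquireInterval(LEASE_ACQUIRE_INTERVAL)
    //       .setLeaseExpirationInterval(LEASE_EXPIRATION_INTERVAL)
    //       .setFeedPollDelay(FEED_POLL_DELAY);

    // Counts renew ticks that occur at or before the expiration deadline,
    // ASSUMING ticks happen at fixed multiples of the renew interval.
    static long renewAttemptsBeforeExpiry(Duration renew, Duration expiration) {
        if (renew.isZero() || renew.isNegative()) {
            throw new IllegalArgumentException("renew interval must be positive");
        }
        return expiration.toMillis() / renew.toMillis();
    }

    public static void main(String[] args) {
        long attempts = renewAttemptsBeforeExpiry(
                LEASE_RENEW_INTERVAL, LEASE_EXPIRATION_INTERVAL);
        System.out.println("Renew attempts before expiry: " + attempts);
    }
}
```

Under that assumption, 17s and 60s leave room for 3 renew attempts (at 17s, 34s and 51s) before the lease would expire.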
Change Feed Processor version:
- We are using the dependency below in our Maven build:
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-cosmos</artifactId>
<version>4.8.0</version>
</dependency>
So I assume the CFP version is 4.8.0.