In Reliable Collections (specifically IReliableDictionary), one approach to implementing 'common' queries is to maintain a secondary dictionary whose keys are structured so that an enumeration returns them in a specific order. For large data sets, I would like to avoid shuttling a large amount of data around.
To achieve this, I would like to implement some sort of continuation token that the caller can supply when requesting data. I currently implement this by generating an ordered enumeration and returning the first n items, where n = the MAX_PAGE size. The continuation token is essentially the last key in that list of n items. The next time the caller passes in the continuation token, I generate the ordered enumerable with a filter function specifying that the key must be greater than the continuation token.
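To make the scheme concrete, here is a minimal sketch of the continuation-token idea over a plain in-memory sorted list. It is written in Python purely as a model and is not Service Fabric code; `get_page`, `MAX_PAGE` as a module constant, and the sample keys are all hypothetical names introduced for illustration.

```python
from bisect import bisect_right

MAX_PAGE = 3  # hypothetical page size, stands in for the real MAX_PAGE


def get_page(sorted_keys, continuation=None, page_size=MAX_PAGE):
    """Return (page, next_token) for a list of keys already in sorted order.

    `continuation` is the last key of the previous page, or None for the
    first page. `next_token` is None when no keys remain after this page.
    """
    # Start just past the continuation key (keys strictly greater than it).
    start = 0 if continuation is None else bisect_right(sorted_keys, continuation)
    page = sorted_keys[start:start + page_size]
    has_more = start + page_size < len(sorted_keys)
    next_token = page[-1] if page and has_more else None
    return page, next_token


keys = ["a", "b", "c", "d", "e", "f", "g"]
page1, token1 = get_page(keys)           # first page, no token yet
page2, token2 = get_page(keys, token1)   # caller passes the token back
page3, token3 = get_page(keys, token2)   # last page, token3 is None
```

The token carries only a key, so the service keeps no per-caller state between requests.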
This has two problems (that I can see):
- The collection could change between the caller's first page request and a subsequent one. I'm not certain I can avoid this, since updates to the collection need to be able to occur at any time, regardless of who is attempting to page through the data.
- I'm not certain how the filter function is applied. I would assume that, since a developer could filter on anything, the CreateEnumerableAsync() method must evaluate the filter against every key in the dictionary before returning the enumerable. For a sufficiently large data set, this seems slow.
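Both concerns can be illustrated with a small in-memory model. This is only a sketch in Python under my own assumptions: it says nothing about how Reliable Collections evaluate the filter internally, and `page_by_seek`, `page_by_filter`, and the sample keys are hypothetical. It shows what a key-based token does when keys are added or removed between requests, and how many keys a predicate-style filter touches compared with seeking directly to the token.

```python
from bisect import bisect_right, insort


def page_by_seek(sorted_keys, token, page_size):
    """Jump straight to the first key greater than the token."""
    start = 0 if token is None else bisect_right(sorted_keys, token)
    return sorted_keys[start:start + page_size]


def page_by_filter(sorted_keys, token, page_size):
    """Apply 'key > token' as a predicate while walking from the start.

    Returns (page, calls), where calls counts the keys examined.
    """
    calls = 0
    page = []
    for key in sorted_keys:
        calls += 1
        if token is None or key > token:
            page.append(key)
            if len(page) == page_size:
                break
    return page, calls


keys = ["k01", "k02", "k03", "k04", "k05", "k06"]
first = page_by_seek(keys, None, 2)
token = first[-1]                      # "k02"

# The collection changes between the two requests.
keys.remove("k02")                     # the token's own key is deleted
insort(keys, "k00")                    # sorts before the token: never returned
insort(keys, "k025")                   # sorts after the token: shows up next

second = page_by_seek(keys, token, 2)
filtered, examined = page_by_filter(keys, token, 2)
```

In this model the token stays usable even after its key is deleted, no key is returned twice, and a key inserted behind the token is simply missed. The predicate version returns the same page but examines every key that sorts before the token first, which is the cost I am worried about on a large dictionary.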
Are there any prescribed approaches for paging data like this? I am beginning to feel like I might be barking up the wrong tree with Reliable Collections for some of my use cases.