The data set consists of increasing values. Data is always appended, and thus is inherently sorted. There is no need for random insertions in the middle or beginning of the set, however, random deletions will occur.
Operations consist of:
- Finding the the nth value,
- Finding the position N for a given value
- Appending a sorted value at the end
- Random removal of a given value or index
Value iteration/locality is not at all important
for similar questions people proposed using a map of values-index as a solution, however in this case the indexes themselves must be sorted, e.g.
arr[N] < arr[N+x]
. The array+map solution does not work here.
The sequence itself needs to be stored on disk and I have a key-value database (lmdb) at my disposal. Having something which is serialized/deserialized in its entirety is also something to be avoided, as it is usually opened and closed fairly quickly, and deserializing the set in its entirety would potentially evict precious pages from elsewhere.
The set will contain ~100k items (and several thousand of these sets may be open simultaneously).
The closest in-memory structure is probably a skiplist, but I've yet to find an implementation that supports finding the nth element, and also unsure how it would be loaded/stored on disk