I'm looking for an efficient quantiles algorithm that allows sample values to be "upserted" or replaced as the value changes over time.
Let's say I have values for items 1 to n. I'd like to put these into a quantiles algorithm that would store them efficiently. But then say at some point in the future, the value for item i gets incremented. I'd like to remove the original value for item i and replace it with the updated value. The specific use case is a streaming system where the sample values are incrementing over time.
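To make the semantics I'm after concrete, here is a minimal exact baseline in Python. It is only a sketch of the desired behavior, not a proposed solution: the class and method names (`UpsertQuantiles`, `upsert`, `quantile`) are made up for illustration, it keeps every value in memory, and each upsert costs O(n) because of list shifting. What I'm looking for is something with this interface but with sketch-like space and update cost.

```python
import bisect


class UpsertQuantiles:
    """Exact baseline (hypothetical names): stores every current value."""

    def __init__(self):
        self._by_item = {}  # item id -> current value
        self._sorted = []   # all current values, kept in ascending order

    def upsert(self, item, value):
        # If the item already has a value, remove one occurrence of it first.
        if item in self._by_item:
            old = self._by_item[item]
            idx = bisect.bisect_left(self._sorted, old)
            del self._sorted[idx]
        # Insert the new value at its sorted position and remember it.
        bisect.insort(self._sorted, value)
        self._by_item[item] = value

    def quantile(self, q):
        # Simple rank-based lookup for 0 <= q <= 1; no interpolation.
        if not self._sorted:
            raise ValueError("no samples")
        idx = min(len(self._sorted) - 1, int(q * len(self._sorted)))
        return self._sorted[idx]


uq = UpsertQuantiles()
uq.upsert("a", 1)
uq.upsert("b", 5)
uq.upsert("c", 9)
median_before = uq.quantile(0.5)  # values are [1, 5, 9]

uq.upsert("b", 20)                # item "b" is incremented from 5 to 20
median_after = uq.quantile(0.5)   # values are now [1, 9, 20]
```

The key point is that the second `upsert("b", ...)` replaces the old sample rather than adding a fourth one, so the sample count stays at n.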
The closest I've seen to something like this is the t-digest data structure. It stores sample values efficiently. The only thing it lacks is the ability to remove and replace a sample value.
I've also looked at the quantiles sketches in Apache DataSketches. They suffer from the same problem: there is no way to remove and replace a sample.
Edit: thinking about this more, there wouldn't necessarily need to be an explicit removal of the old value followed by an insertion of the incremented value. There might be a way to recalculate the internal state more easily if there's a constraint that existing values can only be updated (in my case, only incremented).
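As a rough illustration of what "recalculating internal state" could mean, here is a sketch that moves a count from one fixed-width bucket to another when an item's value changes. This is an assumption on my part, not an established algorithm: the names (`BucketedUpsertQuantiles`, `bucket_width`) are hypothetical, the answer is only accurate to within one bucket width, and it still needs a per-item map from item to bucket, which is exactly the memory cost I'd like to avoid.

```python
class BucketedUpsertQuantiles:
    """Approximate sketch (hypothetical): fixed-width buckets with movable counts."""

    def __init__(self, bucket_width):
        self.width = bucket_width
        self.counts = {}   # bucket index -> number of items currently in it
        self.by_item = {}  # item id -> bucket index (still O(n) memory)

    def upsert(self, item, value):
        new_bucket = int(value // self.width)
        old_bucket = self.by_item.get(item)
        if old_bucket == new_bucket:
            return  # value changed but stayed in the same bucket
        if old_bucket is not None:
            # Move the item out of its old bucket.
            self.counts[old_bucket] -= 1
            if self.counts[old_bucket] == 0:
                del self.counts[old_bucket]
        self.counts[new_bucket] = self.counts.get(new_bucket, 0) + 1
        self.by_item[item] = new_bucket

    def quantile(self, q):
        # Returns the lower edge of the bucket containing the q-th ranked item.
        total = len(self.by_item)
        if total == 0:
            raise ValueError("no samples")
        target = min(total - 1, int(q * total))
        seen = 0
        for bucket in sorted(self.counts):
            seen += self.counts[bucket]
            if seen > target:
                return bucket * self.width


bq = BucketedUpsertQuantiles(bucket_width=10)
bq.upsert("a", 3)
bq.upsert("b", 12)
bq.upsert("c", 47)
approx_before = bq.quantile(0.5)  # buckets 0, 1, 4 are occupied

bq.upsert("b", 55)                # "b" moves from bucket 1 to bucket 5
approx_after = bq.quantile(0.5)   # buckets 0, 4, 5 are occupied
```

Since my values only ever increase, an item can only move to the same or a higher bucket, which is the kind of constraint I'm hoping a real data structure could exploit.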