I am in love with probabilistic data structures. For my current problem, it seems that the count-min-sketch structure is almost the right candidate. I want to use count-min-sketch to store events per ID.
Let's assume I do have the following
Map<String, Int> {
[ID1, 10],
[ID2, 12],
[ID2, 15]
}
If I use a count-min-sketch, I can query the data structure by IDs and retrieve the ~counts.
Question
Actually I am interested in the average occurrence over all IDs, which in the example above would be: 12,33. If I am using the count-min then it seems that I need to store the Set of IDs and then iterate over the set and query the count-min for each ID and calculate the average. Is there an improved way without storing all IDs? Ideally I just want to retrieve the average right away without remembering all IDs.
Hope that make sense!?