Questions tagged [count-min-sketch]

The count-min sketch is a sketch (a probabilistic data structure summarizing a data stream) useful for finding frequent items in data streams.

14 questions
9
votes
2 answers

How to get top K elements from count-min-sketch?

I'm reading how the probabilistic data structure count-min-sketch is used in finding the top k elements in a data stream. But I cannot seem to wrap my head around the step where we maintain a heap to get our final answer. The problem: We have a…
6
votes
0 answers

How can i determine the width and depth of a count-min sketch?

The width (number of buckets) and depth (number of hash functions) of a Count-Min sketch determine the accuracy of the frequency estimation retrieved. From the 2005 paper of the original Count-Min authors: The parameters w and d can be chosen by…
5
votes
2 answers

Why are bloom filters not implemented like count-min sketch?

So I only recently learned about these, but from what I understood counting bloom filters are very similar to count-min sketches. The difference being that the former use a single array for all hash functions and the latter use an array per hash…
aa8y
  • 3,854
  • 4
  • 37
  • 62
5
votes
1 answer

What is a count-min sketch? When would you use it?

What is a count-min sketch? In what situations would it be useful?
templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
4
votes
2 answers

Does the count-min sketch take less space than a typical sparse vector format?

The count-min sketch is a probabilistic data structure for lossy storage of counts in a multiset. It receives updates (i, c) where i is an element of a set and c is a non-negative quantity for that element, then does clever things with hash…
3
votes
2 answers

store top k results from count-min-sketch

I need to store top k most frequent elements in a stream. To estimate the frequency i use count-min-sketch algorithms. My stream is consist of keys(as string). So basically every time i encounter a new key in my stream, i calculate the frequency of…
BYsBro
  • 65
  • 1
  • 3
3
votes
1 answer

How does count min sketch find the most frequent item in a stream? - Heavy Hitters

Count min sketch uses different hash functions to map elements in the stream to the hash function. How to map back from the sketch to find the most frequent item? Considering that enough elements have been passes(millions) and we don’t know the…
user3508140
  • 285
  • 2
  • 18
3
votes
0 answers

Non-trivial usage of count-min sketch data-structure

I have a large array with increasing values - like this: array = [0, 1, 6, 6, 12, 13, 22, ..., 92939, 92940] and I want to use interpolation-search algorithm on it. Size of the array is variable, new elements are added to the end of the array. I…
Evgeny Lazin
  • 9,193
  • 6
  • 47
  • 83
1
vote
1 answer

Retrieve the average count in count-min-sketch datastructure

I am in love with probabilistic data structures. For my current problem, it seems that the count-min-sketch structure is almost the right candidate. I want to use count-min-sketch to store events per ID. Let's assume I do have the following…
Fritz
  • 872
  • 1
  • 8
  • 17
0
votes
1 answer

Use which hash functions for count-min sketch?

I am trying to implement count-min sketch with JavaScript. As I need some hash functions, I can not figure out which ones I can/should/must use. Any suggestions? If this question is stupid, I'd appreciate a hint. Thanks
AndreasInfo
  • 1,062
  • 11
  • 26
0
votes
1 answer

Count-Min Sketch and Heavy-Hitters problem

I am reading about Count-Min Sketch data structure which gives a probabilistic answer to point and range queries, based on error probability parameter and the tolerance parameter. For example, the question "how many times with probability of 10%…
user3508140
  • 285
  • 2
  • 18
0
votes
1 answer

Count Min Sketch: How to handle counters overflow?

So the whole point of Count-Min Sketch is to update certain counters depending on the results of the provided hash functions. But, these counters are limited in memory, and after running for quite some time, they might overflow, dropping from the…
shakedzy
  • 2,853
  • 5
  • 32
  • 62
0
votes
0 answers

What is max element can be add to a count min sketch, and how to use it

I studied count min sketch (CMS) by reading a paper: http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf However i have no experience on reading algorithm so there are some confusion. Can someone help to answer some question about count min…
Billy
  • 1
  • 1
0
votes
1 answer

Which Hash functions can be used in count-min sketch?

The number of elements in my set are over a billion 230. I intent to count the occurrence of each element in the set. For this purpose, I want to use count-min sketch. Please suggest how the hash functions should be chosen. The false positive rate…