
If I have a generator or a set of inputs that are not unique, and that I can't simply query or store in memory, how can I best keep a running tally?

If memory is not an issue, I could create a list and then use count, or use Counter, as in this Question/Answer.

However, without sufficient memory, or without advance knowledge of the items, I think a dict makes the most sense, with each incoming value as the key and its count as the value. Is there a better way, though?
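A minimal sketch of the dict approach described above, assuming the items are hashable. The names `tally` and `stream` are hypothetical and only for illustration; the generator expression stands in for the real input source.

```python
def tally(items):
    """Count occurrences of each item, consuming the iterable one item at a time."""
    counts = {}
    for item in items:
        # dict.get returns 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts


# A generator expression stands in for the real input source (assumed data).
stream = (n % 3 for n in range(10))
result = tally(stream)
print(result)
```

Only the distinct values and their counts are kept; the items themselves are discarded as soon as they are counted.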

An example might be a non-uniformly weighted random number generator, like a die with infinitely many sides where some numbers come up far more often than others, but we don't know in advance which ones.

kmfsousa

1 Answer


A collections.Counter can be built from any iterable, including a generator or other iterator that produces (and holds in memory) only the next item needed.

Example

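import random
from collections import Counter
from itertools import islice


def producer():
    # Endless generator: yields one normally distributed integer at a time.
    while True:
        yield int(random.normalvariate(300, 100))


# islice stops after 400 items; Counter consumes them one by one.
data = Counter(islice(producer(), 400))

print(data)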

The 400 can be replaced by a much larger value. The only memory needed is what it takes to store each distinct value once, together with its count (like the dict you described).
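For a source that never ends, `Counter(iterable)` would not return, so one option is to update the counter item by item and look at it along the way. This is a minimal sketch under that assumption; `running_tally`, `report_every`, and the seeded variant of `producer` are hypothetical names chosen for illustration.

```python
import random
from collections import Counter
from itertools import islice


def producer(seed=0):
    """Endless stream of integers drawn from a normal distribution (illustrative)."""
    rng = random.Random(seed)
    while True:
        yield int(rng.normalvariate(300, 100))


def running_tally(items, report_every=1000):
    """Yield (items seen, counts) after every `report_every` items."""
    counts = Counter()
    for seen, item in enumerate(items, start=1):
        counts[item] += 1
        if seen % report_every == 0:
            # The same Counter object is yielded each time, not a copy.
            yield seen, counts


# islice bounds the demo to three snapshots; drop it to keep tallying.
for seen, counts in islice(running_tally(producer()), 3):
    print(seen, counts.most_common(3))
```

Memory use still grows only with the number of distinct values, not with the number of items processed.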

Michael Butscher