Hi, I have a generator object and I want to count how many times each element occurs in it, without exhausting or otherwise changing the generator (I want to use it again later).

Here is an example.

import random

def create(n):
    items = ["a", "b", "c"]
    for i in range(n):
        yield items[random.randint(0,2)]

def countEach(gen):
    r = []  # list of [element, count] pairs
    for a in gen:
        add = True
        for i in range(len(r)):
            if a == r[i][0]:
                r[i][1] += 1
                add = False
        if add:
            r.append([a,1])  # first occurrence counts as 1
    return r

gen_list = create(100)
print (countEach(gen_list))
# countEach has already exhausted the generator, so this loop prints nothing
for b in gen_list:
    print (b)

Output:

[['b', 33345], ['c', 33298], ['a', 33354]]
[Finished in 0.6s]
Staked
  • Not possible without creating a permanent data structure, like a `list`. – iz_ Jan 23 '19 at 06:17
  • I noticed that when using yield, the code after it is also executed. Is it possible to yield and return somehow? – Staked Jan 23 '19 at 06:20
  • You could also use [`itertools.tee`](https://docs.python.org/3/library/itertools.html#itertools.tee) to make `n` independent iterators from a single iterable. – Abdul Niyas P M Jan 23 '19 at 06:21
  • You could append to a list and return it at the end, but you can't mix `return` and `yield`. `list(create(100))` will do the job much more easily, though, and you can reuse that list as much as you want. – iz_ Jan 23 '19 at 06:21
  • @AbdulNiyasPM But the resulting iterators may yield different values. – iz_ Jan 23 '19 at 06:22
  • Just a hack: create a deep copy of the generator object with `copy.deepcopy()` and use the copied generator object every time you need it. – sahasrara62 Jan 23 '19 at 06:22
  • @prashantrana That won't work. You will get `TypeError: can't pickle generator objects` – iz_ Jan 23 '19 at 06:26
  • Either a list (or similar) to store the data as already suggested several times OR an algorithm that processes it in one pass. – VPfB Jan 23 '19 at 07:21
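To illustrate the `itertools.tee` suggestion from the comments, here is a minimal sketch. It assumes the same `create` generator as in the question and uses `collections.Counter` for the counting, which is not part of the original post. Note that once one of the two iterators has been fully consumed, `tee` has buffered every item for the other, so memory use ends up comparable to building a list.

```python
import itertools
import random
from collections import Counter

def create(n):
    items = ["a", "b", "c"]
    for _ in range(n):
        yield items[random.randint(0, 2)]

gen = create(100)

# tee gives two iterators over the same underlying items.
# The original `gen` should not be used directly after this point.
for_counting, for_later = itertools.tee(gen)

# Consuming one iterator makes tee buffer all items for the other one.
counts = Counter(for_counting)
print(counts)

# The second iterator replays exactly the items that were counted.
replayed = list(for_later)
print(len(replayed))
```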

2 Answers

Unless I am fundamentally misunderstanding how Python generators work, this isn't possible: a generator can only be consumed once. Instead of yielding from your create function, you should have it return a list.

def create(n):
    items = ["a", "b", "c"]
    return [items[random.randint(0,2)] for i in range(n)]

The above list comprehension will create a list rather than use a generator. For a better understanding of generators I'd suggest reading through this excellent post.

EDIT: Out of curiosity, I timed the list(create(n)) approach suggested by Tomothy32 against my version, which returns a list directly. As expected, building the list from the generator is marginally slower than building it with a comprehension (130 microseconds vs. 125 microseconds on average). However, you may prefer to leave the original function untouched and simply save specific generator calls as lists where needed, rather than redefining it to always return a list.
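For anyone who wants to repeat a comparison like this, a rough sketch with `timeit` could look as follows. The names `create_gen` and `create_list`, the value `n = 100`, and the number of runs are my assumptions, not necessarily what was used for the figures above, and the results will vary by machine.

```python
import random
import timeit

ITEMS = ["a", "b", "c"]

def create_gen(n):
    # Generator version, as in the question
    for _ in range(n):
        yield ITEMS[random.randint(0, 2)]

def create_list(n):
    # List-returning version, as in this answer
    return [ITEMS[random.randint(0, 2)] for _ in range(n)]

n = 100      # assumed size
runs = 1000  # assumed number of repetitions

# Average time per call, in seconds
t_gen = timeit.timeit(lambda: list(create_gen(n)), number=runs) / runs
t_list = timeit.timeit(lambda: create_list(n), number=runs) / runs

print(f"list(create_gen({n})): {t_gen * 1e6:.1f} microseconds per call")
print(f"create_list({n}):      {t_list * 1e6:.1f} microseconds per call")
```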

NaT3z

There is no need to change the create generator. Just do:

gen_list = list(create(100))

You can reuse this as much as you want.
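As a sketch of how that might look end to end, the snippet below counts the elements and then iterates over the same data again. Using `collections.Counter` in place of the question's `countEach` is my substitution, not something from the original post.

```python
import random
from collections import Counter

def create(n):
    items = ["a", "b", "c"]
    for _ in range(n):
        yield items[random.randint(0, 2)]

# Materialize the generator once; the list can be traversed repeatedly.
gen_list = list(create(100))

# First pass: count each element.
counts = Counter(gen_list)
print(counts)

# Second pass: the data is still available.
for b in gen_list:
    print(b)
```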

iz_
  • I know I can make a list of it, but with a bigger list I will get a memory error. – Staked Jan 23 '19 at 06:46
  • @Staked In that case, I apologize, but it is impossible. See https://stackoverflow.com/questions/3345785/getting-number-of-elements-in-an-iterator-in-python – iz_ Jan 23 '19 at 06:49
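If the data is too large to hold in memory, one possible workaround is to re-create the generator instead of reusing it. This only helps if the generator can be rebuilt to produce the same sequence. The sketch below assumes that is acceptable and adds a hypothetical `seed` parameter to `create` (not in the original code) so that each call replays identical values. Counting with `collections.Counter` keeps memory proportional to the number of distinct elements rather than to `n`.

```python
import random
from collections import Counter

def create(n, seed):
    # `seed` is a hypothetical parameter added for this sketch.
    # A private Random instance makes every call with the same seed
    # yield the same sequence.
    rng = random.Random(seed)
    items = ["a", "b", "c"]
    for _ in range(n):
        yield items[rng.randint(0, 2)]

n = 100_000

# First pass: count without storing the items themselves.
counts = Counter(create(n, seed=42))
print(counts)

# Second pass: a fresh generator with the same seed replays the same items.
second_pass = Counter(create(n, seed=42))
print(second_pass == counts)
```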