I am a Python beginner and ran into an issue with iterating over grouped data more than once. I understand that an iterator can't be reused once it has been consumed, but is it possible to get multiple iterators from a single groupby()?

This answer says that multiple iterators can be created over lists and similar containers, but I don't understand how I can do the same for groupby.

Multiple Iterators
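To make the difference concrete, here is a minimal sketch of the behaviour being described. The sample `data` and `numbers` values are made up for illustration; the point is only that a list hands out a fresh iterator on every pass, while each group yielded by groupby() can be read once.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, already sorted by key as groupby() expects.
data = [("a", "1:x"), ("a", "2:y"), ("b", "3:z")]

# A list is an iterable: every call to iter() returns a new, independent iterator.
numbers = [1, 2, 3]
first_pass = list(iter(numbers))
second_pass = list(iter(numbers))

# Each group from groupby() is itself an iterator, so it is single-pass.
single_pass_results = []
for key, group in groupby(data, itemgetter(0)):
    first = list(group)   # consumes the group
    second = list(group)  # nothing left to read
    single_pass_results.append((key, len(first), len(second)))

print(single_pass_results)
```

The second read of each group comes back empty, which is the situation the question runs into.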

What I am trying to do is as follows:

  1. I have data consisting of (key, value) pairs and I want to group them by key.
  2. Within each group, some pairs are special, as determined by their value part, and I want to extract these special pairs and process them separately.
  3. After I am done, I need to go back to the original group and process the remaining pairs (this is where I need the second iterator).

In case it helps, here is the basic layout of what I am doing, although I am not sure it is really required:

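from itertools import groupby
from operator import itemgetter

for current_vertex, group in groupby(data, itemgetter(0)):
    try:
        # Special data extraction (this comprehension consumes the group iterator)
        matching = [int(value.rstrip().split(':')[0]) for key, value in group if CURRENT_NODE_IDENTIFIER in value]
        if matching:
            # Do something with the data extracted (some variables generated here -- say x, y, z)
            for key, value in group:  # group is already exhausted here, so this loop never runs
                if CURRENT_NODE_IDENTIFIER not in value:
                    pass  # Do something with remaining key, value pairs (use x, y, z)
    except ValueError:
        pass  # except clause not shown in the original; ValueError from int() is an assumption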
usamazf
  • Convert each group from an iterator to an iterable by applying `list` to it. – Dan D. Apr 06 '18 at 09:29
  • @DanD. Hey mate thanks for your comment, but I don't really understand how I can go about doing so? Can you elaborate a bit? Thanks! – usamazf Apr 06 '18 at 09:34
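A rough sketch of what the `list` suggestion above could look like for the workflow in the question. The function name `split_groups`, the marker value and the sample data are all hypothetical; only groupby(), itemgetter() and the parsing expression come from the question.

```python
from itertools import groupby
from operator import itemgetter

CURRENT_NODE_IDENTIFIER = "CURRENT"  # hypothetical marker value


def split_groups(data):
    """Return {key: (special_ids, remaining_values)} for key-sorted data."""
    results = {}
    for current_vertex, group in groupby(data, itemgetter(0)):
        # Materialise the group once; a list can be iterated as often as needed.
        pairs = list(group)
        matching = [int(value.rstrip().split(":")[0])
                    for _, value in pairs if CURRENT_NODE_IDENTIFIER in value]
        remaining = [value for _, value in pairs
                     if CURRENT_NODE_IDENTIFIER not in value]
        results[current_vertex] = (matching, remaining)
    return results


# Hypothetical sample data, sorted by key.
data = [(1, "10:CURRENT"), (1, "20:other"), (2, "30:other")]
grouped = split_groups(data)
print(grouped)
```

This holds each group in memory while it is processed, which is usually acceptable when individual groups are small.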

1 Answer

In case anyone else runs into the same problem, I resolved it by duplicating the iterator as described here:

How to duplicate an Iterator?

Since the group itself is an iterator, all I had to do was duplicate it:

from itertools import tee

# Duplicate the group iterator so it can be traversed twice
group, duplicate_iterator = tee(group)

Don't forget to import the tee function from itertools. I don't know if this is the best way possible, but at least it works and gets the job done.
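For completeness, a minimal sketch of how the tee() call might sit inside the groupby loop. `MARKER`, `process_with_tee` and the sample data are hypothetical names used only for illustration. Both copies are consumed before the loop moves on to the next group, because groupby() invalidates the previous group once it advances.

```python
from itertools import groupby, tee
from operator import itemgetter

MARKER = "CURRENT"  # hypothetical stand-in for CURRENT_NODE_IDENTIFIER


def process_with_tee(data):
    output = []
    for key, group in groupby(data, itemgetter(0)):
        # tee() returns two independent iterators over the same group.
        first_pass, second_pass = tee(group)
        special = [value for _, value in first_pass if MARKER in value]
        rest = [value for _, value in second_pass if MARKER not in value]
        output.append((key, special, rest))
    return output


# Hypothetical sample data, sorted by key.
sample = [("a", "1:CURRENT"), ("a", "2:x"), ("b", "3:y")]
result = process_with_tee(sample)
print(result)
```

Note that tee() buffers items internally, and the itertools documentation points out that when one copy is read to the end before the other starts, calling list() on the group is the faster option.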

Wolf
usamazf