
I am trying to understand Python's iterators in the context of the pysam module. Calling the fetch method on an AlignmentFile object returns an iterator, iter, over the records in the file file. I can then use various attributes to access each record's data, for instance the name via query_name:

import pysam
iter = pysam.AlignmentFile(file, "rb", check_sq=False).fetch(until_eof=True)
for record in iter:
  print(record.query_name)

It happens that the records come in pairs, so one would like something like:

while True:
  r1 = iter.__next__() 
  r2 = iter.__next__()
  print(r1.query_name)     
  print(r2.query_name)

Calling next() by hand is probably not the right way to handle millions of records, but how can one use a for loop to consume the same iterator two records at a time? I looked at the grouper recipe from itertools and at the Stack Overflow questions "Iterate an iterator by chunks (of n) in Python?" (itself marked as a duplicate) and "What is the most “pythonic” way to iterate over a list in chunks?", but cannot get it to work.

Jonas
user3375672
  • *"cannot get it to work"* - what precisely did you try, and what went wrong? Give a [mcve]. Note you should generally call `next(thing)`, not `thing.__next__()`. – jonrsharpe Apr 16 '17 at 21:07

1 Answer

First of all, don't use the variable name iter, because that's already the name of a builtin function.

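To answer your question, pass the same iterator twice to itertools.izip (Python 2) or zip (Python 3). Because both arguments are the same object, each tuple is built from two consecutive items of the iterator.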

Your code may look as simple as

for next_1, next_2 in zip(iterator, iterator):
    # stuff
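
To see why this pairs up consecutive items, here is a minimal Python 3 sketch. It assumes a plain list of made-up read names as a stand-in for the pysam records, so it runs without any input file; the names `names`, `records` and `pairs` are only for illustration.

```python
# Stand-in data (an assumption for illustration); in the question the
# iterator would be the object returned by fetch(until_eof=True).
names = ["read1/1", "read1/2", "read2/1", "read2/2"]
records = iter(names)

pairs = []
# Both arguments are the *same* iterator object, so on every step zip
# pulls one item for r1 and then the following item for r2.
for r1, r2 in zip(records, records):
    pairs.append((r1, r2))

print(pairs)
```

Each record is visited exactly once, and nothing is buffered beyond the current pair, so this should behave the same way for a very long stream.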

edit: whoops, my original answer was the correct one all along, don't mind the itertools recipe.

edit 2: Consider itertools.izip_longest (itertools.zip_longest in Python 3) if you deal with iterators that could yield an odd number of items, because plain zip silently drops the unpaired last item:

>>> from itertools import izip_longest
>>> iterator = (x for x in (1,2,3))
>>> 
>>> for next_1, next_2 in izip_longest(iterator, iterator):
...     next_1, next_2
... 
(1, 2)
(3, None)
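
For Python 3, where the function is named zip_longest, the same idea might look like the sketch below. The helper name `in_pairs` is hypothetical and not part of itertools; the comparison with plain zip is included only to show the dropped item.

```python
from itertools import zip_longest


def in_pairs(iterable, fillvalue=None):
    """Yield consecutive items two at a time, padding an odd tail.

    Hypothetical helper for illustration, not a library function.
    """
    it = iter(iterable)
    # The same iterator is passed twice, so each tuple holds two
    # consecutive items; zip_longest pads the last tuple if needed.
    return zip_longest(it, it, fillvalue=fillvalue)


padded = list(in_pairs([1, 2, 3]))

# Plain zip stops as soon as one argument is exhausted, so the
# unpaired 3 is lost here.
it = iter([1, 2, 3])
truncated = list(zip(it, it))

print(padded)
print(truncated)
```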
timgeb
  • Yes, so I have only one iterator, and I want to go through it assigning alternate records to r1 and r2. – user3375672 Apr 16 '17 at 21:13
  • @user3375672 there's nothing preventing you from providing the same argument twice to `zip`, i.e. `iterator_1 == iterator_2`. – timgeb Apr 16 '17 at 21:14