Python double iteration

Question

What is the pythonic way of iterating simultaneously over two lists?

Suppose I want to compare two files line by line, that is, compare the i-th line of one file with the i-th line of the other. I would want to do something like this:

file1 = csv.reader(open(filename1),...)
file2 = csv.reader(open(filename2),...)

for line1 in file1 and line2 in file2: #pseudo-code!
    if line1 != line2:
        print "files are not identical"
        break

What is the pythonic way of achieving this?

Edit: I am not using a file handle but rather a CSV reader (csv.reader(open(file),...)), and zip() doesn't seem to work with it...

Final edit: as @Alex M. suggested, in Python 2 zip() reads both files into memory on the first iteration, which is a problem with big files. Using itertools.izip instead solves the issue.

Possible duplicate of [How can I iterate through two lists in parallel in Python?](http://stackoverflow.com/questions/1663807/how-can-i-iterate-through-two-lists-in-parallel-in-python) — Ciro Santilli OurBigBook.com, Jan 13 '17 at 10:37

score 16 · Accepted Answer · answered Mar 06 '10 at 17:53


In Python 2, you should import itertools and use its izip:

import itertools

with open(file1) as f1:
  with open(file2) as f2:
    # izip yields one pair of lines at a time instead of building a full list
    for line1, line2 in itertools.izip(f1, f2):
      if line1 != line2:
        print 'files are different'
        break

With the built-in zip in Python 2, both files are read entirely into memory at the start of the loop, which may not be what you want. In Python 3, the built-in zip works incrementally, like itertools.izip does in Python 2.
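For Python 3, where the built-in zip is already incremental, a minimal sketch of the same lockstep comparison might look like this. The helper name `first_difference` and its return convention are made up for illustration; they are not part of the answer above.

```python
def first_difference(path1, path2):
    """Return the 1-based number of the first differing line, or None.

    Hypothetical helper for illustration. The files are read in lockstep,
    so only one line from each file is held at a time.
    """
    with open(path1) as f1, open(path2) as f2:
        # In Python 3, zip pulls one line from each file per iteration.
        for lineno, (line1, line2) in enumerate(zip(f1, f2), start=1):
            if line1 != line2:
                return lineno
    # zip stops at the shorter file, so extra trailing lines in the
    # longer file are not detected by this sketch.
    return None
```

Note that zip stops as soon as the shorter input runs out, so two files where one is a prefix of the other would compare as having no difference here. If that matters, itertools.zip_longest is one way to catch it.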


Alex Martelli


This does the job! Indeed the problem was that the files were pretty large and `zip()` was loading them all to memory... – Yuval Adam Mar 06 '10 at 17:57
Ah, maybe that's why I see no difference. I'm using Python 3.1. – kennytm Mar 06 '10 at 17:58
@KennyTM, yep, no "maybe": in Python 3 many things that used to rely on all-in-memory lists in Python 2 have become incremental and iterative. So it's important to always clarify whether questions and answers relate to Python 2 or Python 3 -- in Python 2 the (better;-) incremental and iterative approach is, so to speak, "opt-in" (you need to get it explicitly), in Python 3 it's intrinsic (you need to explicitly call `list` in the relatively rare cases where you actually **do** want a list, all in memory at once;-). – Alex Martelli Mar 06 '10 at 18:05
Just a small note: you can, if you want, `open` both files within the same `with` statement: `with open(file1) as f1, open(file2) as f2`: – Daan Timmer Jan 10 '14 at 08:47
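Since the question is about csv.reader objects, here is a rough Python 3 sketch that combines the single `with` statement with two readers. The function name `csv_files_identical` is hypothetical, and the use of itertools.zip_longest (so that a file with extra rows counts as different) is a choice made for this example, not something stated in the thread.

```python
import csv
import itertools


def csv_files_identical(path1, path2):
    """Hypothetical helper: True if both CSV files contain the same rows."""
    # newline="" is what the csv module documentation recommends for
    # files that are handed to csv.reader.
    with open(path1, newline="") as f1, open(path2, newline="") as f2:
        rows1 = csv.reader(f1)
        rows2 = csv.reader(f2)
        # zip_longest pads the shorter reader with None, so a file with
        # extra rows is reported as different.
        for row1, row2 in itertools.zip_longest(rows1, rows2):
            if row1 != row2:
                return False
    return True
```

Each csv.reader is itself an iterator over rows, so it can be passed to zip or zip_longest just like a file object.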

JAL · Answer 2 · 2010-03-06T18:28:59.110

11

I vote for using zip. The manual suggests: "To loop over two or more sequences at the same time, the entries can be paired with the zip() function."

For example,

list_one = ['nachos', 'sandwich', 'name']
list_two = ['nachos', 'sandwich', 'the game']
# zip pairs up the items at the same position in each list
for one, two in zip(list_one, list_two):
    if one != two:
        print("Difference found")
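If you also want to know where the lists differ, one possible variation (a sketch, not part of the original answer) wraps zip in enumerate to collect the positions of the mismatches. The variable name `differing` is made up for this example.

```python
list_one = ['nachos', 'sandwich', 'name']
list_two = ['nachos', 'sandwich', 'the game']

# enumerate numbers the pairs produced by zip, starting at 0.
differing = [
    index
    for index, (one, two) in enumerate(zip(list_one, list_two))
    if one != two
]
print(differing)  # → [2]
```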

edited Mar 06 '10 at 18:28

answered Mar 06 '10 at 17:50

JAL


kennytm · Answer 3 · 2010-03-06T18:04:29.980

4

In lockstep (for Python ≥3):

for line1, line2 in zip(file1, file2):
   # etc.

As a "2D array":

for line1 in file1:
    for line2 in file2:
        pass  # etc.
    # file2 is exhausted after one full pass; you may need to rewind it to the beginning.
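To illustrate the rewinding point for the "2D array" case, here is a small Python 3 sketch. It uses io.StringIO objects as stand-ins for two open text files; the contents are invented for the example.

```python
import io

# In-memory stand-ins for two open text files (assumed contents).
file1 = io.StringIO("a\nb\n")
file2 = io.StringIO("1\n2\n")

pairs = []
for line1 in file1:
    for line2 in file2:
        pairs.append((line1.strip(), line2.strip()))
    # The inner loop consumed file2; rewind it so the next outer
    # iteration can read it again from the start.
    file2.seek(0)

print(pairs)  # → [('a', '1'), ('a', '2'), ('b', '1'), ('b', '2')]
```

Without the `file2.seek(0)` call, the inner loop would run only for the first line of file1. A csv.reader has no seek method of its own, so in that case the underlying file object would have to be rewound instead.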

edited Mar 06 '10 at 18:04

answered Mar 06 '10 at 17:44

kennytm


Thanks, I am looking for the lockstep method. Any idea why this method doesn't work for a `csv.reader()`? – Yuval Adam Mar 06 '10 at 17:49
maybe you should clarify that for the "2D array" one might need to reinitialise the inner iterator... – fortran Mar 06 '10 at 17:54
@Yuval, please edit your question to show exactly how you're trying to use zip with a (one?!) csv.reader -- this comment is totally mysterious. – Alex Martelli Mar 06 '10 at 17:54
