-3

I am new to Python. I want to compare two files (1.txt and 2.txt).

Content of 1.txt:

a
b
c

Content of 2.txt:

a
b
c
d

The program code:

with open("1.txt") as f1:
    with open("2.txt") as f2:
        for line in f2.readlines():
            if line not in f1.readlines():
                print(line)

When I run the code, the output is:

b

c

d

In my opinion, it should only output "d", the letter that exists in 2.txt but not in 1.txt. Can anyone tell me why the output looks like this?

Then I debugged the program and watched two expressions, "f1.readlines()" and "f2.readlines()", in the watches window in the right corner.

I used "Step Over" to reach line 3. In the watches window, "f1.readlines()" and "f2.readlines()" are still null, and I can't figure out why.


When I use "Step Over" to reach line 4, none of the variables are available any more.

So, my questions are:

1. Why doesn't my code work?

2. What is the right code to compare "1.txt" and "2.txt"?

Thanks!

Bogdan Doicin
luoyu mo
  • 1
    It's probably an issue caused by re-evaluating the contents of `f1` for every line in `f2`. – StardustGogeta Jul 22 '19 at 16:09
  • Possible duplicate of [How can I open multiple files using "with open" in Python?](https://stackoverflow.com/questions/4617034/how-can-i-open-multiple-files-using-with-open-in-python) – pault Jul 22 '19 at 16:09
  • 1
    @pault - There isn't anything wrong with how the op is opening the files, re-reading `f1` every iteration is likely to be an issue though – Sayse Jul 22 '19 at 16:11
  • I can't see the images [blocked] but there's definitely something wrong with OP's indentation – pault Jul 22 '19 at 16:12
  • fixed indentation, it was correct in the (gently weeps) screenshots of the IDE – Useless Jul 22 '19 at 16:15

2 Answers

5
  1. Once you've read all the lines in a file, there are no more lines in the file to read, pretty much by definition.

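    So, after your first call to f1.readlines(), every subsequent call will return an empty list. In your loop, only the first line of 2.txt ("a") is checked against the real contents of 1.txt; "b", "c" and "d" are each checked against an empty list, so all three are printed.

    You'll either need to seek back to the beginning, or save the result of readlines() (assuming both files will always fit in memory).

  2. The right code is likely to use difflib.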
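A minimal sketch of point 1, assuming both files fit in memory. The helper name `lines_only_in_second` is made up for illustration, and `io.StringIO` only stands in for an open text file so the snippet runs without touching the disk for the demo part:

```python
import io


def lines_only_in_second(first_path, second_path):
    """Return the lines of second_path that appear nowhere in first_path.

    Hypothetical helper, not part of the original question.
    """
    # Read the first file exactly once and keep the result. A set makes
    # each "not in" test cheap compared with scanning a list.
    with open(first_path) as f1:
        seen = {line.rstrip("\n") for line in f1}

    # Strip the newline so a missing final newline cannot cause a mismatch.
    with open(second_path) as f2:
        stripped = (line.rstrip("\n") for line in f2)
        return [line for line in stripped if line not in seen]


# Why the original loop misbehaves: a file object remembers its position.
demo = io.StringIO("a\nb\nc\n")  # stands in for an open text file
first_read = demo.readlines()    # ['a\n', 'b\n', 'c\n']
second_read = demo.readlines()   # [] because the position is at the end
demo.seek(0)                     # rewind to the beginning
third_read = demo.readlines()    # the full list again
```

With the two files from the question, `lines_only_in_second("1.txt", "2.txt")` should return just `["d"]`. Note this is a plain membership test: it ignores line order and duplicates, which may or may not be what you want.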

Useless
1

For educational purposes, you could change your code to do this instead:

with open("2.txt") as f2:
    for line in f2.readlines():
        with open("1.txt") as f1:
            if line not in f1.readlines():
                print(line)

and it should do the right thing, for reasons explained by @Useless (who apparently isn't :)

Note that you shouldn't normally do this: it takes O(JK) operations, where J and K are the number of lines in each file, while the algorithms in difflib will be much more efficient. Try your version on files with 10k lines each and it'll probably take a few minutes, while difflib should take a few milliseconds.
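For reference, a rough sketch of what a difflib-based version might look like. `added_lines` is a name invented here, and the lists of lines are assumed to have been read from the two files already. Keep in mind that difflib compares the inputs as ordered sequences, so it answers "what was inserted or changed" rather than "is this line anywhere in the other file":

```python
import difflib


def added_lines(old_lines, new_lines):
    """Return the lines that difflib reports as inserted or changed in new_lines.

    Hypothetical helper for illustration only.
    """
    # autojunk=False turns off the heuristic that treats very frequent
    # lines as junk on long inputs.
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    added = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both contribute lines that exist only
        # on the new side at that position.
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return added


result = added_lines(["a", "b", "c"], ["a", "b", "c", "d"])
```

For the contents in the question, `result` is `["d"]`. To feed it real files, something like `open("1.txt").read().splitlines()` for each side would do.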

Sam Mason