Here's an example using the with statement, assuming the files are small enough to fit in memory:
# Open 'new1.txt' as f1, 'new2.txt' as f2 and 'diff.txt' as outf
with open('new1.txt') as f1, open('new2.txt') as f2, open('diff.txt', 'w') as outf:
    # Read the lines from 'new2.txt' and store them into a python set
    lines = set(f2.readlines())
    # Loop through each line in 'new1.txt'
    for line in f1:
        # If the line was not in 'new2.txt'
        if line not in lines:
            # Write the line to the output file
            outf.write(line)
The with statement simply closes the opened file(s) automatically when the block exits. These two pieces of code are roughly equivalent:
with open('temp.log', 'w') as temp:
    temp.write('Temporary logging.')
# equal to:
temp = open('temp.log', 'w')
try:
    temp.write('Temporary logging.')
finally:
    # Runs even if write() raises, just like leaving the with block
    temp.close()
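As a quick check of the closing behaviour described above, here is a minimal sketch. The file name demo.log and the temporary directory are made up for illustration. It shows that the file object is closed once the with block exits, including when an exception is raised inside the block:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'demo.log')

# Normal exit from the with block
with open(path, 'w') as temp:
    temp.write('Temporary logging.')
closed_after_block = temp.closed

# Exit caused by an exception raised inside the block
try:
    with open(path, 'w') as temp2:
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
closed_after_error = temp2.closed

print(closed_after_block, closed_after_error)
```

Both flags should come out true, which is the reason the with form is usually preferred over calling close() by hand.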
Yet another way uses two sets, but again this isn't very memory efficient. If your files are big, this won't work:
# Again, open the three files as f1, f2 and outf
with open('new1.txt') as f1, open('new2.txt') as f2, open('diff.txt', 'w') as outf:
    # Read the lines in 'new1.txt' and 'new2.txt'
    s1, s2 = set(f1.readlines()), set(f2.readlines())
    # `s1 - s2 | s2 - s1` returns the differences between two sets
    # (the lines that appear in only one of the files, same as `s1 ^ s2`)
    # Now we simply loop through the different lines
    for line in s1 - s2 | s2 - s1:
        # And output all the different lines
        outf.write(line)
Keep in mind that this last snippet will not necessarily preserve the order of your lines, because sets are unordered.
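If the order matters, one possible variation is to use the sets only for membership tests and to iterate over the original lists of lines. This is only a sketch: ordered_diff is a hypothetical helper name, and it assumes the lines have already been read into lists (for example with readlines()), so it has the same memory limitation as the code above:

```python
def ordered_diff(lines1, lines2):
    """Return the lines found in only one input, keeping their original order."""
    # Sets are used only for fast membership tests
    s1, s2 = set(lines1), set(lines2)
    # Lines unique to the first input, in the order they appear there
    only_in_1 = [line for line in lines1 if line not in s2]
    # Lines unique to the second input, in the order they appear there
    only_in_2 = [line for line in lines2 if line not in s1]
    return only_in_1 + only_in_2


# Made-up sample data standing in for the contents of the two files
a = ['apple\n', 'banana\n', 'cherry\n']
b = ['banana\n', 'date\n', 'apple\n']
result = ordered_diff(a, b)
print(result)
```

Unlike the set-based loop, this version also keeps repeated lines: a line that occurs twice in one file and never in the other is returned twice.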