I am trying to compare two CSV files and find the rows that are different, using Python 2.7. Two rows are considered different unless all of their columns are the same. Both files have the same format and the same columns, as shown here.
oldfile.csv
ID name Date Amount
1 John 6/16/2015 $3000
2 Adam 6/16/2015 $4000
newfile.csv
ID name Date Amount
1 John 6/16/2015 $3000
2 Adam 6/16/2015 $4000
3 Sam 6/17/2015 $5000
4 Dan 6/17/2015 $6000
When I run my script, I want the output to be just the bottom two lines of newfile.csv, written to a CSV file. Unfortunately, I simply can't get my code to work properly. What I have written below prints out the contents of oldfile.csv and does not print the rows that differ. What I want the code to do is write the last two lines to an output.csv file, i.e.:
output.csv
3 Sam 6/17/2015 $5000
4 Dan 6/17/2015 $6000
Here is my Python 2.7 code, using the csv module:
import csv

# Read every row of the old file into a list
f1 = open("olddata/olddata.csv")
oldFile1 = csv.reader(f1)
oldList1 = []
for row in oldFile1:
    oldList1.append(row)

# Read every row of the new file into a list
f2 = open("newdata/newdata.csv")
newFile2 = csv.reader(f2)
newList2 = []
for row in newFile2:
    newList2.append(row)

f1.close()
f2.close()

# Keep the rows from the old file that do not appear in the new file
output = [row for row in oldList1 if row not in newList2]
print output
Unfortunately, the code only prints out the contents of oldfile.csv. I have been working on it all day, trying different variations, but I simply cannot get it to work correctly. Your help would be greatly appreciated.
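For reference, here is a rough sketch of the behaviour being asked for, not a definitive fix. It assumes the files are comma-delimited and that a row counts as "different" when it appears in the new file but has no exact match in the old file. The helper names open_csv, read_rows and diff_rows are made up for illustration. Note that the comparison runs from the new rows against the old ones, which is the opposite direction from the list comprehension in the posted code.

```python
import csv
import sys


def open_csv(path, mode):
    # The csv module wants binary mode on Python 2 and newline="" on Python 3.
    if sys.version_info[0] < 3:
        return open(path, mode + "b")
    return open(path, mode, newline="")


def read_rows(path):
    # Return every row as a tuple so that rows can be stored in a set.
    with open_csv(path, "r") as f:
        return [tuple(row) for row in csv.reader(f)]


def diff_rows(old_path, new_path, out_path):
    # Rows from the new file that have no exact match in the old file.
    old_rows = set(read_rows(old_path))
    added = [row for row in read_rows(new_path) if row not in old_rows]
    # Write the differing rows to the output file, keeping their order.
    with open_csv(out_path, "w") as f:
        csv.writer(f).writerows(added)
    return added
```

With the paths from the question, this would be called as diff_rows("olddata/olddata.csv", "newdata/newdata.csv", "output.csv"). The header row is identical in both files, so it is filtered out along with the other unchanged rows.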