I have the Python code below to compare the rows of two CSV files, match each column field, and display the differences. However, the output is not in order. Please help me improve the code's output.
(I searched and found the Python package csvdiff, but it requires specifying a column number.)
2 CSV files:
cat file1.csv
1,2,2222,3333,4444,3,
cat file2.csv
1,2,5555,6666,7777,3,
My Python3 code:
with open('file1.csv', 'r') as t1, open('file2.csv', 'r') as t2:
    filecoming = t1.readlines()
    filevalidation = t2.readlines()

for i in range(0, len(filevalidation)):
    coming_set = set(filecoming[i].replace("\n", "").split(","))
    validation_set = set(filevalidation[i].replace("\n", "").split(","))
    ReceivedDataList = list(validation_set.intersection(coming_set))
    NotReceivedDataList = list(coming_set.union(validation_set) -
                               coming_set.intersection(validation_set))
    print(NotReceivedDataList)
Output:
['6666', '5555', '3333', '2222', '4444', '7777']
Even though it prints the differences from both files, the output is not in order (3 differences from file2 and 3 differences from file1).
I am trying to produce column-wise results, i.e., each difference in file1 paired with the corresponding difference in file2.
Something like:
2222 - 5555
3333 - 6666
4444 - 7777
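For reference, here is a minimal sketch of the column-wise pairing described above. It assumes the fields should be compared position by position rather than as sets, since sets do not keep order. The function name column_diffs is hypothetical, and the sample rows are the ones from file1.csv and file2.csv, inlined so the snippet runs on its own.

```python
import csv
from itertools import zip_longest


def column_diffs(lines1, lines2):
    """Yield (value1, value2) for every column position where the rows differ.

    lines1 and lines2 are any iterables of CSV lines (open files or lists).
    Hypothetical helper: missing rows or fields are treated as empty strings.
    """
    rows = zip_longest(csv.reader(lines1), csv.reader(lines2), fillvalue=[])
    for row1, row2 in rows:
        # Compare fields by position so the file order is preserved.
        for a, b in zip_longest(row1, row2, fillvalue=""):
            if a != b:
                yield a, b


# Sample rows copied from file1.csv and file2.csv.
file1_lines = ["1,2,2222,3333,4444,3,"]
file2_lines = ["1,2,5555,6666,7777,3,"]

for a, b in column_diffs(file1_lines, file2_lines):
    print(f"{a} - {b}")
```

With real files, the two open file objects (for example t1 and t2 from the with statement in my code) could be passed in place of the lists.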
Please help.
Thanks in advance.