I have two CSV files: a master and an update file. I want to take specific columns from the update file and check their values against the master.
Both files will have the same columns & should look roughly like this:
Listed Company's English Name,Listed Company's Chinese Name,Stock Code,Listing Status,Director's English Name,Director's Chinese Name,Capacity,Position,Appointment Date (yyyy-mm-dd),Resignation Date (yyyy-mm-dd)
C.P. Lotus Corporation,________,00122,Current,CHEARAVANONT Dhanin,___,Executive Director,,2009-12-31,
C.P. Lotus Corporation,________,00121,Current,CHEARAVANON Narong,___,Executive Director,,2001-02-01,
C.P. Lotus Corporation,________,00121,Current,CHEARAVANONT Soopakij,___,Executive Director,CEO,2000-04-14,
Basically, I want to traverse the update file, taking each stock code value from the update file & checking to see if it exists in the master file.
Then, for each matching stock code, I need to check for differences in the Director name value, keeping track of those that don't match.
I've followed this example, but it doesn't seem to do quite what I need (or I don't fully understand it...): "Python: Comparing two CSV files and searching for similar items". This is my code so far:
f1 = file(csvHKX, 'rU')
f2 = file(csvWRHK, 'rU')
f3 = file('results.csv', 'w')
csv1 = csv.reader(f1)
csv2 = csv.reader(f2)
csv3 = csv.writer(f3)
scode = [row for row in csv2]
for hkx_row in csv1:
    for wrhk_row in scode:
        if hkx_row[2] != wrhk_row[2]:
            print 'HKX:', hkx_row
            continue
f1.close()
f2.close()
f3.close()
The update file contains the following stock codes: '00121' & '01003' (for testing).
It seems like the code compares every row of one file against every row of the other and prints a row each time the two stock codes differ. So when one row's stock code is '00121', it gets printed for every '01003' row it is compared against, and vice versa.
But I am only interested in the case where hkx_row[2] can't be found ANYWHERE among the wrhk_row[2] values, i.e. the stock code doesn't appear in the other file at all.
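To make the intended check concrete, here is a minimal sketch of what I think is needed. It is only an illustration under some assumptions: it is written for Python 3 (unlike the Python 2 snippet above), the names `read_rows`, `compare`, `CODE_COL` and `DIRECTOR_COL` are made up for this example, and the sample rows are invented and cut down to the first five columns. The idea is to collect the master's stock codes once, then test each update row for membership instead of comparing row against row.

```python
import csv
import io

CODE_COL = 2      # "Stock Code" column (assumed position, as in the sample above)
DIRECTOR_COL = 4  # "Director's English Name" column (assumed position)


def read_rows(handle):
    """Return the data rows from an open CSV handle, skipping the header line."""
    reader = csv.reader(handle)
    next(reader, None)  # discard the header row
    return list(reader)


def compare(update_rows, master_rows):
    """Split update rows into unknown stock codes and unknown directors."""
    # Map each stock code in the master to the set of director names listed for it.
    master_directors = {}
    for row in master_rows:
        master_directors.setdefault(row[CODE_COL], set()).add(row[DIRECTOR_COL])

    unknown_codes = []      # stock code appears nowhere in the master
    unknown_directors = []  # stock code matches, but the director name does not
    for row in update_rows:
        names = master_directors.get(row[CODE_COL])
        if names is None:
            unknown_codes.append(row)
        elif row[DIRECTOR_COL] not in names:
            unknown_directors.append(row)
    return unknown_codes, unknown_directors


# Invented sample data, standing in for the real master and update files.
HEADER = ("Listed Company's English Name,Listed Company's Chinese Name,"
          "Stock Code,Listing Status,Director's English Name")
master_csv = "\n".join([
    HEADER,
    "Example Co A,,00121,Current,DIRECTOR One",
    "Example Co A,,00121,Current,DIRECTOR Two",
])
update_csv = "\n".join([
    HEADER,
    "Example Co A,,00121,Current,DIRECTOR One",
    "Example Co A,,00121,Current,DIRECTOR Three",
    "Example Co B,,01003,Current,DIRECTOR Four",
])

master_rows = read_rows(io.StringIO(master_csv))
update_rows = read_rows(io.StringIO(update_csv))
unknown_codes, unknown_directors = compare(update_rows, master_rows)

for row in unknown_codes:
    print("Stock code not in master:", row)
for row in unknown_directors:
    print("Director not in master for this code:", row)
```

With real files, the `io.StringIO(...)` objects would be replaced by handles from `open(path, newline='')`. The comparison is an exact string match, so differences in spacing or capitalisation in a director's name would count as a mismatch.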