I found this answer while searching around. The answer by Martijn Pieters is almost what I need, and I'd like to change it slightly. I would have just commented on his answer, but I don't have enough reputation.
For reference, this is his code:
import csv

with open('masterlist.csv', 'rb') as master:
    master_indices = dict((r[1], i) for i, r in enumerate(csv.reader(master)))

with open('hosts.csv', 'rb') as hosts:
    with open('results.csv', 'wb') as results:
        reader = csv.reader(hosts)
        writer = csv.writer(results)

        writer.writerow(next(reader, []) + ['RESULTS'])

        for row in reader:
            index = master_indices.get(row[3])
            if index is not None:
                message = 'FOUND in master list (row {})'.format(index)
            else:
                message = 'NOT FOUND in master list'
            writer.writerow(row + [message])
Let's say I have a masterlist.csv that looks like:
ID, Name, Date
1234,John Smith,01/01/2020
1235,Jane Smith,01/02/2020
1236,Bob Smith,01/02/2020
1236,Bob Smith,01/05/2020
If you were to print out master_indices, you would get (adjusting the code to key on the first column instead of the second, and ignoring the header row):
{'1234': 1, '1235': 2, '1236': 4}
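To make the behaviour easy to reproduce, here is a minimal sketch. It assumes Python 3 and feeds the sample rows through `io.StringIO` instead of opening masterlist.csv (the `SAMPLE` name is made up for this example), and it keys on `r[0]` rather than `r[1]`. Note that the header row also ends up in the dictionary as `'ID': 0`.

```python
import csv
import io

# Inline copy of the sample masterlist.csv from the question.
# Assumption: Python 3, so the text is read via io.StringIO.
SAMPLE = """ID, Name, Date
1234,John Smith,01/01/2020
1235,Jane Smith,01/02/2020
1236,Bob Smith,01/02/2020
1236,Bob Smith,01/05/2020
"""

reader = csv.reader(io.StringIO(SAMPLE))

# Key on the first column (r[0], the ID). Dict keys are unique, so the
# second '1236' row overwrites the value stored for the first one.
master_indices = dict((r[0], i) for i, r in enumerate(reader))

print(master_indices)
# {'ID': 0, '1234': 1, '1235': 2, '1236': 4}
```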
The code above does almost exactly what I need, except that it only keeps one entry for Bob Smith's ID in the 'master_indices' dictionary even though the ID appears twice in the file (the later row overwrites the earlier one). Essentially, what do I need to change in the 'master_indices' code so that every line is recorded, regardless of how many times an ID appears in the CSV file? So I get:
{'1234': 1, '1235': 2, '1236': 3, '1236': 4}
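A Python dict cannot hold the same key twice, so the literal output above is not possible. One shape that would still keep every row is a list of row numbers per ID. The following is only a rough sketch of that idea, assuming Python 3 and `collections.defaultdict`, neither of which is in the original code; `SAMPLE` is again a made-up stand-in for masterlist.csv.

```python
import csv
import io
from collections import defaultdict

# Inline copy of the sample masterlist.csv (hypothetical stand-in for the file).
SAMPLE = """ID, Name, Date
1234,John Smith,01/01/2020
1235,Jane Smith,01/02/2020
1236,Bob Smith,01/02/2020
1236,Bob Smith,01/05/2020
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader, None)  # skip the header row

# Map each ID to a list of every row number where it appears.
master_indices = defaultdict(list)
# start=1 keeps row numbers aligned with the question (the header is row 0).
for i, row in enumerate(reader, start=1):
    master_indices[row[0]].append(i)

print(dict(master_indices))
# {'1234': [1], '1235': [2], '1236': [3, 4]}
```

With this shape, `master_indices.get(row[3])` would return a list of row numbers for a known ID and still return `None` for an unknown one, so the FOUND / NOT FOUND check could stay roughly as it is.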
Any help is much appreciated! Thanks!