The following code works for files of up to about 3 million records, but beyond that I run out of memory, because I read both files into lists and then loop over the lists to find matches.
From previous postings I gather that I should process one line at a time, but I cannot find any posting that shows how to read a CSV file line by line and run it through two nested loops as in my code below.
Any help would be greatly appreciated. Thank you in advance.
import csv
# open two csv files and read into lists lsts and lstl
with open('small.csv') as s:
    sml = csv.reader(s)
    lsts = [tuple(row) for row in sml]

with open('large.csv') as l:
    lrg = csv.reader(l)
    lstl = [tuple(row) for row in lrg]  # can be too large for memory

# find a match and print
for rows in lsts:
    for rowl in lstl:
        if rowl[7] == rows[0]:  # keys match
            print(rowl[7], rowl[2])  # print the required data from the large file
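
Based on what I have read, I think the idea is to keep only the small file in memory and stream the large file row by row, replacing the inner loop with a set lookup. Below is a minimal sketch of what I mean, assuming the same file names and column positions as above; I am not sure this is the right way to do it.

import csv

# build a set of match keys from the small file (small enough to hold in memory)
with open('small.csv', newline='') as s:
    keys = {row[0] for row in csv.reader(s)}

# stream the large file one row at a time instead of reading it all into a list
with open('large.csv', newline='') as l:
    for rowl in csv.reader(l):
        if rowl[7] in keys:  # set membership is an O(1) average-case lookup
            print(rowl[7], rowl[2])  # print the required data from the large file

If I understand correctly, this only ever holds one row of the large file in memory at a time, and the set lookup avoids rescanning all the small file's rows for every large-file row.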