I have a file that looks like the one below. Note that the real file contains more than 100,000 records.
blue black red 250
red black blue 140
black yellow purple 100
orange blue blue 140
blue black red 250
red black blue 140
black yellow purple 700
orange blue blue 200
I also have a list which contains the following values:
my_list = ['140', '700', '800']
Now I want the following:
- If one of the values of my_list occurs in row[3] of the file, I want to append the whole record to a new list.
- If one of the values of my_list does not occur in row[3] of the file, I want to append the value itself, and the rest of the values should be 'unknown'.
This is my code:
import csv

new_list = []
with open(my_file, 'r') as input:
    reader = csv.reader(input, delimiter = '\t')
    row3_list = []
    for row in reader:
        row3_list.append(row[3])
        for my_number in my_list:
            if my_number in row3_list:
                new_list.append(row)
            elif my_number not in row3_list:
                new_list.append(['unknown', 'unknown', 'unknown', row[3]])
This is my desired output:
red black blue 140
orange blue blue 140
red black blue 140
black yellow purple 700
unknown unknown unknown 800
My problem: as I mentioned, my file contains a large number of records, possibly more than 100,000, so the approach above takes ages. I have been waiting for output for about 15 minutes now and still have nothing.
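
For reference, the behaviour described in the two bullet points could be sketched as a single pass over the file with set lookups, instead of searching a growing list for every row. This is only a rough sketch, not a tested solution for the real data: the function name filter_rows and its parameters are made up for illustration, and it assumes the file is tab-delimited (as in my code above) with the number in the fourth column.

```python
import csv


def filter_rows(path, wanted_values):
    """Hypothetical helper: collect rows whose 4th column is in wanted_values.

    Assumes a tab-delimited file with at least four columns per record.
    """
    wanted = set(wanted_values)  # set membership is O(1) on average
    seen = set()                 # wanted values that actually occur in the file
    result = []

    with open(path, newline='') as handle:
        for row in csv.reader(handle, delimiter='\t'):
            # Skip blank or short lines instead of raising IndexError.
            if len(row) > 3 and row[3] in wanted:
                result.append(row)
                seen.add(row[3])

    # Add a placeholder record for every wanted value that never appeared,
    # keeping the order of the original list.
    for value in wanted_values:
        if value not in seen:
            result.append(['unknown', 'unknown', 'unknown', value])

    return result
```

Calling filter_rows(my_file, my_list) would then return the matching records in file order, followed by one 'unknown' record per missing value. Each row is checked once, so the work grows linearly with the number of records.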