I have a list of about 90k strings and a DataFrame with several columns. I want to check whether each string in the list appears in column_1 and, if it does, assign that same value to column_2 in the matching row.
I can do this:
for i in range(len(my_list)):
    item = my_list[i]
    for j in range(len(df)):
        if item == df['column_1'][j]:
            # .loc avoids chained assignment, which may not write back to df
            df.loc[j, 'column_2'] = item
But I would prefer to avoid the nested loops.
I tried this:
for item in my_list:
    if item in list(df['column_1']):
        # index label of the first row whose column_1 equals item
        position = df[df['column_1'] == item].index.values[0]
        df.loc[position, 'column_2'] = item
but I think this solution is even slower and harder to read. Can this operation be done with a simple list comprehension?
Edit.
The second solution is considerably faster, by about an order of magnitude. Why is that? It seems that in that case it has to search for the match twice:
here:
if item in list(df['column_1']):
and here:
position = df[df['column_1'] == item].index.values[0]
Still, I would prefer a simpler solution.
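To make clearer what I'm after, here is a rough sketch of the kind of loop-free version I have in mind. It assumes that `Series.isin` plus a boolean mask with `.loc` is an acceptable replacement for the loops. The sample list and DataFrame are made up for illustration. Since a matching list item is equal to the value in column_1, the sketch copies column_1 into column_2 on the matching rows.

```python
import pandas as pd

# Hypothetical sample data standing in for the real 90k-item list and DataFrame.
my_list = ["a", "c", "x"]
df = pd.DataFrame({
    "column_1": ["a", "b", "c", "a"],
    "column_2": [None, None, None, None],
})

# Boolean mask: True on rows where column_1 holds a string that is in my_list.
# Converting to a set is optional; it just drops duplicate list entries.
mask = df["column_1"].isin(set(my_list))

# On matching rows the list item equals the column_1 value,
# so copy column_1 into column_2 for those rows only.
df.loc[mask, "column_2"] = df.loc[mask, "column_1"]
```

Note that this fills every matching row, like my first version, whereas my second version only fills the first match for each item (because of `.index.values[0]`).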