I have 2 data frames.

The first one has multiple comma-delimited strings per cell and about 400k rows (see this image). The second one contains a single string per cell and about 100k rows (see the other image). I want to check whether each string in the second data frame appears in any cell of the first data frame, and store the result in another column.

The nested loop below makes roughly 40 billion comparisons (100,884 × 397,767), so it will take ages to run.

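# Compare every row of df_old (100,884 rows) with every row of df_sfdc (397,767 rows).
for i in range(100884):
    for j in range(397767):
        # Substring test: is the single string contained in the comma-delimited cell?
        if df_old.iloc[i, 3] in df_sfdc.iloc[j, 0]:
            df_old.iloc[i, 4] = "True"
            print(i, j, "True")
        else:
            print(i, j, "False")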
  • Please include the actual strings instead of linking to the images. – Jan Jun 08 '20 at 17:08
  • You'll probably want to check `.isin`: https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql. As for the multiple values per cell, you can look at https://stackoverflow.com/questions/62199414/using-pandas-to-filter-string-in-cell-with-multiple-values/62199563#62199563 to see how to `explode` them to a unique row per cell before you check the condition. (For the second link, you'll basically want `.isin` + `any(level=0)` after the explode.) – ALollz Jun 08 '20 at 17:10
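
The explode-then-`.isin` idea from the comment above could look roughly like the sketch below. The sample values and the column names `values`, `value`, `found` and `has_match` are made up for illustration, since the real data was only posted as images. Note that this tests for an exact match against each comma-separated item, whereas the `in` operator in the loop does a substring test, so the results can differ if partial matches are intended. The `any(level=0)` step is written as `groupby(level=0).any()`, which should behave the same way on recent pandas versions.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames; real column names will differ.
df_sfdc = pd.DataFrame({"values": ["apple, pear", "plum", "fig,kiwi"]})
df_old = pd.DataFrame({"value": ["pear", "lemon", "kiwi"]})

# Split each comma-delimited cell into a list, then explode to one item per row.
# explode() repeats the original row index for every item of the same cell.
tokens = df_sfdc["values"].str.split(",").explode().str.strip()

# Vectorised membership test: True where the single string is one of the items.
df_old["found"] = df_old["value"].isin(set(tokens.dropna()))

# Reverse direction: flag the df_sfdc rows that contain at least one match,
# collapsing the exploded rows back to one result per original row.
df_sfdc["has_match"] = tokens.isin(df_old["value"]).groupby(level=0).any()

print(df_old)
print(df_sfdc)
```

Building the set of items once and testing membership against it avoids comparing every row of one frame with every row of the other, which is where the nested loop spends its time.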

0 Answers