
Hi, I have a list and a pandas DataFrame whose column elements are lists as well. I want to find out whether any element of each list in the pandas column is present in the other list, then create one column with 1 if found and 0 if not, and another column with the found elements as a comma-separated string. I found a similar question, "Check if one or more elements of a list are present in Pandas column", but couldn't understand how to apply it to my case. Thank you very much! :)

import pandas as pd

letters = ['a', 'b', 'c', 'f', 'j']
df_temp = pd.DataFrame({'letters_list': [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i'], ['j', 'h', 'i']]})

How can I create a new column found, which is 1 if any letter in the list letters is found in letters_list and 0 otherwise, and another column letters_found, which contains the matched letters as a comma-separated string?

  • You can use conditional assignment as explained here: https://stackoverflow.com/questions/28896769/vectorize-conditional-assignment-in-pandas-dataframe – RedProgrammer Feb 17 '22 at 13:27

1 Answer

You need to use a loop here.

Make letters a set so that common elements can be found efficiently with set.intersection, and build the new column with a list comprehension. Then derive "found" by converting "letters_found" to boolean (an empty string becomes False, anything else True) and then to int to get 0/1.

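letters = {'a', 'b', 'c', 'f', 'j'}

# comma-joined, sorted letters that each row's list shares with `letters`
df_temp['letters_found'] = [','.join(sorted(letters.intersection(lst)))
                            for lst in df_temp['letters_list']]
# empty string -> False -> 0; non-empty string -> True -> 1
df_temp['found'] = df_temp['letters_found'].astype(bool).astype(int)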

output:

  letters_list letters_found  found
0    [a, b, c]         a,b,c      1
1    [d, e, f]             f      1
2    [g, h, i]                    0
3    [j, h, i]             j      1
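
For reuse, the same idea could be wrapped in a small function. The following is only a sketch: the helper name add_match_columns and its source_col parameter are hypothetical, not part of pandas, and here the 0/1 flag is computed from the matched list directly rather than from the joined string.

```python
import pandas as pd


def add_match_columns(df, allowed, source_col="letters_list"):
    """Hypothetical helper: return a copy of df with 'letters_found' and 'found' added."""
    allowed = set(allowed)  # set membership makes the intersection cheap
    out = df.copy()         # leave the caller's DataFrame untouched
    # One sorted list of shared elements per row
    matches = [sorted(allowed.intersection(items)) for items in out[source_col]]
    out["letters_found"] = [",".join(m) for m in matches]
    # An empty match list is falsy, so it maps to 0; anything else maps to 1
    out["found"] = [int(bool(m)) for m in matches]
    return out


letters = ["a", "b", "c", "f", "j"]
df_temp = pd.DataFrame(
    {"letters_list": [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"], ["j", "h", "i"]]}
)
result = add_match_columns(df_temp, letters)
print(result)
```

Because the matches are sorted, letters_found is in alphabetical order rather than in the order the letters appear in each row.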
mozway