I have the following data frame, df1:

       string             lists
0      i have a dog       ['fox', 'dog', 'cat']
1      there is a cat     ['dog', 'house', 'car']
2      hello everyone     ['hi', 'hello', 'everyone']
3      hi my name is Joe  ['name', 'was', 'Joe']

I'm trying to return a data frame, df2, that looks like this:

       string             lists                         new_string
0      i have a dog       ['fox', 'dog', 'cat']         i have a
1      there is a cat     ['dog', 'house', 'car']       there is a cat
2      hello everyone     ['hi', 'hello', 'everyone']   
3      hi my name is Joe  ['name', 'was', 'Joe']        hi my is

I've referenced other questions such as https://stackoverflow.com/a/40493603/5879909, but I'm having trouble searching through a list in a column as opposed to a preset list.

Gonçalo Peres
mjp

1 Answer

Assuming the dataframe is df and the goal is to create a new column, new_string, that holds each row's string with every word found in that row's lists entry removed, the following will do the job:

df['new_string'] = df['string'].apply(lambda x: ' '.join([word for word in x.split() if word not in df['lists'][df['string'] == x].values[0]]))

[Out]:
              string                  lists      new_string
0       i have a dog        [fox, dog, cat]        i have a
1     there is a cat      [dog, house, car]  there is a cat
2     hello everyone  [hi, hello, everyone]                
3  hi my name is Joe       [name, was, Joe]        hi my is
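
Note that the one-liner above finds each row's list by matching on the string value, so it rescans the frame for every row and uses the first match if two rows share the same string. A row-wise alternative might look like the sketch below. The helper name drop_listed_words is only illustrative, and I'm assuming the lists column holds actual Python lists of words rather than their string representation.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumes real Python lists in 'lists').
df = pd.DataFrame({
    'string': ['i have a dog', 'there is a cat', 'hello everyone', 'hi my name is Joe'],
    'lists': [['fox', 'dog', 'cat'], ['dog', 'house', 'car'],
              ['hi', 'hello', 'everyone'], ['name', 'was', 'Joe']],
})

def drop_listed_words(row):
    """Return row['string'] without the words found in row['lists'] (hypothetical helper)."""
    banned = set(row['lists'])  # set membership is cheaper than scanning a list
    return ' '.join(word for word in row['string'].split() if word not in banned)

# axis=1 hands each row to the helper, so the string and its list stay paired
# even when two rows contain the same string.
df['new_string'] = df.apply(drop_listed_words, axis=1)
print(df)
```

This should give the same new_string values as the output shown above. Matching is still case-sensitive and works on whole whitespace-separated words only.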
Gonçalo Peres
  • Can this be amended to account for multi-word strings, and also be case-insensitive? For example, if **string** is `I HAVE A DOG` and **lists** is `[fox, have a, cat]`, then **new_string** should equal `I DOG`. – mjp Sep 19 '22 at 16:41
  • @mjp To make it case-insensitive, adjusting the `if` to `if word.lower() not in df['lists'][df['string'] == x].values[0]` should do the work (assuming the entries in lists are lowercase). – Gonçalo Peres Sep 19 '22 at 18:24
  • As for the multi-word strings, you might want to clarify that. I'd even recommend asking a separate question, and someone (or I) might be able to help. – Gonçalo Peres Sep 19 '22 at 18:24
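
For the multi-word, case-insensitive variant asked about in the comments, one possible approach is a regular expression built per row. This is only a sketch under some assumptions: the helper name remove_phrases is made up for illustration, phrases are assumed to start and end with word characters (so that `\b` word boundaries behave as expected), and a phrase is assumed to match only when separated by single spaces exactly as written in the list.

```python
import re
import pandas as pd

def remove_phrases(text, phrases):
    """Remove each phrase from text, ignoring case (hypothetical helper)."""
    if not phrases:
        return text
    # Try longer phrases first so 'have a' takes priority over a shorter 'have'.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = r'\b(?:' + '|'.join(re.escape(p) for p in ordered) + r')\b'
    stripped = re.sub(pattern, ' ', text, flags=re.IGNORECASE)
    # Collapse the whitespace left behind by the removed phrases.
    return ' '.join(stripped.split())

# Example rows: the first comes from the comment above, the second from the question.
df = pd.DataFrame({
    'string': ['I HAVE A DOG', 'hi my name is Joe'],
    'lists': [['fox', 'have a', 'cat'], ['name', 'was', 'Joe']],
})

df['new_string'] = [remove_phrases(s, l) for s, l in zip(df['string'], df['lists'])]
print(df)
```

With the example from the comment, `I HAVE A DOG` and `[fox, have a, cat]` yield `I DOG`. Because of the word boundaries, a listed word such as `cat` would not be stripped out of a longer word like `category`.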