Python: Fill dataframe column depending on other column

Question

My pandas dataframe:

ID	String	Pet
1	this is a cat
2	hello dog

I would like to extract the pet from the 'String' column and fill the 'Pet' column accordingly. The third row should be empty, and not filled by default.

My attempt:

df['Pet'] = np.where(df['String'].str.contains("cat"), "cat",
            np.where(df['String'].str.contains("dog"), "dog", '0'))

Unfortunately the empty (third) row also gets filled in my attempt.

Thank you in advance for your help!

So you need to change `'0'` to `''`? – jezrael, Oct 07 '22 at 09:17
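A minimal sketch of what the comment suggests, assuming the third row holds a missing value (the table in the question does not show it, so the sample data here is hypothetical). The default is changed from `'0'` to `''`, and `na=False` is passed to `str.contains` so a missing string counts as "no match" instead of slipping through as a truthy NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the third row is assumed to be missing (None).
df = pd.DataFrame({"ID": [1, 2, 3],
                   "String": ["this is a cat", "hello dog", None]})

# na=False makes str.contains return False for missing values.
is_cat = df["String"].str.contains("cat", na=False)
is_dog = df["String"].str.contains("dog", na=False)

# Use an empty string as the fallback instead of '0'.
df["Pet"] = np.where(is_cat, "cat", np.where(is_dog, "dog", ""))
```

With this data the 'Pet' column should stay empty for the third row.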

score 1 · Answer 1 · answered Oct 07 '22 at 09:19

One approach is to first create a list of the strings that count as pets, such as

pets = ['cat', 'dog', 'bird']

Then pandas.Series.str.extract, combined with a regular expression (the flag comes from the re module), does the work

import re
    
df['Pet'] = df['String'].str.extract(f'({"|".join(pets)})', flags=re.IGNORECASE, expand=False)

[Out]:

   ID         String  Pet
0   1  this is a cat  cat
1   2      hello dog  dog

Note:

flags=re.IGNORECASE makes this approach case insensitive.
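One thing to keep in mind with the case-insensitive match: str.extract returns the text as it appears in the string, so "CAT" would come back as "CAT". A small sketch, assuming you want lowercase pet names and that a row without any pet should stay NaN. The three-row sample data is made up for illustration, and re.escape is an extra precaution not in the original answer, in case a pet name ever contains regex metacharacters:

```python
import re
import pandas as pd

# Hypothetical sample data with mixed case and one row without a pet.
df = pd.DataFrame({"ID": [1, 2, 3],
                   "String": ["this is a CAT", "hello dog", "nothing here"]})

pets = ["cat", "dog", "bird"]

# Escape each name so it is matched literally, then build one capture group.
pattern = "(" + "|".join(re.escape(p) for p in pets) + ")"

# Extract the first match per row, then normalise the case.
df["Pet"] = (
    df["String"]
    .str.extract(pattern, flags=re.IGNORECASE, expand=False)
    .str.lower()
)
```

Rows with no match are left as NaN, so nothing is filled in by default.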

score 1 · Answer 2 · answered Oct 07 '22 at 09:19

Looks like you could use a regex with str.extract and fillna for your default value:

animals = ['cat', 'dog']
regex = '|'.join(animals)

df['Pet'] = df['String'].str.extract(f'(?i)({regex})', expand=False).fillna(0)

output:

   ID         String  Pet
0   1  this is a cat  cat
1   2      hello dog  dog
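Since the question asks for non-matching rows to stay empty, the default passed to fillna can be swapped for an empty string. A short sketch of that variant, with a hypothetical third row that contains no animal:

```python
import pandas as pd

# Hypothetical data: the third row mentions no animal from the list.
df = pd.DataFrame({"ID": [1, 2, 3],
                   "String": ["this is a cat", "hello dog", "no animal here"]})

animals = ["cat", "dog"]
regex = "|".join(animals)

# Rows without a match come back as NaN from str.extract.
extracted = df["String"].str.extract(f"(?i)({regex})", expand=False)

# Replace NaN with an empty string rather than 0.
df["Pet"] = extracted.fillna("")
```

Dropping the fillna call altogether keeps NaN in those rows, which may be preferable if you later want to test for missing values with isna().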
