Add new column to Pandas dataframe using conditional values from another column

Question

I would like to add a new column, retailer_relationship, to my dataframe.

I would like each row value of this new column to be 'TRUE' if the retailer column value starts with any item in the list list_of_relationships, and 'FALSE' otherwise.

What I've tried:

list_of_relationships = ("retailer1","retailer2","retailer3")

for i in df.index:
    for relationship in list_of_relationships:            
        if df.iloc[i]['retailer'].str.startswith(relationship):
            df.at[i, 'retailer_relationship'] = "TRUE"
        else:
            df.at[i, 'retailer_relationship'] = "FALSE"

Possible duplicate of [Pandas conditional creation of a series/dataframe column](https://stackoverflow.com/questions/19913659/pandas-conditional-creation-of-a-series-dataframe-column) — cwalvoort, May 16 '19 at 02:38

gmds · Accepted Answer · 2019-05-16T02:51:10.717

2

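You can use a regular expression combining the ^ special character, which anchors the match at the beginning of the string, with an alternation matching any element of list_of_relationships, since startswith does not accept regexes. The alternation is wrapped in a non-capturing group so that ^ applies to every alternative rather than only the first, and each prefix is passed through re.escape in case it contains regex metacharacters:

import re

regex = re.compile('^(?:' + '|'.join(map(re.escape, list_of_relationships)) + ')')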

df['retailer_relationship'] = df['retailer'].str.contains(regex).map({True: 'TRUE', False: 'FALSE'})

Since you want the literal strings 'TRUE' and 'FALSE', we can then use map to convert the booleans to strings.
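For anyone who wants to try this end to end, here is a minimal sketch of the regex approach. The sample dataframe is hypothetical (the question does not show the real data), and the sketch passes the pattern to str.contains as a plain string. The row "other_retailer2" is included to show that the anchor applies to every prefix once the alternation is grouped.

```python
import re

import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
df = pd.DataFrame(
    {"retailer": ["retailer1_north", "retailer3", "other_retailer2", "shop"]}
)
list_of_relationships = ("retailer1", "retailer2", "retailer3")

# Escape each prefix, then group the alternation so '^' anchors all of them.
pattern = "^(?:" + "|".join(map(re.escape, list_of_relationships)) + ")"

# Boolean mask: True where the retailer value starts with one of the prefixes.
is_match = df["retailer"].str.contains(pattern, regex=True)

# Convert the booleans to the literal strings the question asks for.
df["retailer_relationship"] = is_match.map({True: "TRUE", False: "FALSE"})
print(df)
```

Without the grouping, the pattern would read as "starts with retailer1, or contains retailer2 or retailer3 anywhere", so "other_retailer2" would be flagged as a match.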

An alternative method that is slightly faster, though I don't think that'll matter:

df['retailer_relationship'] = df['retailer'].str.contains(regex).transform(str).str.upper()

edited May 16 '19 at 02:51

answered May 16 '19 at 02:26

gmds


I'm getting: TypeError: can only join an iterable (on the regex line) – Deskjokey May 16 '19 at 02:44
@Deskjokey Did you run `retailer_relationship = ("retailer1","retailer2","retailer3")` before that? Actually, why do you call it `retailer_relationship` when you iterate through `list_of_relationships`? – gmds May 16 '19 at 02:44
It works. Just had to change: regex = re.compile('^' + '|'.join(list_of_relationships)) – Deskjokey May 16 '19 at 02:47
@Deskjokey Yup, I wrote my answer based on the original question. I suggest you edit it to change the reference to `list_of_relationships`. – gmds May 16 '19 at 02:48

score 0 · Answer 2 · answered May 16 '19 at 02:24

0

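See if this works for you. It would help to share a sample of your df, or dummy data representing it.

# Create the column first; df.loc['retailer_relationship'] = False would add a row instead
df['retailer_relationship'] = False
# Note: isin matches whole values only, not prefixes
df.loc[df['retailer'].isin(list_of_relationships), 'retailer_relationship'] = True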
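One thing to keep in mind with this approach: isin tests whether the whole value is in the collection, which is not the same as the "starts with" condition in the question. A small sketch of the difference, using hypothetical sample data and Python's built-in str.startswith (which accepts a tuple of prefixes) for the comparison:

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"retailer": ["retailer1", "retailer1_north", "shop"]})
list_of_relationships = ("retailer1", "retailer2", "retailer3")

# Exact membership: only values that equal one of the entries match.
exact = df["retailer"].isin(list_of_relationships)

# Prefix test: values that merely begin with one of the entries also match.
prefix = df["retailer"].map(lambda name: name.startswith(list_of_relationships))

print(pd.DataFrame({"retailer": df["retailer"], "isin": exact, "startswith": prefix}))
```

Here "retailer1_north" is picked up by the prefix test but not by isin, so isin only fits if the retailer column holds exact names.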

answered May 16 '19 at 02:24

Vasu Devan


score 0 · Answer 3 · answered May 16 '19 at 02:32

0

You can still use startswith in pandas, by passing it a tuple of prefixes:

df['retailer_relationship'] = df['retailer'].str.startswith(tuple(list_of_relationships))
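Note that this gives a boolean column, while the question asks for the strings 'TRUE' and 'FALSE'. Here is a rough sketch of one way to bridge that gap with numpy.where. The sample data, including the missing value, is hypothetical. As far as I know, tuple arguments to Series.str.startswith are only supported in fairly recent pandas releases (1.5 and later, to my understanding), so treat that as an assumption to check against your version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including a missing retailer name.
df = pd.DataFrame({"retailer": ["retailer2_east", "shop", None]})
list_of_relationships = ("retailer1", "retailer2", "retailer3")

# na=False makes missing values count as "does not start with a prefix".
starts = df["retailer"].str.startswith(list_of_relationships, na=False)

# Map the boolean mask to the literal strings requested in the question.
df["retailer_relationship"] = np.where(starts, "TRUE", "FALSE")
print(df)
```

This avoids building a regex at all, so there is nothing to escape or anchor.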

answered May 16 '19 at 02:32

BENY
