2

I am attempting to write a specific value to a column in a df depending on if another column does or does not contain certain text. I have 3 outputs in the destination columns: Distributor, OEM and End User.

import pandas as pd
df = pd.read_excel("Customer Records.xlsx")

#Checking for distributor pricing tag
df.loc[df['CustomerRoles'].str.contains('Discount-Distributor-STD'), 'Customer Type'] = 'Distributor'
    
#Checking for OEM pricing tag
df.loc[df['CustomerRoles'].str.contains('Discount-OEM-STD'),         'Customer Type'] = 'OEM'

Those lines of code work as intended.

The last thing I need is for every other string that contains neither of those phrases to print "End User" in the 'Customer Type' column.

Mario
  • 1,631
  • 2
  • 21
  • 51

2 Answers2

3

You can use np.select:

conds = [
    df['CustomerRoles'].str.contains('Discount-Distributor-STD'),
    df['CustomerRoles'].str.contains('Discount-OEM-STD')
]
choices = ['Distributor', 'OEM']

df['Customer Type'] = np.select(condlist=conds, choicelist=choices, default='End User')
Corralien
  • 109,409
  • 8
  • 28
  • 52
1

| means "or", and ~ flips the boolean, so False -> True. You can combine these together:

df.loc[~(df['CustomerRoles'].str.contains('Discount-Distributor-STD')\
         | df['CustomerRoles'].str.contains('Discount-OEM-STD')),
       "Customer Type"] = "End User"

There is a very good explanation of the different logical operators here by @cs95.

Rawson
  • 2,637
  • 1
  • 5
  • 14