I'm trying to add a column to a DataFrame whose value depends on whether another column's value contains any of the strings in a list.

The list is:

services = [
        "TELECOM",
        "AYSA",
        "PERSONAL"
]

And so far I've tried:

payments["category"] = "services" if payments["concept"].contains(service for service in services) else ""

And this:

payments["category"] = payments["concept"].apply(lambda x: "services" if x.contains(service) for service in services) else ""

Among some other variations... I've seen other questions, but they mostly deal with the opposite problem (checking whether a column's value is contained in one of the strings in a list).

I could use your help! Thanks!!

Dijkie85

2 Answers

You can use np.where and str.contains:

payments['category'] = np.where(payments['concept'].str.contains('|'.join(services)),
                                'services', '')

Output:

        concept  category
0       TELECOM  services
1          AYSA  services
2      PERSONAL  services
3  other things          
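
One caveat, as a rough sketch rather than a definitive fix: str.contains treats the joined string as a regular expression by default and returns NaN for missing values. The sample data below is made up for illustration; escaping each term with re.escape and passing na=False keeps the match literal and the mask strictly boolean.

```python
import re

import numpy as np
import pandas as pd

services = ["TELECOM", "AYSA", "PERSONAL"]

# Hypothetical sample data (not from the question), including a missing value
payments = pd.DataFrame(
    {"concept": ["PAGO TELECOM 123", "AYSA", "other things", None]}
)

# Escape each term so regex metacharacters are matched literally
pattern = "|".join(re.escape(s) for s in services)

# na=False makes rows with a missing concept count as "no match"
mask = payments["concept"].str.contains(pattern, na=False)

payments["category"] = np.where(mask, "services", "")
```

The match is case-sensitive; passing case=False to str.contains would relax that if the concepts are not consistently upper case.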
Quang Hoang
  • Thanks, this worked great! But would it be possible to skip rows where another category has already been defined? I'm trying to run a second line like `payments['category'] = np.where((payments['concept'].str.contains('|'.join(online_shopping))) & (payments["category"] == ''), 'online_shopping', '')` but that second condition isn't working... – Dijkie85 Jun 01 '20 at 01:44
  • In that case, you should use np.select, which allows multiple conditions. – Quang Hoang Jun 01 '20 at 01:46
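
A minimal sketch of the np.select suggestion, under stated assumptions: the contents of online_shopping and the sample rows are hypothetical, since the thread never shows them. np.select walks the conditions in order and uses the first one that matches, so a row already claimed by "services" is not overwritten by a later category.

```python
import numpy as np
import pandas as pd

services = ["TELECOM", "AYSA", "PERSONAL"]
online_shopping = ["AMAZON", "EBAY"]  # hypothetical list, not given in the thread

# Hypothetical sample data; the third row matches both lists on purpose
payments = pd.DataFrame(
    {"concept": ["TELECOM", "AMAZON ORDER", "AYSA EBAY", "other things"]}
)

concept = payments["concept"]
conditions = [
    concept.str.contains("|".join(services), na=False),
    concept.str.contains("|".join(online_shopping), na=False),
]
choices = ["services", "online_shopping"]

# The first matching condition wins; rows matching none get the default
payments["category"] = np.select(conditions, choices, default="")
```

This also sidesteps the problem with the chained np.where call in the comment above, where the final '' argument resets every non-matching row, including the ones already labelled "services".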

I think you can use isin, which matches rows where the whole value equals one of the list entries:

payments['category'] = np.where(
    payments['concept'].isin(services),
    'services', '')

A complete example:

import pandas
import numpy

dic = {"concept": ["TELECOM", "NULL"]}

payments = pandas.DataFrame.from_dict(dic)

payments["category"] = numpy.where(payments["concept"].isin(["TELECOM", "AYSA", "PERSONAL"]), "services", "")

print(payments)
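
To make the difference from str.contains concrete, here is a small comparison on made-up values (the "TELECOM ARGENTINA" row is an assumption for illustration). isin tests for exact membership, while str.contains looks for the term anywhere inside the value, so which one fits depends on whether the concepts carry extra text.

```python
import pandas as pd

services = ["TELECOM", "AYSA", "PERSONAL"]

# Hypothetical values: one exact match, one partial match, one non-match
concept = pd.Series(["TELECOM", "TELECOM ARGENTINA", "other things"])

# True only when the whole value is one of the list entries
exact = concept.isin(services)

# True when any list entry appears somewhere inside the value
partial = concept.str.contains("|".join(services))
```

Here exact flags only the first row, while partial flags the first two.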
D. Seah