using (or converting to code) strings for np.select

Question

I have the following code, which works perfectly:

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'TEXT': ['add', 'bede', 'agdd', 'bbbb', 'aaaa'],
                    'PRICE': [622, 200, 100, 459, 250]})
temp = df2['TEXT']
col = 'TEXT'
conditions = [temp.str.contains('a'), temp.str.contains('b'), temp.str.contains('c')]
choices = ["contains a", "contains b", "contains c"]

df2["what_contains"] = np.select(conditions, choices, default=np.nan)

The thing is, the contents of `conditions` have to be read from a CSV file, which of course means that they will be strings. I have tried the following:

conditions = the_csv['cond'].apply(compile, filename='<string>', mode='eval')

but I get an error: "invalid entry 0 in condlist: should be boolean ndarray"

the .csv looks like this:

thanks!!

Are the conditions always of the form `.str.contains(...)`? – modesitt, Jun 12 '20 at 22:02

modesitt · Accepted Answer · 2020-06-12T22:10:45.877

If your conditions are always of the form `.str.contains`, you can avoid `eval` altogether, which is safer and clearer.

# get the conditions as a list of string conditions
given_conds = the_csv['cond'].tolist()
# get the string inside each condition
searching_for = [c.split("('")[-1].split("')")[0] for c in given_conds]
# form the real boolean conditions
conditions = [temp.str.contains(c) for c in searching_for]
# choices text
choices = [f"contains {c}" for c in searching_for]

df2["what_contains"] = np.select(conditions, choices, default=np.nan)
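For anyone who wants to run this end to end, here is a minimal sketch. The CSV layout is not shown in the question, so the `csv_text` below (a single `cond` column holding strings like `temp.str.contains('a')`) is an assumption, as are the helper names `csv_text` and `pattern`. It swaps the `split` calls for a regular expression so that a condition in an unexpected shape raises instead of being silently mangled, and passes `regex=False` so the extracted text is matched literally. It also uses a string default rather than `np.nan`, since some newer NumPy releases may refuse to mix string choices with a float default.

```python
import io
import re

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'TEXT': ['add', 'bede', 'agdd', 'bbbb', 'aaaa'],
                    'PRICE': [622, 200, 100, 459, 250]})
temp = df2['TEXT']

# Hypothetical CSV contents: one condition string per row in a 'cond' column.
csv_text = """cond
temp.str.contains('a')
temp.str.contains('b')
temp.str.contains('c')
"""
the_csv = pd.read_csv(io.StringIO(csv_text))

# Pull the quoted argument out of each ".str.contains('...')" string.
pattern = re.compile(r"""\.str\.contains\((['"])(.*?)\1\)""")
searching_for = []
for cond in the_csv['cond']:
    match = pattern.search(cond)
    if match is None:
        # Fail loudly on anything that is not a simple .str.contains call.
        raise ValueError(f"unsupported condition: {cond!r}")
    searching_for.append(match.group(2))

# Build the real boolean conditions without ever calling eval.
conditions = [temp.str.contains(s, regex=False) for s in searching_for]
choices = [f"contains {s}" for s in searching_for]

# String default (an assumption; the original used np.nan).
df2["what_contains"] = np.select(conditions, choices, default="no match")
print(df2)
```

Because the first matching condition wins in `np.select`, a row such as `'bede'` is only labelled "contains b" since it has no `'a'`.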

bm13563 · Answer 2 · 2020-06-12T21:58:33.763

You could use eval. Whether you should is up to you.

import pandas as pd
import numpy as np

df2 = pd.DataFrame({'TEXT': ['add', 'bede', 'agdd', 'bbbb', 'aaaa'],
                       'PRICE': [622, 200, 100, 459, 250]})
temp=df2['TEXT']
col         = 'TEXT'
conditions  = [ eval("temp.str.contains('a')"), eval("temp.str.contains('b')"), eval("temp.str.contains('c')") ]
choices     = [ "contains a", 'contains b', 'contains c' ]

df2["what_contains"] = np.select(conditions, choices, default=np.nan)

edited Jun 12 '20 at 21:58

answered Jun 12 '20 at 21:50


Thanks, but when I tried running it, it throws "malformed node or string". I tried this: `conditions = [ "temp.str.contains('a')", "temp.str.contains('b')", "temp.str.contains('c')"]` `conditions=map(literal_eval,conditions)` – Gustavo Moreno, Jun 12 '20 at 21:53
Thanks, your code works, but do you know why I can't apply it to the whole series instead of one by one? When I tried `conditions=map(eval,conditions)` it throws an error: name 'temp' is not defined – Gustavo Moreno, Jun 12 '20 at 22:05
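
On the two errors in the comments above: `ast.literal_eval` only accepts Python literals (strings, numbers, lists, dicts and so on), so a method call like `temp.str.contains('a')` is rejected as a malformed node. Plain `eval` looks names up in the namespace of whatever frame ends up calling it, so when it is invoked indirectly (through `Series.apply`, or a lazy `map` that is consumed somewhere else) `temp` may not be visible there. That is one likely cause of the NameError, though the thread does not confirm it. A minimal sketch that sidesteps the issue by handing `eval` an explicit namespace; the names `cond_strings` and `namespace` are illustrative, the string default is an assumption, and this should only be done with strings you trust:

```python
import ast

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'TEXT': ['add', 'bede', 'agdd', 'bbbb', 'aaaa'],
                    'PRICE': [622, 200, 100, 459, 250]})
temp = df2['TEXT']

# Condition strings as they might arrive from the CSV (assumed shape).
cond_strings = ["temp.str.contains('a')",
                "temp.str.contains('b')",
                "temp.str.contains('c')"]

# literal_eval refuses anything that is not a plain literal.
literal_eval_error = None
try:
    ast.literal_eval(cond_strings[0])
except ValueError as exc:
    literal_eval_error = str(exc)

# Pass the names the strings need explicitly, so the lookup does not
# depend on which frame happens to call eval.
namespace = {"temp": temp}
conditions = [eval(c, namespace) for c in cond_strings]
choices = ["contains a", "contains b", "contains c"]

# String default (an assumption; the original used np.nan).
df2["what_contains"] = np.select(conditions, choices, default="no match")
print(df2)
```

The list comprehension also evaluates everything eagerly, so `np.select` receives a real list of boolean Series rather than a `map` object.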
