0

I have a categorization problem. The categorizing rule is:

If

  1. Storage Condition == 'refrigerate' and
  2. 100 < profit Per Unit < 150 and
  3. Inventory Qty <20

is given, restock Action = 'Hold Current stock level'

else restock Action = 'On Sale'.

Here's the dataset I need to run the rules on:

ID,      Fruit, Stroage Condition, Profit Per Unit, In Season or Not, Inventory Qty, Restock Action
1,       Apple,  room temperature,              20,              Yes,           200,
2,      Banana,  room temperature,              65,              Yes,            30,
3,        Pear,       refrigerate,              60,              Yes,           180,
4,  Strawberry,       refrigerate,             185,               No,            70,
5,  Watermelon,  room temperature,               8,               No,            90,
6,       Mango,             Other,              20,               No,           100,
7, DragonFruit,             Other,              65,               No,           105,

Code I have tried:

for i in range(len(df['ID'])):
    if df['Storage Condition'][i] == 'refrigerate' and df['Profit Per Unit'][i] >100 and df['Profit Per Unit'][i] <150 and df['Inventory Qty'][i]  <20:
        df['restock action'] = 'Hold Current stock level'

But I got this error message:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can any one please help?

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Xiaowan Wen
  • 135
  • 1
  • 3
  • 12

2 Answers2

2

Use np.where:

c1=df['Stroage Condition'].eq('refrigerate')
c2=df['Profit Per Unit'].between(100,150)
c3=df['Inventory Qty']<20
df['Restock Action']=np.where(c1&c2&c3,'Hold Current stock level','On Sale')
print(df)

   ID        Fruit Stroage Condition  Profit Per Unit   In Season or Not  Inventory Qty  \
0   1        Apple  room temperature               20                Yes           200   
1   2       Banana  room temperature               65                Yes            30   
2   3         Pear       refrigerate               60                Yes           180   
3   4   Strawberry       refrigerate              185                 No            70   
4   5   Watermelon  room temperature                8                 No            90   
5   6        Mango             Other               20                 No           100   
6   7  DragonFruit             Other               65                 No           105   

  Restock Action  
0        On Sale  
1        On Sale  
2        On Sale  
3        On Sale  
4        On Sale  
5        On Sale  
6        On Sale  

In this case, no row verifies the 3 conditions, so for all rows the result is On Sale

ansev
  • 30,322
  • 5
  • 17
  • 31
  • Thank you! this is very helpful. Actually, I have many multi-condition rules (i.60+) that i need to apply and i just realized that once I set-up all the conditions, I can easily mix them and generate all the rules!!! Thank you very much! – Xiaowan Wen Oct 08 '19 at 04:54
  • I am glad to help you! https://stackoverflow.com/help/someone-answers – ansev Oct 08 '19 at 09:23
  • Hi ansev, an additiona question, if i apply multiple rules, and i more than two outcomes (i.e. hold, on sale, increase inventory, etc.); how can i keep previous results while adding new 'restock action'? Also, if i have multiple conditions all lead to the same conditions, is there a way to do it with np.where? for example, df['Restock Action']=np.where(c4&c5 or c6&c7,'Increase Inventory',''); i make up the c4 c5 c6 and c7 here, but in reality, i receive a traceback again: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). – Xiaowan Wen Oct 08 '19 at 22:28
  • Thanks you for accpet and upvote!. I think you need to see look at numpy.select. But you could do a new question... I will glad to help you! – ansev Oct 09 '19 at 16:53
  • ansev, problem solved now. i used numpy.select. Thank you! – Xiaowan Wen Oct 10 '19 at 17:34
0

If you are not concerned with performance there is a convenience method called apply that can be used. It can take your function and apply it to either rows or columns in your dataframe.

It's good to know how it works and the downsides of using it if you plan on learning more about the pandas library.

When should I ever want to use pandas apply() in my code?

def func(df):
    if df['Stroage Condition'] == 'refrigerate' and 100 < df['Profit Per Unit'] < 150 and df['Inventory Qty'] < 20:
        return 'Hold Current stock level'
    else:
        return 'On Sale'

df['Restock Action'] = df.apply(func, axis='columns')
Michael Gardner
  • 1,693
  • 1
  • 11
  • 13
  • Hi Michael, a follow up question, when i use df['Stroage Condition'] == 'refrigerate' in the if statement of the function, i receive a traceback of ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 0') but if i use row.Storage Condition, it works. Any insights why? – Xiaowan Wen Oct 08 '19 at 23:18
  • One more question, is there a way that i can turn the judgement sentence into string and use them in the function? for example: c1 = df['Stroage Condition'] == 'refrigerate'; and then wrote the function like: def func(df): if c1 and c2 and c3 return 'hold curernt stock level' ? – Xiaowan Wen Oct 08 '19 at 23:40