I have looked at a bunch of similar posts on here, but none really answered my question:

Df:

pc_cogs = pd.DataFrame({'Product': ['Product 1','Product 95','Product 10','Product 44','Product 100','Product 69','Product 78','Product 3'], 
              'Pack_type':['Case - 4x6 - 12oz - Can', 'Case - 4x6 - 12oz - Can', 'Case - 6x4 - 16oz - Can','Case - 6x4 - 12oz - Can',
                           'Case - 6x4 - 16oz - Can','Cask - Pin', 'Case - 12x - 22oz - Bottle','Case - 6x4 - 12oz - Can'],
             'Keg Category': ['REGULAR', 'SEASONAL', 'WINTER', 'SEASONAL', 'SEASONAL', 'FALL', 'SEASONAL', 'WINTER'],
             'Unit_Sale_Price': [5, 8, 10, 22, 45, 56, 33, 23],
             'New Price': [0,0,0,0,0,0,0,0]})

Answer needed:

new_pc_cogs = pd.DataFrame({'Product': ['Product 1','Product 95','Product 10','Product 44','Product 100','Product 69','Product 78','Product 3'], 
              'Pack_type':['Case - 4x6 - 12oz - Can', 'Case - 4x6 - 12oz - Can', 'Case - 6x4 - 16oz - Can','Case - 6x4 - 12oz - Can',
                           'Case - 6x4 - 16oz - Can','Cask - Pin', 'Case - 12x - 22oz - Bottle','Case - 6x4 - 12oz - Can'],
             'Keg Category': ['REGULAR', 'SEASONAL', 'WINTER', 'SEASONAL', 'SEASONAL', 'FALL', 'SEASONAL', 'WINTER'],
             'Unit_Sale_Price': [5, 8, 10, 22, 45, 56, 33, 23],
             'New Price': [7.36,7.54,13.47,21.87,44.87,56,33,23]})

I am trying to add a column. These are the stored lists and variables used in the conditionals:

## Pack type
four_pack = 'Case - 6x4 - 16oz - Can'
six_pack = 'Case - 4x6 - 12oz - Can'
four_pack2 = 'Case - 6x4 - 12oz - Can'

## Core Brands

core_brands = ['Prod 1','Prod 2', 'Prod 3',
               'Prod 4', 'Prod 5', 'Prod 6', 'Prod 7','Prod 8']

cali_brands = ['Prod 9', 'Prod 10']

Attempt 1 :

for product in pc_cogs['Product']:
    package = pc_cogs['Pack_type']
    category = pc_cogs['Keg Category']
    price = pc_cogs['Unit_sale_price']
    if product in core_brands & package == six_pack: 
        pc_cogs['Price Change'] = price + 2.36
    elif category == 'SEASONAL' & package == six_pack: 
        pc_cogs['Price Change'] = price - .46
    elif product in cali_brands & package == four_pack: 
        pc_cogs['Price Change'] = price + 3.47
    elif (category == 'SEASONAL') & (package == four_pack | package == four_pack2):
        pc_cogs['Price Change'] = price - .13      
    else: 
        pc_cogs['Price Change'] = 0

Error: operands could not be broadcast together with shapes (8,) (611,). The last elif has both conditions enclosed in parentheses. I also tried this with the other conditionals, but it didn't work.

Attempt 2:

pc_cogs['Price Change'][(pc_cogs['Product'] in core_brands) & (pc_cogs['Package'] == six_pack)] = pc_cogs['Unit_sale_price'] + 2.36

Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). I saw the link for a popular post showing this error. It said to change from the words and/or to & and |, which I did.

I also saw this one: Creating Column in Dataframe Using Multiple Conditions.

But it didn't help.

Any help would be greatly appreciated

chasedcribbet
  • Kindly show a sample dataframe of your starting input and then one with your desired output. Please see: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – David Erickson Sep 29 '20 at 01:54
  • Got it. Give me 20 minutes – chasedcribbet Sep 29 '20 at 02:12
  • The best part of a dataframe is that you don't need to loop to update columns using a condition. You can just give the condition, and if it is true, it will update itself. – Joe Ferndz Sep 29 '20 at 02:28
  • Ok. I have included two df templates for beginning and end. I checked them in Jupyter to make sure they looked right. – chasedcribbet Sep 29 '20 at 02:29
  • I thought so, Joe. I have only been at it for 2 months and I constantly get caught in between the Python way and forget about the flexibility of the df's using pandas. – chasedcribbet Sep 29 '20 at 02:30
  • Please explain the logic of your calculation as well. – Henry Yik Sep 29 '20 at 02:47
  • You have Product 1 through 10. How are you addressing Products 11 through 100? Anything not in 1 through 10 goes through different logic. Is that what you want? – Joe Ferndz Sep 29 '20 at 03:05
  • Instead of doing this `pc_cogs['Product'] in core_brands`, try `pc_cogs['Product'].isin( core_brands)` – Joe Ferndz Sep 29 '20 at 03:23
  • You may also want to look at this to check against a [list of values](https://stackoverflow.com/questions/18250298/how-to-check-if-a-value-is-in-the-list-in-selection-from-pandas-data-frame) – Joe Ferndz Sep 29 '20 at 03:28
  • All answers on here were great. I checked David's because I believe it will have more pedagogical benefit for people who are new to python. Quang's answer is correct and great for people who already know coding, IMO. But David's will help new learners understand the process and structure of np.select. – chasedcribbet Sep 30 '20 at 13:24
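
The `isin` suggestion in the comments above can be sketched as follows. This is a minimal illustration, not the asker's code: the three-row `products` Series and the shortened `core_brands` list are made up for the example. Python's `in` operator needs a single True/False, so comparing a list element against a whole Series triggers the "truth value of a Series is ambiguous" error, while `.isin()` checks each element and returns a boolean Series.

```python
import pandas as pd

# Hypothetical, shortened data for illustration only.
products = pd.Series(['Product 1', 'Product 95', 'Product 10'])
core_brands = ['Product 1', 'Product 2', 'Product 3']

# `in` asks for one True/False answer, which a Series cannot provide.
try:
    products in core_brands
    raised = False
except ValueError:
    raised = True

# .isin() tests every element and returns a boolean Series (a row mask).
mask = products.isin(core_brands)
print(raised)
print(mask.tolist())
```

The resulting mask can be combined with other masks using `&` and `|`, as long as each comparison is wrapped in its own parentheses.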

3 Answers

This is an application of np.select:

pc_cogs['New Price'] = pc_cogs['Unit_Sale_Price'] + np.select([
        pc_cogs['Product'].isin(core_brands) & pc_cogs['Pack_type'].eq(six_pack),
        pc_cogs['Keg Category'].eq('SEASONAL') & pc_cogs['Pack_type'].eq(six_pack),
        pc_cogs['Product'].isin(cali_brands) & pc_cogs['Pack_type'].eq(four_pack),
        pc_cogs['Keg Category'].eq('SEASONAL') & pc_cogs['Pack_type'].isin([four_pack,four_pack2])
    ],
    [2.36,-.46,3.47,-.13],0
)

Output:

    Product      Pack_type                   Keg Category      Unit_Sale_Price    New Price
--  -----------  --------------------------  --------------  -----------------  -----------
 0  Product 1    Case - 4x6 - 12oz - Can     REGULAR                         5         7.36
 1  Product 95   Case - 4x6 - 12oz - Can     SEASONAL                        8         7.54
 2  Product 10   Case - 6x4 - 16oz - Can     WINTER                         10        13.47
 3  Product 44   Case - 6x4 - 12oz - Can     SEASONAL                       22        21.87
 4  Product 100  Case - 6x4 - 16oz - Can     SEASONAL                       45        44.87
 5  Product 69   Cask - Pin                  FALL                           56        56
 6  Product 78   Case - 12x - 22oz - Bottle  SEASONAL                       33        33
 7  Product 3    Case - 6x4 - 12oz - Can     WINTER                         23        23
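
One detail worth spelling out: `np.select` uses the first condition that is true for each row, which is what makes it behave like an if/elif chain. A minimal sketch, assuming a made-up two-row frame with hypothetical `is_core` and `is_seasonal` flag columns (these names are not in the question):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 satisfies both conditions, row 1 only the second.
df = pd.DataFrame({'is_core': [True, False],
                   'is_seasonal': [True, True],
                   'price': [10.0, 10.0]})

# Conditions are checked in list order; the first match decides the adjustment.
adjustment = np.select([df['is_core'], df['is_seasonal']], [2.36, -0.46], 0)
new_price = (df['price'] + adjustment).round(2).tolist()
print(new_price)
```

Row 0 receives only the +2.36 adjustment even though it also satisfies the second condition, so the order of the condition list matters.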
Quang Hoang

I saw that Quang already posted an np.select() solution, but here is the full code. You had a typo in your core_brands and cali_brands lists ('Prod' instead of 'Product'), so I changed the spelling to match the values in the Product column; otherwise those rows would not be picked up:

You can simply create conditions and results and use np.select():

import pandas as pd
import numpy as np
pc_cogs = pd.DataFrame({'Product': ['Product 1','Product 95','Product 10','Product 44','Product 100','Product 69','Product 78','Product 3'], 
              'Pack_type':['Case - 4x6 - 12oz - Can', 'Case - 4x6 - 12oz - Can', 'Case - 6x4 - 16oz - Can','Case - 6x4 - 12oz - Can',
                           'Case - 6x4 - 16oz - Can','Cask - Pin', 'Case - 12x - 22oz - Bottle','Case - 6x4 - 12oz - Can'],
             'Keg Category': ['REGULAR', 'SEASONAL', 'WINTER', 'SEASONAL', 'SEASONAL', 'FALL', 'SEASONAL', 'WINTER'],
             'Unit_Sale_Price': [5, 8, 10, 22, 45, 56, 33, 23],
             'New Price': [0,0,0,0,0,0,0,0]})

four_pack = 'Case - 6x4 - 16oz - Can'
six_pack = 'Case - 4x6 - 12oz - Can'
four_pack2 = 'Case - 6x4 - 12oz - Can'
core_brands = ['Product 1','Product 2', 'Product 3',
               'Product 4', 'Product 5', 'Product 6', 'Product 7','Product 8']
cali_brands = ['Product 9', 'Product 10']
price = pc_cogs['Unit_Sale_Price']

c1 = (pc_cogs['Product'].isin(core_brands)) & (pc_cogs['Pack_type'] == six_pack)
r1 = price + 2.36

c2 = (pc_cogs['Keg Category'] == 'SEASONAL') & (pc_cogs['Pack_type'] == six_pack)
r2 = price - .46

c3 = (pc_cogs['Product'].isin(cali_brands)) & (pc_cogs['Pack_type'] == four_pack)
r3 = price + 3.47

c4 = (pc_cogs['Keg Category'] == 'SEASONAL') & (pc_cogs['Pack_type'].isin([four_pack, four_pack2]))
r4 = price - .13

conditions = [c1,c2,c3,c4]
results = [r1,r2,r3,r4]
pc_cogs['New Price'] = np.select(conditions, results, pc_cogs['Unit_Sale_Price'])
pc_cogs
Out[1]: 
       Product                   Pack_type Keg Category  Unit_Sale_Price  \
0    Product 1     Case - 4x6 - 12oz - Can      REGULAR                5   
1   Product 95     Case - 4x6 - 12oz - Can     SEASONAL                8   
2   Product 10     Case - 6x4 - 16oz - Can       WINTER               10   
3   Product 44     Case - 6x4 - 12oz - Can     SEASONAL               22   
4  Product 100     Case - 6x4 - 16oz - Can     SEASONAL               45   
5   Product 69                  Cask - Pin         FALL               56   
6   Product 78  Case - 12x - 22oz - Bottle     SEASONAL               33   
7    Product 3     Case - 6x4 - 12oz - Can       WINTER               23   

   New Price  
0       7.36  
1       7.54  
2      13.47  
3      21.87  
4      44.87  
5      56.00  
6      33.00  
7      23.00  
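
To see why the spelling of the brand lists matters, here is a small sketch using a made-up three-row `products` Series. `isin` compares whole values for equality, so 'Prod 1' does not match 'Product 1':

```python
import pandas as pd

# Hypothetical, shortened column for illustration only.
products = pd.Series(['Product 1', 'Product 95', 'Product 3'])

# Spelled as in the question: no entry equals any value in the column.
as_posted = products.isin(['Prod 1', 'Prod 3']).tolist()
# Spelled to match the column values.
corrected = products.isin(['Product 1', 'Product 3']).tolist()
print(as_posted)
print(corrected)
```

With the original spelling every mask is False, so no error is raised; those rows would just fall through to the default value.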
David Erickson

You can replace your for loop with the 5 lines below and you will get your result set.

pc_cogs['New Price'] = pc_cogs['Unit_Sale_Price']

pc_cogs.loc[(pc_cogs['Product'].isin(core_brands)) & (pc_cogs['Pack_type'] == six_pack), 'New Price'] += 2.36
pc_cogs.loc[(pc_cogs['Keg Category']=='SEASONAL') & (pc_cogs['Pack_type'] == six_pack), 'New Price'] -=  0.46
pc_cogs.loc[(pc_cogs['Product'].isin(cali_brands)) & (pc_cogs['Pack_type'] == four_pack), 'New Price'] += 3.47
pc_cogs.loc[(pc_cogs['Keg Category']=='SEASONAL') & (pc_cogs['Pack_type'].isin([four_pack,four_pack2])), 'New Price'] -= 0.13
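
One caveat, sketched below with a hypothetical single-row frame: sequential `.loc` updates apply every matching adjustment, whereas an if/elif chain (or `np.select`) applies only the first. In the question's sample data no row matches more than one condition, so the two approaches agree there. The `Sequential` and `First match` column names are made up for the comparison, and the cast to float is an assumption added so that fractional amounts can be added to an integer price column.

```python
import numpy as np
import pandas as pd

# Hypothetical row that is both a core brand and SEASONAL on a six pack.
df = pd.DataFrame({'Product': ['Product 1'],
                   'Pack_type': ['Case - 4x6 - 12oz - Can'],
                   'Keg Category': ['SEASONAL'],
                   'Unit_Sale_Price': [5]})
six_pack = 'Case - 4x6 - 12oz - Can'
is_core = df['Product'].isin(['Product 1']) & (df['Pack_type'] == six_pack)
is_seasonal = (df['Keg Category'] == 'SEASONAL') & (df['Pack_type'] == six_pack)

# Sequential .loc updates: both adjustments land on the same row.
df['Sequential'] = df['Unit_Sale_Price'].astype(float)
df.loc[is_core, 'Sequential'] += 2.36
df.loc[is_seasonal, 'Sequential'] -= 0.46

# np.select: only the first matching adjustment is applied.
df['First match'] = df['Unit_Sale_Price'] + np.select(
    [is_core, is_seasonal], [2.36, -0.46], 0)

sequential = round(df.loc[0, 'Sequential'], 2)
first_match = round(df.loc[0, 'First match'], 2)
print(sequential, first_match)
```

If rows can satisfy several conditions and only one adjustment should apply, the masks would need to be made mutually exclusive before using the `.loc` form.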
Joe Ferndz