applying different functions over a certain column in pandas

Question

I would like to use the apply and lambda methods in python in order to change the pricing in a column. The column name is Price. So, if the price is less than 20 I would like to pass and keep it the same. If 30>price>20 I would like to add 1. If the price is 40>price>30 then I would like to add 1.50. And so on. I am trying to figure out a way to apply these functions over a column and then send it back to an excel format in order to updating the pricing. I am confused as to how to do so. I have tried putting this operation in a function using an if clause but it is not spitting out the results that I would need to (k is the name of the dataframe):

def addition():
    if k[k['Price']] < 20]:
        pass
    if k[(k['Price']] > 20) & (k['Price] < 30)]:
       return k + 1
    if k[(k['Price']] > 30.01) & (k['Price] < 40)]:
       return k + 1.50

and so on. However, at the end, when I attempt to send out (what I thought was the newly updated k[k['Price] format in xlsx it doesn't even show up. I have tried to make the xlsx variable global as well but still no luck. I think it is simpler to use the lambda function, but I am having trouble deciding on how to separate and update the prices in that column based off the conditions. Much help would be appreciated.

This is the dataframe that I am trying to perform the different functions on:

   0        23.198824
   1        21.080706
   2        15.810118
   3        21.787059
   4        18.821882
                  ...    
   33525    20.347059
   33526    25.665882
   33527    33.077647
   33528    21.803529
   33529    23.043529
   Name: Price, Length: 33530, dtype: float64

Can you provide a snippet of your dataframe alongside with this function you want to use in order to accomplish your programming goal? — Calcium Owl, Oct 07 '22 at 14:17
Do you want also for condition 50>price>40 add 2.0, for 60>price>50 add 2.5 etc? — Calcium Owl, Oct 07 '22 at 14:36
Tell me the difference if you don't think this is a dup. I will reopen this later. — Ynjxsjmh, Oct 07 '22 at 14:44
@Ynjxsjmh I received an error message when I tried to use it as a function KeyError: "None of [Float64Index([23.19882352941176, 21.08070588235294, 15.81011764705882,\n 21.78705882352941, 18.82188235294118, 22.49294117647059,\n — jumpman23, Oct 07 '22 at 14:52
@CalciumOwl Your method worked without error and yes I would like to keep adding. Do I just continue to put it at the end? Thank you much appreciation — jumpman23, Oct 07 '22 at 14:56
@jumpman23 Use what as a function? You can check the `apply` answer in the dup link. But best is to use `np.select`. — Ynjxsjmh, Oct 07 '22 at 15:05
@CalciumOwl I have tried to do the 50>price)40 add 2.0 and so forth but it has gave an error message — jumpman23, Oct 07 '22 at 15:43
If you want to achieve such a step-wise approach to all price values, than you can write: `df = df['Price'].apply(lambda x: x + 0.5 * int(str(x).split('.')[0][:-1]) if (len(str(x).split('.')[0]) > 1) & (int(str(x).split('.')[0][:-1]) != 1) else x)`. — Calcium Owl, Oct 07 '22 at 18:14

score 1 · Answer 1 · answered Oct 07 '22 at 14:30

1

If k is the dataframe,then k+1 won't work, it will cause an error. You can write a function to change the price and apply it to the column -

def update_price(price):
     if 20<price<30:
         price += 1
     elif 30<price<40:
         price += 1.5
     return price

df['Updated_Price'] = df['Price'].apply(lambda x: update_price(x))

In [39]: df
Out[39]: 
  Name  Price
0    a     15
1    b     23
2    c     37

In [43]: df
Out[43]: 
  Name  Price  Updated_Price
0    a     15           15.0
1    b     23           24.0
2    c     37           38.5

answered Oct 07 '22 at 14:30

purpin

136
7

1

You can avoid using `lambda` if you create stand-alone function: `df['Price'].apply(update_price)` – Boris Silantev Oct 07 '22 at 14:46
@purpin I receive a Key error method when I use your method that is too long. KeyError: "None of [Float64Index([23.19882352941176, 21.08070588235294, 15.81011764705882,\n 21.78705882352941, 18.82188235294118, 22.49294117647059,\n – jumpman23 Oct 07 '22 at 14:51
@jumpman23 Check whether you are applying the function to a DataFrame object. If "df" is a DataFrame with a column named "Price" then this code will work. – purpin Oct 07 '22 at 14:54
@purpin I also wanted to say thank you for the response and help. Yes, it is a DataFrame object because the original file was in XLSX and I imported it into jupyter notebook via the pd.read_excel() method – jumpman23 Oct 07 '22 at 15:07
@purpin I even tried dfd=pd.DataFrame(k['Price']) to make sure it was in a DataFrame and then I tried to apply the function to it but it is giving a long error message that was shown above – jumpman23 Oct 07 '22 at 16:12
1

@BorisSilantev Thanks it finally worked! – jumpman23 Oct 07 '22 at 16:54
@purpin Thank you! I finally got it to work! – jumpman23 Oct 07 '22 at 16:55

Calcium Owl · Answer 2 · 2022-10-07T14:35:56.183

0

You can use apply method and lambda for this purpose alongside with nested if..elses.

import pandas as pd

df = pd.DataFrame({
    'Price': [10.0, 23.0, 50.0, 32.0, 12.0, 50.0]
})

df = df['Price'].apply(lambda x: x if x < 20.0 else (x + 1.0 if 30.0 > x > 20.0 else x + 1.5))

print(df)

Output:

0    10.0
1    24.0
2    51.5
3    33.5
4    12.0
5    51.5
Name: Price, dtype: float64

edited Oct 07 '22 at 14:35

answered Oct 07 '22 at 14:29

Calcium Owl

67
7

applying different functions over a certain column in pandas

2 Answers2