I have a dataframe which has the following columns:

         Date    Zip  Price
0  2019-01-01  90102  58.02
1  2019-01-01  90102  81.55
2  2019-01-01  90102  11.97
3  2019-01-01  90102  93.23
4  2019-01-01  90103  13.68

I want to add a fourth column holding the ratio of each row's price to the maximum price for that zip code on that date.

To get the maximum prices, I built a second dataframe called df_max_price:

df_max_price = df.groupby(['Date','Zip'], as_index=False)['Price'].max()
         Date    Zip  Price
0  2019-01-01  90102  93.23
1  2019-01-01  90103  13.68

Now I want a new column in df that is the ratio of Price to the maximum price for the same date and zip code:

         Date    Zip  Price  Ratio
0  2019-01-01  90102  58.02  0.622
1  2019-01-01  90102  81.55  0.875
2  2019-01-01  90102  11.97  0.128
3  2019-01-01  90102  93.23  1.000
4  2019-01-01  90103  13.68  1.000

Each ratio is the row's price divided by its group maximum, e.g. 58.02/93.23 for the first row, and so on.

Can someone show me how this can be done using a lambda function?

It_is_Chris
Akhil Gupta

1 Answer

Use groupby and transform

df['ratio'] = df['Price'] / df.groupby(['Date','Zip'])['Price'].transform('max')

         Date    Zip  Price     ratio
0  2019-01-01  90102  58.02  0.622332
1  2019-01-01  90102  81.55  0.874718
2  2019-01-01  90102  11.97  0.128392
3  2019-01-01  90102  93.23  1.000000
4  2019-01-01  90103  13.68  1.000000
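
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample frame from the question and computes the ratio three ways: with `transform('max')`, with a lambda (as the question asked), and by merging the question's `df_max_price` back in. The column names `ratio_lambda`, `ratio_merge` and `Price_max` are made up for illustration, and the merge variant assumes `df` has a default 0..n-1 index.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["2019-01-01"] * 5,
    "Zip": [90102, 90102, 90102, 90102, 90103],
    "Price": [58.02, 81.55, 11.97, 93.23, 13.68],
})

# 1) transform('max') returns the group maximum aligned to df's own index
group_max = df.groupby(["Date", "Zip"])["Price"].transform("max")
df["ratio"] = df["Price"] / group_max

# 2) Lambda variant: each group's prices are divided by that group's max
df["ratio_lambda"] = (
    df.groupby(["Date", "Zip"])["Price"].transform(lambda s: s / s.max())
)

# 3) Merge variant, reusing df_max_price from the question.
#    A left merge keeps df's row order; the overlapping Price column
#    from df_max_price gets the (illustrative) suffix "_max".
df_max_price = df.groupby(["Date", "Zip"], as_index=False)["Price"].max()
merged = df.merge(df_max_price, on=["Date", "Zip"], how="left",
                  suffixes=("", "_max"))
df["ratio_merge"] = merged["Price"] / merged["Price_max"]

print(df)
```

All three columns should hold the same values. The string `'max'` version is usually the one to prefer, since the lambda is called once per group in Python and the merge builds an intermediate frame.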
It_is_Chris
  • Shouldn't you pass np.max instead of the string? Isn't that 'max' slowing things down? – adir abargil Jan 06 '21 at 19:59
  • Hi, @It_is_Chris, this works but gives a warning ``` /opt/conda/condapub/condamt_2020q3/u18/envs/py36ml_2020q3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead ``` – Akhil Gupta Jan 06 '21 at 20:03
  • @AkhilGupta that is not an issue with the answer but with how you created the variable `df`. At some point you created a slice of a larger pandas data frame. You should find that line and add `.copy()` to the end. [Setting with copy warning](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – It_is_Chris Jan 06 '21 at 20:05
  • Thank you so much! :) – Akhil Gupta Jan 06 '21 at 20:07
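
The `.copy()` fix from the comments can be sketched roughly as follows. The frame `big` and the `Price > 10` filter are hypothetical stand-ins for whatever larger frame and selection `df` was originally sliced from. The point is only that `df` becomes an independent frame before a column is assigned to it.

```python
import pandas as pd

# Hypothetical larger frame that df was sliced from
big = pd.DataFrame({
    "Date": ["2019-01-01"] * 6,
    "Zip": [90102, 90102, 90102, 90102, 90103, 90103],
    "Price": [58.02, 81.55, 11.97, 93.23, 13.68, 5.00],
})

# Without .copy(), assigning a new column to this filtered subset is the
# kind of pattern that can raise SettingWithCopyWarning
df = big[big["Price"] > 10].copy()

# df is now its own frame, so adding a column does not touch big
df["ratio"] = df["Price"] / df.groupby(["Date", "Zip"])["Price"].transform("max")

print(df)
```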