(Sorry in advance, I'm new to Stack and this is my first question)
I have two dataframes, one containing house prices for different properties, ppd_df:
price_paid deed_date postcode property_type norm_price
2 36250 2015-11-16 BA1 1JU F
3 48000 2015-11-25 BA2 0HB S
4 60000 2017-08-31 BA1 4NB F
... ... ... ... ... ...
8960 4025000 2015-07-16 BA1 2EU T
And, a lookup dataframe of price averages for each property type in each month, ave_df
:
D_price S_price T_price F_price price_date month_end
0 459471 285234 247582 208652 2015-01-01 2015-01-31
1 450617 279424 242798 205163 2015-02-01 2015-02-28
2 444885 275747 239328 202948 2015-03-01 2015-03-31
3 443513 274575 238553 201615 2015-04-01 2015-04-30
.........................................................
10 489997 303307 262281 218513 2015-11-01 2015-11-30
11 479240 297111 256468 213380 2015-12-01 2015-12-31
I want to normalise the prices in ppd_df
to make them more comparable by dividing each price_paid
by the average houseprice for that certain type of property in the month that it was bought and save this new value as norm_price
.
So norm_price
for a F
property with a deed_date
of 2015-02-16 would be:
norm_price
= price_paid
/ ( 205163)
I think I need to create a for
loop that goes through each row in the ppd_df
but am not sure how to go about it. I have tried using .itertuples
. .merge
and np.searchsorted
like in Pandas: select DF rows based on another DF. The answers here also seem similar but I can't get it to work for my problem.
Thanks!