So I have a dataframe with all the boroughs in London with their average house price from the years 1995-2021.
What I am trying to do is compile a new dataframe that takes the most expensive borough for each year.
The column names for the original df are: [London_Borough, ID, Average_price, Year]
At first I figured I can loop by each year and create a temporary df where I assign each borough and its price for that particular year and from there I extract the max value for the average price.
For example:
for i in range(1995, 2022, 1):
temp = df[df['Year'] == i]
yr_max = temp['Average_price'].max()
the problem with this is that while I get the most expensive borough for that year, all I have is the number without the corresponding borough associated with it.
Is there any way I can extract the entire row? or at least just the borough and the price?
This honestly might just be a simple syntax problem but I have scoured over my notes and online resources but cannot find a way to locate a row given a particular value of one column.
The only solution I could think of is to first reset the index of temporary df, then create a list of average prices of that year, loop through the list until it matches the max price then use the index of that list to locate the index of the temporary df but that is not an acceptable solution as it is over complicated and does not abide by Ohm law as the course I am taking is for data science and so efficiency is principle.