I have a dataframe holding a variety of properties for a dataset of buildings. Each building is assigned to a dwelling group (Apartment / Semi-detached house / Detached house / Terraced house) and a small area code. Each building also has a year of construction, but there is no unique identifier per building; the only key is the small area, which covers circa 80 buildings.
I want to group these buildings by dwelling group, break each group down by small area, and assign every building the median year of construction for its dwelling group in its small area. My first thought was a for loop. For example, take all apartments in small area 12345 and assign each of them (in a new column) the median year of construction for apartments in that small area.
So far, geo_dwelling is a GeoDataFrame with these columns:
In [20]: geo_dwelling.head(5)
Out[20]:
      cso_small_area  Dublin Postcode  Year of construction  Year of construction range  Dwelling type description  Energy Rating  ...  height_ag  height_bg  floors_ag  floors_bg  category  Dwelling Group
7101       268109005         DUBLIN 1                2009.0                2005 onwards             Mid floor apt.             B3  ...      10.02          0          3          0         R       Apartment
7101       268109005         DUBLIN 1                2009.0                2005 onwards             Mid floor apt.             B3  ...      10.73          0          3          0         R       Apartment
7101       268109005         DUBLIN 1                2009.0                2005 onwards             Mid floor apt.             B3  ...      10.56          0          3          0         R       Apartment
7101       268109005         DUBLIN 1                2009.0                2005 onwards             Mid floor apt.             B3  ...      10.75          0          3          0         R       Apartment
7101       268109005         DUBLIN 1                2009.0                2005 onwards             Mid floor apt.             B3  ...      10.85          0          3          0         R       Apartment
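# Keep only the rows whose dwelling group is Apartment
geo_dwelling = geo_dropped[
    geo_dropped["Dwelling Group"].str.contains("Apartment", na=False)
]
# Median year of construction per small area (one row per area, not per building)
geo_dwelling.groupby(["cso_small_area"])[["Year of construction"]].median()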
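The groupby above gives me one median per small area, but not one value per building. A direction that might avoid the for loop altogether is grouping on both columns and using transform, which returns a result aligned with the original rows. This is only a minimal sketch on a made-up toy frame: the values in toy and the new column name "Median year of construction" are my own placeholders, not part of the real data, and I am assuming the same call behaves the same way on the GeoDataFrame.

```python
import pandas as pd

# Toy stand-in for geo_dwelling; all values here are invented for illustration.
toy = pd.DataFrame({
    "cso_small_area": [12345, 12345, 12345, 12345, 67890, 67890],
    "Dwelling Group": ["Apartment", "Apartment", "Apartment",
                       "Detached house", "Apartment", "Apartment"],
    "Year of construction": [2001.0, 2009.0, 2005.0, 1950.0, 1990.0, 1994.0],
})

# Group by dwelling group AND small area, then broadcast each group's median
# back onto every row of that group (transform keeps the original row order).
# "Median year of construction" is a hypothetical name for the new column.
toy["Median year of construction"] = (
    toy.groupby(["Dwelling Group", "cso_small_area"])["Year of construction"]
    .transform("median")
)

print(toy)
```

If this is right, the apartments in small area 12345 would all get the same median, and the detached house in that area would get its own, without filtering one dwelling group at a time.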
Any help is much appreciated!