Set-up
I am scraping housing ads with Scrapy and then analysing the data with pandas.
I use pandas to compute the means and medians of several housing characteristics.
The dataframe df looks like this:
district | rent | rooms | …
----------------------------
North | 200 | 3 | …
South | 300 | 1 | …
South | 300 | 1 | …
⋮ ⋮ ⋮ ⋮
Problem
I would like to compute the average rent for an n-room apartment per district.
I found an answer here which brings me close, e.g.
df.loc[df['rooms'] == 1, 'rent'].mean()
but this computes the average rent of one-room apartments across the whole city.
To do it per district, I'd like to do something like,
for d in district_set:
    # combine the two masks element-wise with & (plain `and` raises a ValueError on a Series)
    df.loc[(df['rooms'] == 1) & (df['district'] == d), 'rent'].mean()
where district_set contains all possible districts.
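One possible way to get the per-district figure without an explicit loop is to filter on the room count once and let groupby do the split by district. This is only a sketch: the sample rows below are made up for illustration and are not the scraped data, and the name avg_rent_1r is hypothetical.

```python
import pandas as pd

# Made-up sample data; the real df comes from the scraped ads.
df = pd.DataFrame({
    "district": ["North", "South", "South", "North"],
    "rent": [200, 300, 320, 400],
    "rooms": [1, 1, 1, 2],
})

# Keep only one-room apartments, then average the rent within each district.
avg_rent_1r = df.loc[df["rooms"] == 1].groupby("district")["rent"].mean()
print(avg_rent_1r)
```

The result is a Series indexed by district, so no district_set is needed; districts without any one-room ads simply do not appear.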
Any suggestions?
I'd like to obtain the following table,
district | avg rent 1R | avg rent 2R | …
----------------------------------------
North | 200 | 400 | …
South | 300 | 500 | …
⋮ ⋮ ⋮
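A table of this shape could probably be produced with pandas' pivot_table, using the district as the row index and the room count as the columns. The following is a minimal sketch under assumptions: the sample rows are invented so that the means match the table above, and the column labels are built by hand to mimic the "avg rent nR" headers.

```python
import pandas as pd

# Made-up sample data chosen to reproduce the desired table; not the scraped data.
df = pd.DataFrame({
    "district": ["North", "North", "South", "South", "South"],
    "rent": [200, 400, 300, 300, 500],
    "rooms": [1, 2, 1, 1, 2],
})

# One row per district, one column per room count, each cell the mean rent.
table = df.pivot_table(index="district", columns="rooms", values="rent", aggfunc="mean")

# Rename the room-count columns to the "avg rent nR" style used in the question.
table.columns = [f"avg rent {n}R" for n in table.columns]
print(table)
```

Combinations of district and room count that have no ads end up as NaN in the resulting table.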