I have several tables that look like this:
ID YY ZZ
2 97 826
2 78 489
4 47 751
4 110 322
6 67 554
6 88 714
code:
raw = {'ID': [2, 2, 4, 4, 6, 6,],
'YY': [97,78,47,110,67,88],
'ZZ':[826,489,751,322,554,714]}
df = pd.DataFrame(raw)
For each of these dfs, I have to perform a number of operations.
- First, group by id,
- extract the length of the column zz and average of the column zz,
- put results in new df
New df that looks like this
Cities length mean
Paris 0 0
Madrid 0 0
Berlin 0 0
Warsaw 0 0
London 0 0
code:
raw2 = {'Cities': ['Paris', 'Madrid', 'Berlin', 'Warsaw', 'London'],
'length': 0,
'mean': 0}
df2 = pd.DataFrame(raw2)
I pulled out the average and the size of individual groups
df_grouped = df.groupby('ID').ZZ.size()
df_grouped2 = df.groupby('ID').ZZ.mean()
the problem occurs when trying to transfer results to a new table because it does not contain all the cities and the results must be matched according to the appropriate key.
I tried to use a dictionary:
dic_cities = {"Paris":df_grouped.loc[2],
"Madrid":df_grouped.loc[4],
"Warsaw":df_grouped.loc[6],
"Berlin":df_grouped.loc[8],
"London":df_grouped.loc[10]}
Unfortunately, I'm receiving KeyError: 8
I have 19 df's from which I have to extract this data and the final tables have to look like this:
Cities length mean
Paris 2 657.5
Madrid 2 536.5
Berlin 0 0.0
Warsaw 2 634.0
London 0 0.0
Does anyone know how to deal with it using groupby and the dictionary or knows a better way to do it?