
Let's say I have a DataFrame with many labels, and I would like to calculate the mean of each numeric column (Age and Weight) for each label. Is there a simpler way to do this than the code below?

import numpy as np
import pandas as pd

data = [['tom', 10, 20], ['tom', 12, 30], ['nick', 15, 40], ['nick', 12, 50], ['juli', 14, 35], ['juli', 16, 38]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Weight'])

# Collect each distinct name once
list_of_unique_values = list(df.Name.unique())

# For every name, average each numeric column over that name's rows
mean = []
for el in list_of_unique_values:
    innerlist = []
    for col in range(1, len(df.columns)):
        # where() masks rows of other names as NaN, which the mean skips
        innerlist.append(np.mean(df.iloc[:, col].where(df['Name'] == el)))
    mean.append(innerlist)
print(mean)
Julian
  • Do as in the duplicate, but use `.mean()`: `df.groupby('Name').mean()` (there's likely another duplicate for mean) – mozway Sep 28 '22 at 06:54
  • What if I would like to do more advanced calculations? Like Shapiro-Wilk test. The groupby in pandas doesn't have this function – Julian Sep 28 '22 at 07:12
  • You should read the documentation before asking questions. This is doable with `apply`. – mozway Sep 28 '22 at 07:24
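
For reference, the two suggestions in the comments can be sketched roughly as follows, reusing the sample data from the question. The helper `value_range` is a hypothetical stand-in for a "more advanced" per-group calculation and is not from the original post; a function such as `scipy.stats.shapiro` could presumably be passed to `apply` in the same way, though it needs at least three observations per group, which this sample data does not have.

```python
import pandas as pd

data = [['tom', 10, 20], ['tom', 12, 30], ['nick', 15, 40],
        ['nick', 12, 50], ['juli', 14, 35], ['juli', 16, 38]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Weight'])

# Per-name mean of every remaining column, as suggested in the comments.
means = df.groupby('Name').mean()
print(means)


# Hypothetical custom statistic (an assumption for illustration only):
# the spread between the largest and smallest value in a group.
def value_range(series):
    return series.max() - series.min()


# apply() calls the function once per group on the selected column.
age_range = df.groupby('Name')['Age'].apply(value_range)
print(age_range)
```

Selecting a single column before `apply` keeps the function's input a plain Series, which is usually the simplest shape to hand to a statistical test.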

0 Answers