Using GroupBy to group and filter data from excel

Question

Im trying to group rows of data in a .xlsx file(see image) by Animal and then get a list of Pet Names that meet one condition: Age(yrs) is greater than the mean Age(yrs) of Animal group. Current Code: import pandas as pd df = pd.read_excel('pets.xlsx')

above_avg = []
animal = df.groupby('Animal')
avg_age = animal.mean()

for index, row in df.iterrows():
    if row['Age(yrs)'] > avg:
        above_avg.append(row['Pet Name'])
print(above_avg)

Output (error):

if row['Age(yrs)'] > avg:
f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

How can I correct this error in my code so that I get the desired output?:

['Dusty', 'Clifford', 'Shelly']

your question is answered below posts; https://stackoverflow.com/questions/21415661/logical-operators-for-boolean-indexing-in-pandas https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o — Lanre, Aug 04 '22 at 06:54

Rafiq Nazir · Accepted Answer · 2022-08-04T06:56:40.220

2

Your method is correct but you 're not using the proper column for calculating mean. This should fix the problem

above_avg = []
animal = df.groupby('Animal')
avg_age = animal['Age(yrs)'].mean()

for index, row in df.iterrows():
    if row['Age(yrs)'] > avg_age[row['Animal']]:
        above_avg.append(row['Pet Name'])
print(above_avg)

edited Aug 04 '22 at 06:56

answered Aug 04 '22 at 06:51

Rafiq Nazir

117
4

Using GroupBy to group and filter data from excel

1 Answers1