0

pets.xlsx

Im trying to group rows of data in a .xlsx file(see image) by Animal and then get a list of Pet Names that meet one condition: Age(yrs) is greater than the mean Age(yrs) of Animal group. Current Code: import pandas as pd df = pd.read_excel('pets.xlsx')

above_avg = []
animal = df.groupby('Animal')
avg_age = animal.mean()

for index, row in df.iterrows():
    if row['Age(yrs)'] > avg:
        above_avg.append(row['Pet Name'])
print(above_avg)

Output (error):

if row['Age(yrs)'] > avg:
f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

How can I correct this error in my code so that I get the desired output?:

['Dusty', 'Clifford', 'Shelly']
  • your question is answered below posts; https://stackoverflow.com/questions/21415661/logical-operators-for-boolean-indexing-in-pandas https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o – Lanre Aug 04 '22 at 06:54

1 Answers1

2

Your method is correct but you 're not using the proper column for calculating mean. This should fix the problem

above_avg = []
animal = df.groupby('Animal')
avg_age = animal['Age(yrs)'].mean()

for index, row in df.iterrows():
    if row['Age(yrs)'] > avg_age[row['Animal']]:
        above_avg.append(row['Pet Name'])
print(above_avg)
Rafiq Nazir
  • 117
  • 4