I have a DataFrame that I am trying to filter using multiple conditions:

remove_outliers[remove_outliers['outlier_residual'] > (Q3 + 1.5 * IQR) and remove_outliers['season'] =='Autumn']

When I try this, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-304-141eedb8a594> in <module>
----> 1 remove_outliers[remove_outliers['outlier_residual'] > (Q3 + 1.5 * IQR) and remove_outliers['season'] =='Autumn']

~\AppData\Roaming\Python\Python37\site-packages\pandas\core\generic.py in __nonzero__(self)
   1328     def __nonzero__(self):
   1329         raise ValueError(
-> 1330             f"The truth value of a {type(self).__name__} is ambiguous. "
   1331             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1332         )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is the correct way to do this? I'd appreciate any help or advice.

Pythonuser
  • Does this answer your question? [Filtering multiple conditions from a Dataframe in Python](https://stackoverflow.com/questions/40510820/filtering-multiple-conditions-from-a-dataframe-in-python) – Ajay A Oct 18 '20 at 12:25

2 Answers

1
remove_outliers.loc[(remove_outliers['outlier_residual'] > (Q3 + 1.5 * IQR)) & (remove_outliers['season'] =='Autumn')]

And there is no need to nest .loc inside .loc.
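To illustrate why `and` fails while `&` works, here is a minimal sketch. The sample rows and the `Q3` and `IQR` values are made up for the demonstration, since the question does not show them; only the column names come from the question. Python's `and` asks each Series for a single truth value, which is what raises the error, whereas `&` combines the two boolean Series element by element. Because `&` binds more tightly than `>` and `==`, each comparison needs its own parentheses when written inline.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
remove_outliers = pd.DataFrame({
    "outlier_residual": [1.0, 20.0, 30.0, 5.0, 18.0],
    "season": ["Autumn", "Autumn", "Winter", "Spring", "Autumn"],
})

# Placeholder values; the question does not show how Q3 and IQR were computed.
Q3 = 10.0
IQR = 4.0

# Each comparison yields a boolean Series with one entry per row.
is_high = remove_outliers["outlier_residual"] > (Q3 + 1.5 * IQR)
is_autumn = remove_outliers["season"] == "Autumn"

# `and` needs a single True/False from each Series, so pandas raises ValueError.
error_message = None
try:
    remove_outliers[is_high and is_autumn]
except ValueError as exc:
    error_message = str(exc)

# `&` combines the two masks element-wise, which is what row filtering needs.
filtered = remove_outliers[is_high & is_autumn]

print(error_message)
print(filtered)
```

Naming the two masks first, as above, also sidesteps the parenthesis problem, because each comparison is evaluated on its own line.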

0

I guess you are missing a pair of brackets. Let me know whether it works now:

remove_outliers.loc[(remove_outliers.loc[:,'outlier_residual'] > (Q3 + 1.5 * IQR)) & (remove_outliers.loc[:,'season'] =='Autumn'),:]

P.S. I have used .loc as a matter of good practice.

  • You took the good practice one step too far :) Inside the outer `.loc[...]` you shouldn't be using `.loc` just to pull a column. So instead of the inner `remove_outliers.loc[:,'outlier_residual']`, use `remove_outliers['outlier_residual']`. – Grzegorz Skibinski Oct 18 '20 at 13:32
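
As a rough check of the point made in this comment, the sketch below builds the same mask twice, once with plain column selection and once with `.loc[:, col]`, and compares the results. The DataFrame name `df`, its rows, and the `threshold` value are hypothetical and stand in for `remove_outliers` and `Q3 + 1.5 * IQR` from the question.

```python
import pandas as pd

# Hypothetical data standing in for the question's remove_outliers frame.
df = pd.DataFrame({
    "outlier_residual": [1.0, 20.0, 30.0, 5.0, 18.0],
    "season": ["Autumn", "Autumn", "Winter", "Spring", "Autumn"],
})

# Assumed cutoff; in the question this would be Q3 + 1.5 * IQR.
threshold = 16.0

# Mask built with plain column selection.
mask_plain = (df["outlier_residual"] > threshold) & (df["season"] == "Autumn")

# Mask built with .loc used to pull each column.
mask_loc = (df.loc[:, "outlier_residual"] > threshold) & (df.loc[:, "season"] == "Autumn")

# Both spellings select the same column, so the masks match.
same_mask = mask_plain.equals(mask_loc)

# A single outer .loc is enough to apply the mask to the rows.
filtered = df.loc[mask_plain]

print(same_mask)
print(filtered)
```

Both forms give the same rows here; the shorter `df[col]` spelling simply keeps the condition easier to read.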