How can we reject a window containing an outlier when computing a rolling average in Python?

Question

While computing a rolling average with pandas, how can I reject a window of 10 rows if one or more of those rows contains an outlier?

For clarification:

df = df['speed'].rolling(10).mean() 
outlier_lower_bound = 0
outlier_upper_bound = 15

df.max()

Now, how do I exclude the average of a 10-period window if that window contains an outlier?

The end goal is to get the maximum moving average, ignoring any 10-period window that contains an outlier. Thanks in advance!

Could you explain your idea more? Also, if there is any code involved, it's a good idea to post it. — Anwarvic, May 03 '20 at 12:46
You can apply [this technique of getting the rolling value then filtering](https://stackoverflow.com/questions/46964363/filtering-out-outliers-in-pandas-dataframe-with-rolling-median) but use mean rather than median — DarrylG, May 03 '20 at 12:49
For clarification: @Anwarvic df = df['speed'].rolling(10).mean() Now how do I reject/ not consider the average value of those 10 period window if it consists an outlier? The lower bound is 0 and the upper bound is 15 The end goal is to get the max moving average without accounting/ considering the window of 10 period if it contains an outlier Thanks in advance! — karan vir singh bajaj, May 03 '20 at 13:14

score 0 · Accepted Answer · answered May 03 '20 at 13:40

You can fix your issue in a couple of lines, like so:

_filter = lambda x: float("inf") if x > outlier_upper_bound or x < outlier_lower_bound else x

df["speed"].apply(_filter).rolling(10).mean().dropna()

The idea behind my code can be understood in these steps:

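I create a lambda function called _filter that converts any value outside your bounds into inf.
When the rolling mean is computed over a window that contains inf, the result is NaN.
Finally, I drop all NaN values, which leaves only the averages of windows that contain no outliers. Calling .max() on the result then gives the maximum moving average you are after.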
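
The steps above can also be sketched end to end with NaN as the marker instead of inf. This is a variation on the approach, not the code from the answer. The sample data is made up for illustration, and the names in_range, clean, window and max_moving_average are hypothetical. Only the "speed" column and the two bounds come from the question. It relies on rolling() using a default min_periods equal to the window size, so any window with a missing value yields NaN.

```python
import pandas as pd

# Hypothetical sample data: 10 clean rows, one outlier (100), 10 clean rows.
df = pd.DataFrame({
    "speed": [5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
              100,
              1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
})

outlier_lower_bound = 0
outlier_upper_bound = 15
window = 10

# Mark values inside the bounds (inclusive) and replace the rest with NaN.
in_range = df["speed"].between(outlier_lower_bound, outlier_upper_bound)
clean = df["speed"].where(in_range)

# With the default min_periods (equal to the window size), a window that
# contains a NaN produces NaN, so windows touching an outlier are dropped.
rolling_mean = clean.rolling(window).mean().dropna()

# Maximum moving average over the outlier-free windows only.
max_moving_average = rolling_mean.max()
print(rolling_mean)
print(max_moving_average)
```

In this sample only two windows survive: the one ending at row 9 and the one ending at row 20. Note that dropna() also removes the first nine rows, where the window is not yet full.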
