How to redistribute outliers over the previous time period?

Asked Dec 19 '21 at 08:03

Active Dec 19 '21 at 08:28

Viewed 34 times

Imagine a dataframe that looks like this:

Normally we would apply an algorithm from "Detect and exclude outliers in a pandas DataFrame" to remove the 50 entirely. However, my particular dataset instead requires me to distribute the value of the 50 over the previous 7 days:

How can I make this work in pandas? I can detect the outliers pretty easily, but I'm not sure how to spread their values over the previous days. Note that a simple moving average doesn't work well for this type of data: the average would still jump when the 50 enters the window. What I need is to smooth the 50 out over the previous days so that no jump is visible.
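To illustrate the moving-average point, here is a minimal sketch. The series is made up for illustration (it is not the dataframe from the question), and the 3-day window is an arbitrary choice.

```python
import pandas as pd

# Hypothetical data: a steadily decreasing daily series with one lump value (50).
s = pd.Series([15, 14, 13, 12, 11, 10, 9, 50, 7, 6], dtype=float)

# Trailing 3-day simple moving average (the first two entries are NaN).
sma = s.rolling(window=3).mean()

# Largest absolute day-over-day change in the smoothed series.
max_jump = sma.diff().abs().max()

print(sma)
print(max_jump)
```

The smoothed series moves in steps of 1 everywhere except on the day the 50 enters the window, so the jump is only reduced, not removed.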

edited Dec 19 '21 at 08:28

Mark Rotteveel


asked Dec 19 '21 at 08:03

JonathanReez


Why does your input dataframe have a length of 11 and your output 10? Where is 1? Can you update your dataframe with a more complete input and output example, please? – Corralien Dec 19 '21 at 08:12
@Corralien you're right, updated – JonathanReez Dec 19 '21 at 08:17
How do you choose 15, 14, 13, 12, 11, 10, 9 and 8 (8 values, not 7)? – Corralien Dec 19 '21 at 08:28
@Corralien once I see 50, I want to add 50/7 to all previous days and set the current value to what it should've been if we assume that the values keep increasing/decreasing at the same rate. – JonathanReez Dec 19 '21 at 08:54
@Corralien basically imagine we have two data streams: one is measuring things on a daily basis and one dumps new datapoints at random periods of time as a big chunk. We want to acknowledge this new data but also avoid a jump in the graph. – JonathanReez Dec 19 '21 at 08:56
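The rule described in these comments could be sketched roughly as follows. This is only one reading of it, not a confirmed solution: `redistribute_outliers` is a hypothetical helper, the outlier mask is assumed to come from whatever detection step is already in place, the input series is invented, and "keep increasing/decreasing at the same rate" is interpreted as a linear extrapolation from the two preceding (already adjusted) values. It also assumes outliers are isolated, so the two preceding days are not outliers themselves.

```python
import numpy as np
import pandas as pd


def redistribute_outliers(s: pd.Series, is_outlier: pd.Series, window: int = 7) -> pd.Series:
    """Spread each flagged value over the preceding `window` rows (hypothetical helper)."""
    values = s.to_numpy(dtype=float)          # original observations, never modified
    out = values.copy()                       # adjusted series being built
    flags = is_outlier.to_numpy(dtype=bool)

    for i in np.flatnonzero(flags):
        if i < 2:
            continue  # assumption: need two earlier points to extrapolate a trend
        start = max(0, i - window)
        # Add an equal share of the lump to each preceding day
        # (value / 7 when a full 7-day history is available).
        out[start:i] += values[i] / (i - start)
        # Replace the lump with a linear continuation of the adjusted trend.
        out[i] = 2 * out[i - 1] - out[i - 2]

    return pd.Series(out, index=s.index, name=s.name)


# Hypothetical input: a decreasing series ending in a lump of 50.
s = pd.Series([8, 7, 6, 5, 4, 3, 2, 50], dtype=float)
mask = s > 40  # stand-in for a real outlier detector
result = redistribute_outliers(s, mask)
print(result.round(2))
```

Note that with this reading the total of the series is not preserved, because the lump is added to earlier days and the outlier day also gets an extrapolated value. If the total matters, the extrapolated value would have to be subtracted from the lump before spreading it.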


0 Answers