I have a time series dataset (time, single value) in which some sections contain "bands" of bad data that are clearly visible in the attached chart. I have tried every contextual heuristic I could think of to identify the times when these bands occur and exclude them that way, but there just isn't a contextual clue that lets me filter them out by time.

This is different from any "outlier removal" task I've done. What would you do?

picture attached

  • Please visit [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – UseR10085 Jul 31 '20 at 06:15
  • Hi, try a box-plot approach; it will be helpful for removing outliers. Please share sample data with `dput()`. – Tushar Lad Jul 31 '20 at 06:19
  • subset the df to unique(time), as it appears you have hours with full range of values [-1.0:0,05+]? – Chris Jul 31 '20 at 21:12
  • @Chris Thanks, they're actually not "full range", just really high-density erroneous data points with high SD for short time periods, i.e. "bands" – user2386454 Aug 03 '20 at 21:29
  • Thanks for other comments; unfortunately I can't share any of the data. – user2386454 Aug 03 '20 at 21:29
  • And from your understanding of the data, these aren't `blended distributions` before they got to you? But you suggest for yourself another way to filter based on the SD, something to think about anyway. For the purposes of `sharing data`, generating analogous data that could be shared is an interesting puzzle that might further explicate your problem. – Chris Aug 04 '20 at 16:51
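The two ideas in the last comment (filtering on local SD, and generating analogous data that can be shared) could be combined roughly as in the sketch below. It is written in plain Python and everything in it is an assumption rather than the asker's actual data or method: the function names `make_series`, `rolling_sd`, and `flag_bands` are hypothetical, and the band positions, the two SD levels, the window size, and the threshold multiplier `k` are illustrative values that would need tuning against the real series.

```python
import random
import statistics


def make_series(n=2000, bands=((500, 560), (1400, 1450)),
                base_sd=0.01, band_sd=0.3, seed=0):
    """Synthetic stand-in for the unshared data: a quiet signal with
    short stretches ("bands") of much noisier values."""
    rng = random.Random(seed)
    values = []
    for i in range(n):
        in_band = any(start <= i < end for start, end in bands)
        values.append(rng.gauss(0.0, band_sd if in_band else base_sd))
    return values


def rolling_sd(values, window=15):
    """Centered rolling standard deviation; the window shrinks at the edges."""
    half = window // 2
    return [
        statistics.pstdev(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def flag_bands(values, window=15, k=5.0):
    """Flag points whose local SD exceeds k times the median local SD."""
    sds = rolling_sd(values, window)
    threshold = k * statistics.median(sds)
    return [sd > threshold for sd in sds]


values = make_series()
flags = flag_bands(values)
clean = [v for v, bad in zip(values, flags) if not bad]
print(len(values), len(clean))
```

Using the median of the rolling SD as the baseline keeps the threshold from being inflated by the bands themselves, as long as the bands cover well under half of the series. Because the window is centered, a few points on either side of each band get flagged as well.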

0 Answers