
I am working on a data visualization problem in Python: I am plotting daily active users for about 15k pages against date. On some days, specific pages show peaks that are artificial, and these distort the cumulative results. I want to show the overall trend, either by suppressing the peaks or by adjusting the data in some other way.

I am plotting with pandas in a Jupyter notebook.

Question: Is there an efficient way to solve this problem?

A sample graph is attached. The red line is the original graph, and the blue line is my attempt to suppress the peaks. The x-axis shows the date and the y-axis shows the sum of daily traffic.

Chris
  • Hi @Developer_, welcome to Stack Overflow! Without further information it's nigh on impossible to give a very good answer – really you should include a [minimal working example](https://stackoverflow.com/help/minimal-reproducible-example). That said, some form of moving average would probably be the most common way to deal with this. For example, `df.rolling(window=5).mean()` would compute the rolling mean over a five-day period. This will have the effect of smoothing out the curve – larger windows mean more smoothing and vice versa. – Chris Sep 07 '19 at 06:14
  • Also, check out this topic: https://stackoverflow.com/questions/27097015/plot-smooth-curves-of-pandas-series-data. It may help – Igor Belkov Sep 09 '19 at 09:56
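
The rolling-mean suggestion in the comments can be sketched roughly as follows. The question includes no data, so the series here is synthetic and every name (`traffic`, `rolling_mean`, `rolling_median`, `smoothed`) is hypothetical. The rolling median is not from the comments; it is shown only as one possible alternative, on the assumption that the artificial peaks are isolated single-day spikes.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the summed daily traffic (hypothetical data):
# 60 days around 1000 users/day, plus two injected artificial peaks.
dates = pd.date_range("2019-07-01", periods=60, freq="D")
rng = np.random.default_rng(0)
traffic = pd.Series(1000 + rng.normal(0, 20, size=60), index=dates, name="daily_users")
traffic.iloc[[15, 40]] += 5000

# Rolling mean over a 5-day window, as suggested in the comment.
# The first 4 values are NaN because the window is not yet full.
rolling_mean = traffic.rolling(window=5).mean()

# Alternative (an assumption, not from the comments): a centered rolling
# median, which is much less affected by a single extreme day in the window.
rolling_median = traffic.rolling(window=5, center=True, min_periods=1).median()

# Collect the curves side by side for comparison or plotting.
smoothed = pd.DataFrame(
    {"original": traffic, "rolling_mean": rolling_mean, "rolling_median": rolling_median}
)
```

In a notebook, `smoothed.plot()` would draw the three curves together. A rolling mean spreads each spike across the window rather than removing it, so a larger window or a median may be closer to what the question asks for, depending on how long the artificial peaks last.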

0 Answers