I have some data in Python as (unixtime, value) pairs:

[(1301672429, 274), (1301672430, 302), (1301672431, 288)...]

Time constantly steps by one second. How might I reduce this data so the timestamp is every second, but the value is the average of the surrounding 10 values?

Fancier rolling averages would be good too, but this data is graphed, so the goal is mostly to smooth out the graph.

This is a follow-up to "TSQL Rolling Average of Time Groupings", after coming to the conclusion that trying to do this in SQL is a route of pain.

Kyle Brandt

2 Answers

Using http://www.scipy.org/Cookbook/SignalSmooth:

import numpy

def smooth(x, window_len=11, window='hanning'):
    """Smooth a 1-D array by convolving it with a normalized window."""
    if x.ndim != 1:
        raise ValueError("smooth only accepts 1 dimension arrays.")
    if x.size < window_len:
        raise ValueError("Input vector needs to be bigger than window size.")
    if window_len < 3:
        return x
    if window not in ['flat', 'hanning', 'hamming', 'bartlett', 'blackman']:
        raise ValueError("Window is one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'")
    # Pad both ends with point-reflected copies of the signal to reduce edge effects.
    s = numpy.r_[2*x[0]-x[window_len-1::-1], x, 2*x[-1]-x[-1:-window_len:-1]]
    if window == 'flat':  # moving average
        w = numpy.ones(window_len, 'd')
    else:
        # Look up the window function by name, e.g. numpy.hanning(window_len).
        w = getattr(numpy, window)(window_len)
    y = numpy.convolve(w/w.sum(), s, mode='same')
    # Trim the padding so the result has the same length as x.
    return y[window_len:-window_len+1]

I get what seem to be good results with the following (not that I understand the math):

if form_results['smooth']:
    a = numpy.array([x[1] for x in results])
    smoothed = smooth(a, window_len=21)
    # list() is needed on Python 3, where zip returns an iterator
    results = list(zip([x[0] for x in results], smoothed))
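
With window='flat', smooth is a plain moving average. The same idea can be sketched without NumPy, which may help if the math above feels opaque. This is only an illustration under some assumptions: it targets Python 3, the names rolling_mean and radius are made up here, and the window is simply shortened at the two ends instead of using the reflection padding from smooth, so values near the edges will differ. A radius of 5 averages each point with the 10 surrounding it.

```python
from itertools import accumulate

def rolling_mean(pairs, radius=5):
    """Centered moving average over (timestamp, value) pairs.

    Each output value is the mean of the input values within `radius`
    samples on either side; the window is truncated at the two ends.
    """
    values = [v for _, v in pairs]
    # prefix[i] holds sum(values[:i]), so every window sum is one subtraction.
    prefix = [0] + list(accumulate(values))
    out = []
    for i, (t, _) in enumerate(pairs):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        out.append((t, (prefix[hi] - prefix[lo]) / (hi - lo)))
    return out

# The three sample points from the question, with a tiny window for illustration.
data = [(1301672429, 274), (1301672430, 302), (1301672431, 288)]
smoothed = rolling_mean(data, radius=1)
```

The timestamps pass through unchanged, so the result can be graphed exactly like the original list.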
Kyle Brandt
  • That seems reasonable. If you want the mean, then your window should be 'flat'. The other windowing protocols weight the data points in the window differently. – JoshAdel Apr 01 '11 at 17:00
  • If using Python 3, make sure to change the lines with ValueErrors to: `raise ValueError("smooth only accepts 1 dimension arrays.")` – Femkemilene Jan 21 '19 at 10:06

I found the Savitzky-Golay filter. It takes a window, fits a polynomial curve to the points inside it, and then shifts the window along the data. Fortunately, it's implemented in SciPy.

https://en.wikipedia.org/wiki/File:Lissage_sg3_anim.gif

Use this code:

from scipy.signal import savgol_filter
result = savgol_filter(value, 13, 5) # window size 13, polynomial order 5
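
To connect this with the (unixtime, value) format from the question, here is a rough sketch of applying the filter and re-attaching the timestamps. The sample data is invented for illustration (a slow sine wave with an alternating jitter), and the variable names are assumptions, not part of the original code. Note that polyorder has to be smaller than window_length.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical sample in the question's (unixtime, value) format:
# one point per second, a slow sine wave plus an alternating jitter.
start = 1301672429
results = [(start + i, 280 + 20 * np.sin(i / 15.0) + 5 * (-1) ** i)
           for i in range(120)]

timestamps = [t for t, _ in results]
values = np.array([v for _, v in results], dtype=float)

# Same settings as above: a 13-sample window and a degree-5 polynomial.
smoothed = savgol_filter(values, window_length=13, polyorder=5)

# The output has one value per input sample, so the timestamps line up.
smoothed_results = list(zip(timestamps, smoothed.tolist()))
```

A lower polyorder or a wider window smooths more aggressively; which combination looks best on a graph is a matter of trial and error.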
moraei