This question has a lot of useful answers on how to get a moving average. I have tried two of the methods, numpy convolution and numpy cumsum. Both worked fine on an example dataset, but produced a shorter array on my real data.
The data are spaced by 0.01. The example dataset has a length of 50, the real data tens of thousands. So it must be something about the window size that is causing the problem and I don't quite understand what is going on in the functions.
This is how I define the functions:
import numpy as np

def smoothMAcum(depth, temp, scale):  # Moving average by cumsum, scale = window size in m
    dz = np.diff(depth)
    N = int(scale / dz[0])  # window length in samples
    cumsum = np.cumsum(np.insert(temp, 0, 0))
    smoothed = (cumsum[N:] - cumsum[:-N]) / N
    return smoothed
def smoothMAconv(depth, temp, scale):  # Moving average by numpy convolution
    dz = np.diff(depth)
    N = int(scale / dz[0])
    smoothed = np.convolve(temp, np.ones((N,)) / N, mode='valid')
    return smoothed
Then I implement it:
scale = 5.
smooth = smoothMAconv(dep,data, scale)
but
print(len(dep), len(smooth))
returns 81071 80572
and the same happens if I use the other function. How can I get the smooth array of the same length as the data?
And why did it work on the small dataset? Even if I try different scales (and use the same for the example and for the data), the result in the example has the same length as the original data, but not in the real application.
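To make the mismatch concrete, here is a minimal sketch on synthetic data (not my real profile). It assumes the window works out to N = 500 samples, i.e. a scale of 5 over a spacing of 0.01; `n`, `temp_demo`, `kernel`, `valid`, `by_cumsum` and `same` are names made up for this illustration. As far as I can tell, both functions above return len(temp) - N + 1 values, which would account for the 499 missing samples:

```python
import numpy as np

# Synthetic stand-in for the real profile: same length as len(dep) above.
n = 81071
N = 500  # assumed window length in samples: scale / spacing = 5 / 0.01

rng = np.random.default_rng(0)
temp_demo = rng.normal(size=n)  # made-up data, for illustration only
kernel = np.ones(N) / N

# 'valid' keeps only positions where the window fits entirely inside the data.
valid = np.convolve(temp_demo, kernel, mode='valid')

# The cumsum version computes the same windows and has the same length.
cumsum = np.cumsum(np.insert(temp_demo, 0, 0))
by_cumsum = (cumsum[N:] - cumsum[:-N]) / N

# 'same' pads implicitly with zeros and returns an array as long as the input.
same = np.convolve(temp_demo, kernel, mode='same')

print(len(valid), len(by_cumsum), len(same))  # 80572 80572 81071
```

Note that mode='same' does keep the input length, but the first and last N/2 or so values are averaged against implicit zeros, so they are pulled toward zero rather than being true window means.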
I considered an effect of nan values, but if I have a nan in the example, it doesn't make a difference.
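A small check of the nan case, as a sketch on a made-up five-element array (`x`, `N_demo`, `conv_out` and `cum_out` are illustrative names, not part of my data): a nan changes the values that come out, but not the length of the result.

```python
import numpy as np

x = np.array([1.0, 2.0, np.nan, 4.0, 5.0])  # made-up data with one nan
N_demo = 2  # illustrative window length in samples

# Convolution: only windows that touch the nan become nan.
conv_out = np.convolve(x, np.ones(N_demo) / N_demo, mode='valid')

# Cumsum: the nan propagates through the running sum to everything after it.
c = np.cumsum(np.insert(x, 0, 0))
cum_out = (c[N_demo:] - c[:-N_demo]) / N_demo

print(len(x), len(conv_out), len(cum_out))  # 5 4 4
print(np.isnan(conv_out).sum(), np.isnan(cum_out).sum())  # 2 3
```

So the two methods differ in how far the nan spreads, but both still return len(x) - N_demo + 1 values.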
So where is the problem, if possible to tell without the full dataset?