I have a long list of reward signals (-1 for loss, 0 for tie, and +1 for win). I want to average these signals in "windows" and then smooth this resulting curve to show progress. How do I do this with matplotlib/scipy?

My code looks like this:

#!/usr/bin/env python
import matplotlib
matplotlib.rcParams['backend'] = "Qt4Agg"

import matplotlib.pyplot as plt
import numpy as np

y = np.array([-1, 1, 0, -1, -1, -1, 1, 1, 1, 1, 0, 0, 0, 1, 1, -1, 1, 1, -1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, -1, 1, 1, 0, 1, 1, 0, 1, -1, -1, 1, -1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, -1, 0, 1, 1, 1, -1, 1, 1, 1, 1, 0, -1, 0, 1, 0, 1, 1, 1, -1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, -1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
)
x = np.arange(len(y))

plt.plot(x,y)
plt.show()

I tried solutions from similar questions, like this one, which recommends using a spline, but when applied to my data, it consumes all my memory and crashes my machine.

Cerin

2 Answers

I found this function somewhere a while ago. I can't track down the source, but I use it to convolve 1-D ndarrays with various windows, and it should solve your problem.

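import numpy

def smooth(x, window_len=11, window='hanning'):
    """Smooth a 1-D array by convolving it with a normalized window."""
    if x.ndim != 1:
        raise ValueError("smooth only accepts 1-dimensional arrays.")
    if x.size < window_len:
        raise ValueError("Input vector needs to be bigger than window size.")
    if window_len < 3:
        return x
    if window not in ['flat', 'hanning', 'hamming', 'bartlett', 'blackman']:
        raise ValueError("Window is one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'")

    # Pad both ends with reflected copies of the signal to reduce edge effects.
    s = numpy.r_[x[window_len-1:0:-1], x, x[-1:-window_len:-1]]

    if window == 'flat':  # moving average
        w = numpy.ones(window_len, 'd')
    else:
        # Look up the window function (numpy.hanning, numpy.hamming, ...) by name.
        w = getattr(numpy, window)(window_len)

    # Because of the padding, the result has len(x) + window_len - 1 points.
    y = numpy.convolve(w / w.sum(), s, mode='valid')
    return y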

So for example, with your data you'd just do:

plt.plot(smooth(y))
plt.show()

And you get: [image: smoothed]

derricw

The answer you linked recommends scipy.interpolate.spline, which constructs the B-spline representation using full matrices. This is why it consumes so much memory. If smoothing splines are what you're after, at the moment you're better off using scipy.interpolate.UnivariateSpline, which should have a saner memory footprint.
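As a rough sketch of the UnivariateSpline route: the helper name spline_smooth, the choice s=len(rewards), and the randomly generated stand-in rewards below are assumptions for illustration only, not something from the question. The smoothing factor s usually needs tuning by eye for your own data.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def spline_smooth(y, s=None, k=3):
    """Fit a smoothing spline to y sampled at x = 0, 1, ..., len(y) - 1.

    Hypothetical helper for illustration. s is the smoothing factor:
    larger values give a smoother curve, s=0 passes through every point.
    """
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    spline = UnivariateSpline(x, y, k=k, s=s)
    return spline(x)

# Synthetic stand-in for a reward history (assumed data, not the asker's).
rng = np.random.default_rng(0)
rewards = rng.choice([-1, 0, 1], size=100, p=[0.2, 0.2, 0.6])

smoothed = spline_smooth(rewards, s=len(rewards))
```

The result has the same length as the input, so plt.plot(smoothed) can be overlaid directly on the raw signal.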

If you need window averages or convolutions, check out numpy.convolve and/or the convolution and window functionality in scipy.signal.
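For the window averages themselves, something along these lines might do. The function names block_means and moving_average are made up for this sketch, and the window size of 4 is arbitrary. The sample array is just the first 12 values from the question.

```python
import numpy as np

def block_means(y, window):
    """Average y over consecutive, non-overlapping windows of the given size."""
    y = np.asarray(y, dtype=float)
    n = (len(y) // window) * window  # drop a trailing partial window
    return y[:n].reshape(-1, window).mean(axis=1)

def moving_average(y, window):
    """Sliding-window mean; the output has len(y) - window + 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="valid")

# First 12 reward values from the question.
rewards = np.array([-1, 1, 0, -1, -1, -1, 1, 1, 1, 1, 0, 0])

per_window = block_means(rewards, 4)   # one value per block of 4 games
sliding = moving_average(rewards, 4)   # one value per position of the window
```

block_means gives one point per window, which matches "average in windows" most literally; moving_average keeps nearly the full resolution and is the same operation as the 'flat' window in the other answer, minus the edge padding.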

ev-br