1

I'm sifting through some of my old bugs and while reviewing some nasty code I realized that my averaging or smoothing algorithm was pretty bad. I did a little research which led me to the "running mean" - makes sense, pretty straightforward. I was thinking through a possible implementation and realized that I don't know which collection would provide the type of "sliding" functionality that I need. In other words, I need to push/add an item to the end of the collection and then also pop/remove the first item from the collection. I think if I knew what this was called I could find the correct collection but I don't know what to search for.

Ideally a collection where you set the max size and anything added to it that exceeds that size would pop off the first item.

To illustrate, here is what I came up with while messing around:

using System;
using System.Collections.Generic;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            LinkedList<int> samples = new LinkedList<int>();

            //  Simulate packing the front of the samples, this would most like be a pre-averaged
            //  value from the raw samples
            for (int i = 0; i < 10; i++)
            {
                samples.AddLast(0);
            }

            for (int i = 0; i < 100; i++)
            {
                //  My attempt at a "sliding collection" - not really sure what to call it but as
                //  an item is added the first item is removed
                samples.RemoveFirst();
                samples.AddLast(i);

                foreach (int v in samples)
                {
                    Console.Write("{0:000} ", v);
                }

                Console.WriteLine(String.Empty);
            }

            Console.ReadLine();
        }
    }
}

As you can see I am manually handling the removal of the first item. I'm just asking if there is a standard collection that is optimized for this type of use?

scubasteve
  • 2,718
  • 4
  • 38
  • 49

4 Answers4

3

It appears that you're looking for a Circular Buffer. Here's a .NET implementation on CodePlex. You may also want to look at this question: How would you code an efficient Circular Buffer in Java or C#?

From the sample you've provided, it isn't clear how exactly this relates to an online-mean algorithm. If the only operation allowed on the buffer is to append; it should be trivial to cache and update the "total" inside the buffer (add the new value, subtract the removed one); making the maintaining of the mean an O(1) operation for every append. In this case, the buffer is effectively holding the Simple Moving Average (SMA) of a series.

Community
  • 1
  • 1
Ani
  • 111,048
  • 26
  • 262
  • 307
  • Yes, I was just reading about that and it made me miss pointers. MSDN states that Queue and Stack are both "circular arrays" but the problem seems to be you can't prevent Capacity from auto adjusting. – scubasteve Nov 20 '10 at 07:35
  • Your "Circular Buffer" answer has exposed many options! http://circularbuffer.codeplex.com/ discussed here: http://stackoverflow.com/questions/590069/how-would-you-code-an-efficient-circular-buffer-in-java-or-c Thanks! – scubasteve Nov 20 '10 at 07:38
  • @Steve K: Coincidentally, both of the links were already in my answer. :) – Ani Nov 20 '10 at 07:39
  • @Ani - that is strange because I first saw your answer and it was just your first sentence, I saw "Circular Buffer" and took off to Google :0) – scubasteve Nov 20 '10 at 07:52
  • Yeah, I had put it in during the edit window after my own googling. :) – Ani Nov 20 '10 at 07:54
  • @Ani - The sample was really just an attempt to create (What I now know is called a circular buffer). It's the subtraction of the removed element that made a cir. buffer seems like the best approach. Of course I can just keep track of the value n ago and subtract it. I haven't thought much about it yet, was just getting starting. Are you suggesting that the cir. buffer is overkill? – scubasteve Nov 20 '10 at 07:57
  • @Steve K: It does seem like a good solution, but it's hard for me to answer without knowledge of *all* the operations that could be performed on the collection. – Ani Nov 20 '10 at 08:04
0

Have you had a look at Queue Class

Adriaan Stander
  • 162,879
  • 31
  • 289
  • 284
0

Does a List satisfy your needs?

List<String> myList = new List<String>();

myList.Add("Something to the end");
myList.RemoveAt(0);
Rob
  • 10,004
  • 5
  • 61
  • 91
  • This will work, but have horrendous performance in a loop if the list is large. `List.RemoveAt(0)` is an `O(n)` operation. – Ani Nov 20 '10 at 07:50
0

@Ani - I'm creating a new Answer instead of comment because I have some code to paste. I took a swing at a dead simple object to assist with my running mean and came up with the following:

class RollingMean
{
    int _pos;
    int _count;
    double[] _buffer;

    public RollingMean(int size)
    {
        _buffer = new double[size];
        _pos = 0;
        _count = 0;
    }

    public RollingMean(int size, double initialValue) 
        : this(size)
    {
        //  Believe it or not there doesn't seem to be a better(performance) way...
        for (int i = 0; i < size; i++)
        {
            _buffer[i] = initialValue;
        }

        _count = size;
    }

    public double Push(double value)
    {
        _buffer[_pos] = value;

        _pos = (++_pos > _buffer.Length - 1) ? 0 : _pos;
        _count = Math.Min(++_count, _buffer.Length);

        return Mean;
    }

    public double Mean
    {
        get
        {
            return _buffer.Sum() / _count;
        }
    }
}

I'm reading 16 channels of data from a data acquisition system so I will just instantiate one of these for each channel and I think it will be cleaner than managing a multi-dimensional array or separate set of buffer/postition for each channel.

Here is sample usage for anyone interested:

static void Main(string[] args)
{
    RollingMean mean = new RollingMean(10, 7);

    mean.Push(3);
    mean.Push(4);
    mean.Push(5);
    mean.Push(6);
    mean.Push(7.125);

    Console.WriteLine( mean.Mean );
    Console.ReadLine();
}

I was going to make the RollingMean object a generic rather than lock into double but I couldn't find a generic constraint to limit the tpye numerical types. I moved on, gotta get back to work. Thanks for you help.

scubasteve
  • 2,718
  • 4
  • 38
  • 49