3

I'm implementing a memory system for an AI agent. It needs to have an internal list of state transitions which is capped at some number, say 10000.

If at capacity, adding a new memory should automatically remove the oldest memory.

Importantly, I should also need to be able to quickly access any item in this list.

A wrapper for Queue at first seemed obvious, but Queue does not allow fast access of any element. (O(n))

Similarly, remove an item from the beginning of a List structure takes O(n).

LinkedLists allow fast additions and removals, but again do not allow quick access to every index.

An array would allow random access but obviously it's not dynamically resizeable and deletion is problematic.

I've seen a HashMap being suggested but I'm ensure how that might be implemented.

Suggestions?

EmmetOT
  • 552
  • 1
  • 7
  • 23

4 Answers4

3

If you want the queue to be a fixed length, you could use a circular buffer which enables O(1) enqueue, dequeue and indexing operations and automatically overwrites old entries when the queue is full.

Lee
  • 142,018
  • 20
  • 234
  • 287
2

Try using a Dictionary with a LinkedList. The keys of the Dictionary are the indexes of the LinkedList nodes and the values of the Dictionary are of type LinkedListNode; that is, the LinkedList nodes.

The Dictionary would give you almost an O(1) on its operations and removing/adding LinkedListNode(s) to the beginning or end of a LinkedList is of O(1) as well.

Another alternative is to use a HashTable. However, in this case you have to know the capacity of the table beforehand (See Hashtable.Add Method) in order to get the O(1) performance:

If Count is less than the capacity of the Hashtable, this method is an O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.

In the first solution, no matter what's the capcity of the LinkedList or the Dictionary you would still get almost an O(1) from both the Dictionary and the LinkedList. Of course that's going to be an O(3) or O(4) depending on the total number of operations that you perform on both the Dictionary and the LinkedList to do an add or remove operation inside your memory class. The search access is going to be always an O(1) because you will be using the Dictionary only.

Timothy Ghanem
  • 1,606
  • 11
  • 20
  • I see what you mean, and this may be a stupid question, but what's the advantage of having a LinkedList instead of just using integers for keys and keeping a min and max key? Whenever you have to remove the oldest value, just remove the one at HashMap[min] and increment min. Whenever you add a value, increment max and add it to HashMap[max]. The range of valid keys is always [min, max]. – EmmetOT Jul 21 '16 at 09:31
0

EDIT: @Lee has it right. A circular buffer seems to give you what you want. Answer left in place though.

I think the data structure you want might be a priority queue -- it depends on what you mean by 'quickly access any item'. If you mean 'able to enumerate all items in O(N)', then a priority queue fits the bill. If you mean 'enumerate the list in historical order', then it won't.

Assuming you need these operations;

  • add an item and associate with a time
  • remove the oldest item
  • enumerate all existing items in arbitrary order

Then you could easily extend this priority queue implementation I wrote a little while ago.

  • You'll want implement IEnumerable as a loop through the T[] data array from 0 to cursor. This will give you your enumeration.
  • Implement a GetItem(i) function which returns this.data[i] so long as i <= cursor.
  • Implement an automatic size limit by putting this into the Push() method;

    if (queue.Size => 10000) { queue.Pop(); }

I think this is O(ln n) for push and pop, and O(N) to enumerate ALL items, or O(i) to find ANY item, so long as you don't need them in order.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Mindfire.DataStructures
{
    public class PiorityQueue<T>
    {
        private int[] priorities;
        private T[] data;
        private int cursor;
        private int capacity;

        public int Size
        {
            get
            {
                return cursor+1;
            }
        }

        public PiorityQueue(int capacity)
        {
            this.cursor = -1;
            this.capacity = capacity;
            this.priorities = new int[this.capacity];
            this.data = new T[this.capacity];
        }

        public T Pop()
        {
            if (this.Size == 0)
            {
                throw new InvalidOperationException($"The {this.GetType().Name} is Empty");
            }

            var result = this.data[0];

            this.data[0] = this.data[cursor];
            this.priorities[0] = this.priorities[cursor];
            this.cursor--;

            var loc = 0;
            while (true)
            {
                var l = loc * 2;
                var r = loc * 2 + 1;
                var leftIsBigger = l <= cursor && this.priorities[loc] < this.priorities[l];
                var rightIsBigger = r <= cursor && this.priorities[loc] < this.priorities[r];

                if (leftIsBigger)
                {
                    Swap(loc, l);
                    loc = l;
                }
                else if (rightIsBigger)
                {
                    Swap(loc, r);
                    loc = r;
                }
                else
                {
                    break;
                }
            }
            return result;
        }

        public void Push(int priority, T v)
        {
            this.cursor++;
            if (this.cursor == this.capacity)
            {
                Resize(this.capacity * 2);
            };
            this.data[this.cursor] = v;
            this.priorities[this.cursor] = priority;
            var loc = (this.cursor -1)/ 2;

            while (this.priorities[loc] < this.priorities[cursor])
            {
                // swap
                this.Swap(loc, cursor);
            }
        }

        private void Swap(int a, int b)
        {
            if (a == b) { return; }

            var data = this.data[b];
            var priority = this.priorities[b];
            this.data[b] = this.data[a];
            this.priorities[b] = this.priorities[a];
            this.priorities[a] = priority;
            this.data[a] = data;
        }

        private void Resize(int newCapacity)
        {
            var newPriorities = new int[newCapacity];
            var newData = new T[newCapacity];
            this.priorities.CopyTo(newPriorities, 0);
            this.data.CopyTo(newData, 0);
            this.data = newData;
            this.priorities = newPriorities;
            this.capacity = newCapacity;
        }

        public PiorityQueue() : this(1)
        {
        }

        public T Peek()
        {
            if (this.cursor > 0)
            {
                return this.data[0];
            }
            else
            {
                return default(T);
            }

        }

        public void Push(T item, int priority)
        {
        }
    }
}
Steve Cooper
  • 20,542
  • 15
  • 71
  • 88
0

HashMap is for Java, so the closest equivalent is Dictionary. C# Java HashMap equivalent. But I wouldn't say that this is the ultimate answer.

  • If you implement it as Dictionary, which key == the content, then you can search the content with O(1). However, you cannot have same key. Also, because it is not ordered, you may not know which the 1st content is.
  • If you implement it as Dictionary, which key == index, and value == the content, searching for the content still takes O(n) because you don't know the location of content.
  • A List or an Array will cost O(1) if you search the content by index reference. So, please double check your statement that it takes O(n)
  • If you search by index is sufficient, then circular array/ buffer which @Lee mentioned is good enough.
  • Otherwise, similar to DB, you might want to maintain in 2 separate data: 1 for storing the data (Circular Array) and the other one for search (Hash).
Community
  • 1
  • 1
kurakura88
  • 2,185
  • 2
  • 12
  • 18