5

I need to store a collection of nodes:

class Node
{
   int Value;
   //other info
}

I have three requirements:

  1. Need to be able to efficiently retrieve the node with the lowest Value in the collection
  2. Need to be able to efficiently insert a node into the collection
  3. Two nodes can have the same Value

I thought the best collection to use for this would be some sort of sorted list. That way requirement #1 is satisfied efficiently by just taking the first element from the sorted list. Requirement #2 is satisfied efficiently by inserting a new node in the right place in the list.

But the SortedList collection in .Net is like SortedDictionary and requires the key being sorted on to be unique, which violates requirement #3.

There appears to be no collection in .Net that satisfies these requirements, mainly because the self-sorting collections that do exist require keys being sorted on to be unique. What is the reason for this? I assume it cannot be an oversight. What am I not grasping here? I can find similar questions about this but they usually involve someone suggesting SortList, followed by realizing this doesn't work, and then the conversation fades out without a standard solution. At least if someone would say "There is no collection in C# for this task, you need to hack something together" that would be an answer.

Is it acceptable to use a regular List<Node> and re-sort the list whenever a new node is added? Seems like that wouldn't be as efficient as inserting the node in the right place to begin with. Perhaps that is what I should do? Manually iterate over the list until I find the place to insert a new node myself?

Kasyx
  • 3,170
  • 21
  • 32
Weyland Yutani
  • 4,682
  • 1
  • 22
  • 28
  • How many distinct values are there? how often are you inserting? how often are you reading? how often are reads going to be value based - and will that always be lowest value or will it be value x? these are some of the things i would think about to make this decision. if your not sure why not do a first pass with a normal list and use linq to select items from it? –  May 13 '13 at 11:13
  • Your solution sounds good. If there is nothing like that in .NET - coding it manually would not be a big issue. You can always have a quick look at open source projects if something like this is already availalbe somewhere. I never needed it so I don't know. And if it is not there - why not making it yourself? :) Given linq skip while functionality it should take just few lines of code – Jarek May 13 '13 at 11:13
  • 2
    Have you look at [C5 Collections](http://www.itu.dk/research/c5/)? There's probably something there. – Binary Worrier May 13 '13 at 11:21
  • There will be a lot of insertions and always pulling off the lowest value node. Basically I guess I want a kind of queue. Except new nodes need to be inserted into the right place in the queue according to their Value. – Weyland Yutani May 13 '13 at 11:21
  • 4
    ***If*** you end up using a `List` that you keep sorted yourself, be sure to use the `BinarySearch` method of `List<>` to find the index where you need to insert a new item. Don't "iterate manually" if that means not using `BinarySearch`. – Jeppe Stig Nielsen May 13 '13 at 11:22
  • "Have you look at C5 Collections? There's probably something there", thanks I'll take a look – Weyland Yutani May 13 '13 at 11:24
  • 1
    From the requirements it sounds like you need a Heap, rather than a sorted list. Adding an item to a heap is only O(log n), rather than O(n) for a sorted list or even O(n log n) for adding and then re-sorting. – harold May 13 '13 at 11:27
  • 1
    @harold But if he starts with an empty `List`, and he always adds to the list through the following method: `public void Add(Node n) { var idx = innerList.BinarySearch(n); if (idx >= 0) { innerList.Insert(idx, n); else { innerList.Insert(~idx, n); } }`, then the `List` will always be sorted automatically, and he won't ever have to call `Sort`. Of course, `Node` should be `IComparable`, or the `BinarySearch` method should be given an `IComparer` as its second argument (one can be created with `Comparer.Create` static method of .NET 4.5). – Jeppe Stig Nielsen May 13 '13 at 11:48
  • 1
    @JeppeStigNielsen yes that's not too bad and falls in the O(n) case I mentioned, because the insertion might move a lot of things. Heaps don't have that problem. – harold May 13 '13 at 12:40
  • @JeppeStigNielsen I did not know about the complement of index that BinarySearch sends when no element is found. That is a brilliant implementation in C#, I'd say! Thanks. – Narayana Jan 06 '15 at 07:37
  • @Narayana Yes. Just remember that if the `List<>` is not sorted anymore (for some reason) then everything becomes corrupt. The output from `BinaraySearch` is unpredictable and meaningless in that case. – Jeppe Stig Nielsen Jan 06 '15 at 09:09

5 Answers5

2

If all you need is to efficiently insert, and quickly retrieve the item with the lowest value, then you don't need a sorted list. You need a heap. Check out A Generic Binary Heap Class.

Jim Mischel
  • 131,090
  • 20
  • 188
  • 351
  • 1
    I've found that 4-ary heaps, despite a seemingly slower RemoveMin, are nevertheless faster than binary heaps in practice, even for RemoveMin. For adding items, they're significantly faster. [this is not a criticism, just something I thought you might find interesting] – harold May 13 '13 at 15:26
  • @harold: Thanks for the info. I've been meaning to experiment with other heap types. I like the binary heap because it's simple to implement, and quite effective. I'll have to look into the 4-ary heap. – Jim Mischel May 13 '13 at 16:28
1

Make your list_key unique by adding the object id or another unique identifier: IDs 4 and 5, both having value "1" will become "1_4" and "1_5", which can be added to the sorted List without trouble and will be sorted as expected.

dognose
  • 20,360
  • 9
  • 61
  • 107
1

You could use a SortedList<int, List<NodeInfo>>, where you'll put the Value in the key and all the other properties in the value:

public class NodeList : SortedList<int, List<NodeInfo>>
{
    public void Add(int key, NodeInfo info)
    {
        if (this.Keys.Contains(key))
        {
            this[key].Add(info);
        }
        else
        {
            this.Add(key, new List<NodeInfo>() { info } );
        }
    }

    public NodeInfo FirstNode()
    {
        if (this.Count == 0)
            return null;
        return this.First().Value.First();
    }
}

public class NodeInfo
{
    public string Info { get; set; }
    // TODO: add other members
}

Here's some sample usage:

var list = new NodeList();

// adding
list.Add(3, new NodeInfo() { Info = "some info 3" });

// inserting
for (int i = 0; i < 100000; i++)
{
    list.Add(1, new NodeInfo() { Info = "some info 1" });
    list.Add(2, new NodeInfo() { Info = "some info 2" });
    list.Add(1, new NodeInfo() { Info = "some info 1.1" });
}

// retrieving the first item
var firstNodeInfo = list.FirstNode();

// retrieving an item
var someNodeInfo = list[2].First();
Alex Filipovici
  • 31,789
  • 6
  • 54
  • 78
0

In my opinion, it is acceptable to use a normal list and re-sort it after every insert. Sorting is pretty efficient in .NET. See this thread : String sorting performance degradation in VS2010 vs. VS2008

Community
  • 1
  • 1
cvraman
  • 1,687
  • 1
  • 12
  • 18
  • Not when you have 100 million items in the list. And, yes, some people really do work with lists that large in memory. – Jim Mischel May 13 '13 at 13:22
  • True. If you are manipulating 100 million items and want the list to be always sorted , you will have to go for your own custom implementation of a data structure. – cvraman May 13 '13 at 13:32
0

You can use OrderedMultiDictionary in Wintellect's Power Collections for .NET. That's exactly what you are looking for.

Kaveh Shahbazian
  • 13,088
  • 13
  • 80
  • 139