18

I wonder, what are the differences in collection implementations in .NET .

For instance, I constantly use List<int> etc to store a list of items. However I just need a container for items, and I guess I don't need all the features that List have. I just need an container that has put method, and will enable client code to iterate over the container.

Are there any quicker, more lightweight collection implementations that implement IEnumerable<T> in .NET?

Mathew Thompson
  • 55,877
  • 15
  • 127
  • 148
dragonfly
  • 17,407
  • 30
  • 110
  • 219
  • Do you have an indication that the weight of `List` is a problem in your program? – CodesInChaos May 30 '12 at 11:48
  • 1
    No :) But I just wanted to know. Recently I had a question during interview : What is the most lightweight / fast etc way of storing collection of data in .NET :) – dragonfly May 30 '12 at 11:52
  • 1
    @dragonfly The real answer: Does it actually matter? Am I in a situation where one needs to vastly outperform the others? If the answer is yes, then perhaps going a little lower level is better. – George Stocker May 30 '12 at 11:55
  • So you need only `Add` method and do not need a search, access by index? – sll May 30 '12 at 11:56
  • 7
    Those questions annoy the crap out of me, they usually indicate a senior developer with an overblown ego. In 99.9% of cases it doesn't matter which is the most lightweight, and if you are after maximum speed it takes 5 minutes of research to find a decent answer. – slugster May 30 '12 at 12:01

3 Answers3

10

The most lightweight container which implements IEnumerable<T> is a typed array. It has no Add method (and therefore does not dynamically resize like a list does) but if you know what number of elements you want up front you can define the array and insert elements at a given position.

var myArray = new int[10];
myArray[0] = 123;
myArray[1] = 234;
Jamiec
  • 133,658
  • 13
  • 134
  • 193
  • Whilst it does not dynamically resize, you **have to specify the size of the array**, thus meaning that it's usually always reserving more memory than it needs – Mathew Thompson May 30 '12 at 11:55
  • 1
    see here http://stackoverflow.com/questions/434761/array-versus-listt-when-to-use-which – Mathew Thompson May 30 '12 at 11:56
  • I think this is the technically correct answer to the silly interview question, assuming you have the correct size. :) – Andrew Barber May 30 '12 at 11:57
  • 1
    @mattytommo so is all the collections rather often. Eg. a list uses an array internally. If that array fills a new array will be made with twice as many elements.So you are probably allocating far more memory than you need using a list too – Rune FS May 30 '12 at 12:05
  • I guess it's the lighest implementation of IEnumerable when you need to return an empty collection by just returning new int[0] – Ciprian Teiosanu Jun 28 '18 at 15:48
  • Technically it does actually have an Add method (as in reflection will find a method called Add) because array for odd reasons implements the IList interface. However calling the method will throw an exception – Rune FS Aug 02 '18 at 21:30
5

If you only need add and iteration I would believe that a linked list to be the most light weight. I haven't looked at the implementation of Stack put it's rather easy to create one that doesn't take to much space. It's a rather naïve solution that could be optimized for size. My point is that it's simple to implement a leightweight collection

public class Stack<T>{
   readonly T _head;
   readonly Stack<T> _tail;
   public Stack(T head,Stack<T> tail){
       _head = head;
       _tail = tail;
   }

   public Stack<T> Push(T element){
       return new Stack<T>(element,this);
   }

   public IEnumerator<T> GetEnumerator(){
         yield return _head;
         var current = _tail;
         while(_tail != null){
             yield return current._head;
             current = current._tail;
         }
   }
}

Performancewise this will be slower than an implementation that uses a preallocated array since assigning to an element is faster than newing a new object and depending on how filled the internal array of eg. list is this might actually take up more space but that overhead can be traded for performance by newing a new array each time with just one more element but that has a significant overhead performancewise. You could also opt for a balance between the two where you keep sufficient space for a number of new elements overallocaating memory in most cases but increasing performance in most cases.

public class Stack<T>{
   T[] _heads;
   T[] _tail;
   int _next;
   public Stack(T head,T[] tail){
       _heads = new T[tail.length];
       _next = _heads.Length -2; 
       _heads[_heads.Length -1] = head;
       _tail = tail;
   }

   public void Push(T element){
        if(_next < 0){
            var length = _heads.Length;
            var _new = new T[length * 2];
            _heads = new T[length * 2];                
            _next = length * 2-1;
            Array.Copy(_heads,_new,length);
            Array.Copy(_tails,0,_new,length,length);
            _tails = _new;
        } else{
            _heads[_next--] = element;
        }
   }

   public IEnumerator<T> GetEnumerator(){
         yield return _head;
         var current = _tail;
         while(_tail != null){
             yield return current._head;
             current = current._tail;
         }
   }
}

and the you are basically back to the balance that collection such as List have. It's build on an internal array which is often too big to allow for high speed additions while not wasting too much memory.

So as with all optimization questions it really depends on what you wish to optimize for. You can often optimize for memory if you are willing to sacrifice performance and vice versa

Rune FS
  • 21,497
  • 7
  • 62
  • 96
2

What exactly are you storing in the collection? If it's a type (T), then List<T> is your best bet yes.

For non-generic types, consider using a HashSet, they can significantly improve performance in certain cases.

AS @RuneFS said, as you are asking for iteration and put. Iterating a HashSet is equal(ish) to iterating List however adding an object to HashSet is slower (comparing hashes) than adding it to a list and functionaly not equivalent either (distinct hashes in hashset).

Mathew Thompson
  • 55,877
  • 15
  • 127
  • 148
  • whats a "generic type"?? the `T` in `List` is just any type you want to store in there, be it an `int`, a `string` or a `Person`! – Jamiec May 30 '12 at 11:49
  • A generic type is anything you made :) like `Person` :D – Mathew Thompson May 30 '12 at 11:51
  • But why only for non-generic types? For generic types, I would recommend using `HashSet`. – fero May 30 '12 at 11:52
  • So the correct wording of your second sentence is `If it's any type (T), then List is your best bet` – Jamiec May 30 '12 at 11:52
  • @fero That depends, performance of HashSet is based on the O(1) performance characteristic of hash lookups. Operations which don't make use of the members' hash codes, like filtering on a member property vs. the members themselves, will need to be an O(N) operation, making it the same as a List. – Mathew Thompson May 30 '12 at 11:52
  • @mattytommo You're right. But there is no disadvantage in using a `HashSet` then, is it? There is only an advantage when using methods like `Contains` which use the hash code of the objects. – fero May 30 '12 at 11:55
  • the answer is worded slightly clumsily, imo; every type is a "generic type" as far as this discussion is concerned. But +1 for noting the case where HashSet is more performant. However, OP was asking *lightweight*, not necessarily performant. Nitpicky, I know! – Andrew Barber May 30 '12 at 11:56
  • 1
    @AndrewBarber Yeah I've cleaned that up :) wasn't happy with it myself! – Mathew Thompson May 30 '12 at 11:58
  • @fero If you need to guarantee the order of items, use a List. If you don't, use a HashSet. – Mathew Thompson May 30 '12 at 12:00
  • 1
    OP is aking for iteration and put. Iterating a HashSet is equal(ish) to iterating List however adding an object to HashSet is slower (comparing hashes) than adding it to a list and functionaly not equivalent either (distinct hashes in hashset) – Rune FS May 30 '12 at 12:02