48

I have read this in answer to many questions on here. But what exactly does it mean?

var test = new Dictionary<int, string>();
test.Add(0, "zero");
test.Add(1, "one");
test.Add(2, "two");
test.Add(3, "three");

Assert(test.ElementAt(2).Value == "two");

The above code seems to work as expected. So in what manner is a dictionary considered unordered? Under what circumstances could the above code fail?

Menuka Ishan
  • 5,164
  • 3
  • 50
  • 66
fearofawhackplanet
  • 52,166
  • 53
  • 160
  • 253
  • 5
    Even if a particular test succeeds the order of dictionaries is not guaranteed and so cannot be relied upon in general. – sorpigal Jun 17 '11 at 10:57
  • 2
    @Sorpigal: yes, but *why?* and *how?* – fearofawhackplanet Jun 17 '11 at 10:58
  • 3
    A Dictionary is not *unordered*, it's just not necessarily ordered, which is not the same. – Petruza Jun 17 '11 at 12:34
  • The why and how is covered in http://stackoverflow.com/questions/4634223/how-do-i-access-the-nth-item-in-a-dictionary-or-hash – Simon Whitaker Jun 17 '11 at 12:42
  • @Jon Skeets answer is a good one - but worth noting that if you *need* order you can use an [Ordered Dictionary](http://msdn.microsoft.com/en-us/library/system.collections.specialized.ordereddictionary.aspx) instead – Matt Wilko Jun 17 '11 at 13:53
  • possible duplicate of [Why are entries in addition order in a .Net Dictionary?](http://stackoverflow.com/questions/154307/why-are-entries-in-addition-order-in-a-net-dictionary) – nawfal Oct 31 '13 at 07:51
  • Random reader may be interested in a [**SortedDictionary**](https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.sorteddictionary-2?view=net-5.0) – Sold Out May 04 '21 at 15:43

7 Answers7

78

Well, for one thing it's not clear whether you expect this to be insertion-order or key-order. For example, what would you expect the result to be if you wrote:

var test = new Dictionary<int, string>();
test.Add(3, "three");
test.Add(2, "two");
test.Add(1, "one");
test.Add(0, "zero");

Console.WriteLine(test.ElementAt(0).Value);

Would you expect "three" or "zero"?

As it happens, I think the current implementation preserves insertion ordering so long as you never delete anything - but you must not rely on this. It's an implementation detail, and that could change in the future.

Deletions also affect this. For example, what would you expect the result of this program to be?

using System;
using System.Collections.Generic;

class Test
{ 
    static void Main() 
    {
        var test = new Dictionary<int, string>();
        test.Add(3, "three");
        test.Add(2, "two");
        test.Add(1, "one");
        test.Add(0, "zero");

        test.Remove(2);
        test.Add(5, "five");

        foreach (var pair in test)
        {
            Console.WriteLine(pair.Key);
        }
    }     
}

It's actually (on my box) 3, 5, 1, 0. The new entry for 5 has used the vacated entry previously used by 2. That's not going to be guaranteed either though.

Rehashing (when the dictionary's underlying storage needs to be expanded) could affect things... all kinds of things do.

Just don't treat it as an ordered collection. It's not designed for that. Even if it happens to work now, you're relying on undocumented behaviour which goes against the purpose of the class.

Kounavi
  • 1,090
  • 1
  • 12
  • 24
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • I'd expect "three", just as I would if it was a `List>`. – fearofawhackplanet Jun 17 '11 at 11:04
  • 1
    @fearofawhackplanet: Okay, so you're expecting insertion order. And what would you expect in my second example? – Jon Skeet Jun 17 '11 at 11:05
  • 1
    Thanks for the update. It's a good example, actually I guessed the resulting order correctly but I see your point. The dictionary *is* ordered but there is no guarantee of *how* it is ordered. When people talk about having "no order", I was imagining a situation like having subsequent calls to `ElementAt` returning different results. – fearofawhackplanet Jun 17 '11 at 11:39
  • 2
    Dictionaries are most often ordered in whatever order is most efficient for fetching a value. They're lookup tables. It would appear that in C# insertion order is kept unless the dictionary is modified, but in for example Python, it is ordered by the hash of the key value, so that quick reads can be made. Anyway, what Jon said: don't ever trust the order of a dictionary; it can vary completely between runs, implementations and architectures. – Blixt Jun 17 '11 at 12:40
  • 1
    @Blixt. No, that's not quite true. If you do the same sequence of insertions in two successive runs, it had BETTER be the same! There seems to be a lot of confusion on this point – Dov Jun 17 '11 at 17:54
  • 3
    @Dov: I disagree. Suppose it were ordered by hash code, and nothing in `Foo` overrides `GetHashCode`... then successive runs which add new instances of `Foo` could easily show different orders. Of course, it depends on what you mean by "the same sequence of insertions" - but I see nothing which tries to guarantee that the order "had BETTER be the same" - nor would I want to rely on it. – Jon Skeet Jun 17 '11 at 17:58
  • 1
    @Jon That makes sense if you have object references, but for the example shown here, the hash is based on the values. What you are saying is that if you run again with DIFFERENT OBJECT ADDRESSES, you could expect a different order (true). But if you build the same Dictionary twice, with the same objects (ie allocated only once), it will have the same order. I realize that this does mean you are right for two runs, if you naively assume that allocating an object is "the same" between runs, but the problem is with the assumption, not my statement. – Dov Jun 17 '11 at 18:04
  • 1
    @Dov: I think it was the ambiguity in the statement that was the problem :) What about inserting the same sequence of objects, but using a different initial capacity when creating the dictionary? That could easily have an impact on the ordering, if it depends on how many reallocations are performed. Again, I'd really never want to *rely* on this... – Jon Skeet Jun 17 '11 at 18:14
  • 3
    Here is an article that describes how dictionary order can change without changing content: http://blogs.msdn.com/b/ericlippert/archive/2011/05/23/read-only-and-threadsafe-are-different.aspx – adrianm Jun 19 '11 at 19:43
  • The other point to make about relying on the order of dictionary items is that as it's an implementation detail, there's no guarantee it will be consistent across different versions of the Framework. So relying on the order could produce code that gives different results in different environments. – Niall Connaughton Aug 26 '13 at 01:53
  • 1
    I can confirm that insertion order can't be relied upon. I just spent hours tracking a bug due to reliance on insertion order in some library code that I wrote almost 5 years ago. Worked great up until this one particular case. There are two dictionaries with the same items added in the same order, but they don't come back out the same way depending on the circumstances. – A.R. Aug 11 '15 at 23:44
  • Updated link to comment by @adrianm - https://learn.microsoft.com/en-gb/archive/blogs/ericlippert/read-only-and-threadsafe-are-different – openshac Nov 26 '20 at 07:25
27

A Dictionary<TKey, TValue> represents a Hash Table and in a hashtable there is no notion of order.

The documentation explains it pretty well:

For purposes of enumeration, each item in the dictionary is treated as a KeyValuePair structure representing a value and its key. The order in which the items are returned is undefined.

Darin Dimitrov
  • 1,023,142
  • 271
  • 3,287
  • 2,928
  • 4
    Hash tables are optimised for random rather than sequential access. They sacrifice the ordering for faster access. – Noufal Ibrahim Jun 17 '11 at 10:59
  • 2
    +1 Thinking about it having *undefined order* instead of being *unordered* makes more sense to me. These language terms do not mean exactly the same thing imho. – fearofawhackplanet Jun 17 '11 at 11:42
10

There's a lot of good ideas here, but scattered, so I'm going to try to create an answer that lays it out better, even though the problem has been answered.

First, a Dictionary has no guaranteed order, so you use it only to quickly look up a key and find a corresponding value, or you enumerate through all the key-value pairs without caring what the order is.

If you want order, you use an OrderedDictionary but the tradeoff is that lookup is slower, so if you don't need order, don't ask for it.

Dictionaries (and HashMap in Java) use hashing. That is O(1) time regardless of the size of your table. Ordered dictionaries typically use some sort of balanced tree which is O(log2(n)) so as your data grows, access gets slower. To compare, for 1 million elements, that's on the order of 2^20, so you'd have to do on the order of 20 lookups for a tree, but 1 for a hash map. That's a LOT faster.

Hashing is deterministic. Non-determinism means when you hash(5) the first time, and you hash(5) the next time, you get a different place. That would be completely useless.

What people meant to say is that if you add things to a dictionary, the order is complicated, and subject to change any time you add (or potentially remove) an element. For example, imagine the hash table has 500k elements into it, and you have 400k values. When you add one more, you reach the critical threshhold because it needs about 20% empty space to be efficient, so it allocates a bigger table (say, 1 million entries) and re-hashes all the values. Now they are all in different locations than they were before.

If you build the same Dictionary twice (read my statement carefully, THE SAME), you will get the same order. But as Jon correctly says, don't count on it. Too many things can make it not the same, even the initially allocated size.

This brings up an excellent point. It is really, really expensive to have to resize a hashmap. That means you have to allocate a bigger table, and re-insert every key-value pair. So it is well worth allocating 10x the memory it needs rather than have even a single grow have to happen. Know your size of hashmap, and preallocate enough if at all possible, it's a huge performance win. And if you have a bad implementation that doesn't resize, it can be a disaster if you pick too small of a size.

Now what Jon argued with me about in my comment in his answer was that if you add objects to a Dictionary in two different runs, you will get two different orderings. True, but that's not the dictionary's fault.

When you say:

new Foo();

you are creating a new object at a new location in memory.

If you use the value Foo as the key in a dictionary, with no other information, the only thing they can do is use the address of the object as the key.

That means that

var f1 = new Foo(1);
var f2 = new Foo(1);

f1 and f2 are not the same object, even if they have the same values.

So if you were to put them into Dictionaries:

var test = new Dictionary<Foo, string>();
test.Add(f1, "zero");

don't expect it to be the same as:

var test = new Dictionary<Foo, string>();
test.Add(f2, "zero");

even if both f1 and f2 have the same values. That has nothing to do with the deterministic behavior of the Dictionary.

Hashing is an awesome topic in computer science, my favorite to teach in data structures.

Check out Cormen and Leiserson for a high end book on red-black trees vs. hashing This guy named Bob has a great site about hashing, and optimal hashes: http://burtleburtle.net/bob

nawfal
  • 70,104
  • 56
  • 326
  • 368
Dov
  • 8,000
  • 8
  • 46
  • 75
4

The order is non-deterministic.

From here

For purposes of enumeration, each item in the dictionary is treated as a KeyValuePair structure representing a value and its key. The order in which the items are returned is undefined.

Maybe for your needs OrderedDictionary is the required.

V4Vendetta
  • 37,194
  • 9
  • 78
  • 82
  • 1
    The order is definitely undefined, but it is probably deterministic on most implementations . – Brian Jun 17 '11 at 13:35
  • The order is of course, deterministic. What you mean is that the order can change at any time if a value is inserted or deleted. That's a very different thing – Dov Jun 17 '11 at 17:50
  • If you add the same values to a dictionary, they will result in the same order in the Dictionary. That's deterministic. Otherwise, you have a horribly buggy hash table. – Dov Jun 17 '11 at 17:50
0

The class Dictionary<TKey,TValue> is implemented using an array-backed index-linked list. If no items are ever removed, the backing store will hold items in order. When an item is removed, however, the space will be marked for reuse before the array is expanded. As a consequence, if e.g. ten items are added to a new dictionary, the fourth item is deleted, a new item is added, and the dictionary is enumerated, the new item will likely appear fourth rather than tenth, but there is no guarantee that different versions of Dictionary will handle things the same way.

IMHO, it would have been helpful for Microsoft to document that a dictionary from which no items are ever deleted will enumerate items in the original order, but that once any items are deleted, any future changes to the dictionary may arbitrary permute the items therein. Upholding such a guarantee as long as no items are deleted would be relatively cheap for most reasonable dictionary implementations; continuing to uphold the guarantee after items are deleted would be much more expensive.

Alternatively, it might have been helpful to have an AddOnlyDictionary which would be thread-safe for a single writer simultaneous with any number of readers, and guarantee to retain items in sequence (note that if items are only added--never deleted or otherwise modified--one may take a "snapshot" merely by noting how many items it presently contains). Making a general-purpose dictionary thread-safe is expensive, but adding the above level of thread-safety would be cheap. Note that efficient multi-writer multi-reader usage would not require use of a reader-writer lock, but could simply be handled by having writers lock and having readers not bother to.

Microsoft didn't implement an AddOnlyDictionary as described above, of course, but it's interesting to note that the thread-safe ConditionalWeakTable has add-only semantics, probably because--as noted--it's much easier to add concurrency to add-only collections than to collections which allow deletion.

supercat
  • 77,689
  • 9
  • 166
  • 211
0

Dictionary< string, Obj>, not SortedDictionary< string, Obj >, is default to sequence by the insertion order. Strange enough you need to specifically declare a SortedDictionary to have a dictionary that sorted by key string order:

public SortedDictionary<string, Row> forecastMTX = new SortedDictionary<string, Row>();
Jenna Leaf
  • 2,255
  • 21
  • 29
  • 2
    I think it is a good and professional manner when you downvote somebody you will give an explanation even you don't like that person. Please let me know through StackOverflow message utility why you gave me a downvote. Don't just down vote me and not saying anything, keeping me in the dark what I have done wrong because this is a learning forum for everybody. – Jenna Leaf Sep 26 '16 at 16:57
0

I don't know C# or any of .NET, but the general concept of a Dictionary is that it's a collection of key-value pairs.
You don't access sequentially to a dictionary as you would when, for example, iterating a list or array.
You access by having a key, then finding whether there's a value for that key on the dictionary and what is it.
In your example you posted a dictionary with numerical keys which happen to be sequential, without gaps and in ascending order of insertion.
But no matter in which order you insert a value for key '2', you will always get the same value when querying for key '2'.
I don't know if C# permits, I guess yes, to have key types other than numbers, but in that case, it's the same, there's no explicit order on the keys.
The analogy with a real life dictionary could be confusing, as the keys which are the words, are alphabetically ordered so we can find them faster, but if they weren't, the dictionary would work anyway, because the definition of the word "Aardvark" would have the same meaning, even if it came after "Zebra". Think of a novel, in the other hand, changing the order of the pages wouldn't make any sense, as they are an ordered collection in essence.

Petruza
  • 11,744
  • 25
  • 84
  • 136