1

I already read official MSDN doc and multiple SO question regarding Dictionary/Hashset and other unordered collections order issues. This question is a bit different - I'm looking for actual practical examples instead of theory

Conditions:

  • a dictionary/hastable entry CAN ONLY BE ADDED
  • a dictionary/hastable entry IS NEVER REMOVED (unless Cleared at beginning)
  • a dictionary/hastable is operating with VALUE TYPES only
  • a dictionary/hastable is never changed in any other way but Adding elements
  • a dictionary/hastable always starts at 0 and index is preserved through simple code logic
  • a dictionary key itself (the key name) is never changed manually

Code Sample:

Dictionary<int, int> test = new Dictionary<int, int>();
for (int i = 0; i <= 100000; i++)
{
    test.Add(i,i);
}
var c = 0;
while (c < test.Count)
{
    if (c != test.ElementAt(c).Key)
    {
        Console.WriteLine("Mismatch!");
        break;
    }
    c++;
}
Console.WriteLine("Test passed!");
//Program takes a 2hrs walk, does things, occasionally reads-only values from existing "test" collection
for (int i = 0; i <= 500000; i++)
{
    test[test.Count] = test.Count;
}
c = 0;
while (c < test.Count)
{
    if (c != test.ElementAt(c).Key)
    {
        Console.WriteLine("Mismatch!");
        break;
    }
    c++;
}
Console.WriteLine("Test 2 passed!");
//repeat some stuff 

Question: UNDER THE GIVEN STRICT RULES ABOVE - has anyone actually encountered a case when this kind of Dictionary randomly changed order of its elements? We aren't talking about MT Concurrent Collection here and again, I'm not interested in theory answers, I've read them.

A single example when test.ElementAt(5).key would return key being 11 or 115 is what I'm looking for.

dbc
  • 104,963
  • 20
  • 228
  • 340
Digika
  • 127
  • 2
  • 11
  • Well the source code for `Dictionary` is available [here](http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,3b9a0882313262cd). If you look at [`Enumerator.MoveNext()`](http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,d864a2277aad4ece,references) you will see that it returns the `dictionary.entries` in order. – dbc Dec 12 '17 at 20:03
  • And if you look at [`private void Insert(TKey key, TValue value, bool add)`](http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,fd1acf96113fbda9,references) you will see that *as long as there are no free entries* then added items are placed at the end of the `entries` array, and that rehashing does not change this order. So it appears that **with this implementation and as long as nothing is removed** the dictionary preserves the order in which things are added. – dbc Dec 12 '17 at 20:04
  • **But this is just an implementation detail.** The version of `Dictionary` on mono (or some future .Net version, say .Net core 3.5 or .Net full 5.2 or whatever) could be rewritten so that rehashing the dictionary changes the order. The only promise the [documentation](https://msdn.microsoft.com/en-us/library/xfhwa508(v=vs.110).aspx) makes is that *The order in which the items are returned is undefined* so it would be unwise to rely on anything more. – dbc Dec 12 '17 at 20:07
  • So, given what you have said now, tell me if this statement is true: `"You can rely on Dictionary order under given restrictions to solve single current task at hand right now with its current implementation".` Not talking about long-term implementation that may break when DT/HT implementation breaks/changes. Also, post as an answer. – Digika Dec 12 '17 at 20:30

1 Answers1

2

Well the source code for Dictionary<TKey, TValue> is available here. If you look at Enumerator.MoveNext() you will see that it iterates through the array dictionary.entries in order:

while ((uint)index < (uint)dictionary.count) {
    if (dictionary.entries[index].hashCode >= 0) {
        current = new KeyValuePair<TKey, TValue>(dictionary.entries[index].key, dictionary.entries[index].value);
        index++;
        return true;
    }
    index++;
}

And if you look at private void Insert(TKey key, TValue value, bool add) (which is called when both adding and setting an item) you will see that as long as there are no free entries (i.e. nothing has been removed) then items are placed at the end of the entries array if the key is not found, or at the current location if the key is found:

private void Insert(TKey key, TValue value, bool add) {    
// Snip 
    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
    int targetBucket = hashCode % buckets.Length;

// Snip 
    for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
        if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
// Key found.  Set the new value at the old location in the entries array.
            if (add) { 
                ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
            }
            entries[i].value = value;
            version++;
            return;
        }  
        // Snip 
    }
// Key not found, add to the dictionary.
    int index;
    if (freeCount > 0) {
// Free entries exist because items were previously removed; recycle the position in the entries array.
        index = freeList;
        freeList = entries[index].next;
        freeCount--;
    }
    else {
// No free entries.  Add to the end of the entries array, resizing if needed.
        if (count == entries.Length)
        {
            Resize();
            targetBucket = hashCode % buckets.Length;
        }
        index = count;
        count++;
    }

// Set the key and value in entries array.
    entries[index].hashCode = hashCode;
    entries[index].next = buckets[targetBucket];
    entries[index].key = key;
    entries[index].value = value;
    buckets[targetBucket] = index;
// Snip remainder

Finally, if you check Resize(), the method called to rehash the dictionary, you will see that the order of the entries array is preserved:

    Array.Copy(entries, 0, newEntries, 0, count);

Thus we can say that with this implementation and as long as nothing is removed the dictionary preserves the order in which keys are added.

But this is just an implementation detail. The version of Dictionary<TKey, TValue> on mono (or some future .Net version, say .Net core 3.5 or .Net full 5.2 or whatever) could be rewritten so that rehashing the dictionary changes the order. The only promise the documentation makes is that The order in which the items are returned is undefined so it would be unwise to rely on anything more.

dbc
  • 104,963
  • 20
  • 228
  • 340