3

A similar question to the below has been asked with specific reference to Dictionary here: Does the Enumerator of a Dictionary<TKey, TValue> return key value pairs in the order they were added? and here: Dictionary enumeration order

Reading these it is clear that the ordering of enumeration of Dictionary should not be relied upon. In line with the non-deterministic order of enumeration of a Dictionary, I recently observed unit tests failing intermittently (on a build machine) when a test project was built targeting .NET Core 3.1 (in a branch). In contrast, the same test project built targeting .NET Framework 4.7.2 (on a different branch) had no failures. These observations were over many separate unit test executions. Eventually I traced the failures to numerical operations (summation over 1/x) where the values (x's) were stored in an ImmutableDictionary keyed with a String. In the case of the unit test the order of summation affected the result. A fix has been applied to the calculations: the use of an ImmutableSortedDictionary.

A cut-down code snippet that demonstrates the different ordering of keys in ImmutableDictionary is here (compile targeting .NET Core 3.1 and execute multiple times to observe the different enumeration):

static void Main(string[] args)
{
    var dict = ImmutableDictionary<string,double>.Empty;
    for (int i = 0; i < 10; i++)
    {
        dict = dict.Add(i.ToString(),i);
    }
            
    Console.WriteLine("Keys collection: " + string.Join(", ",dict.Keys.ToList()));
    Console.WriteLine("Keys during enumeration: " +string.Join(", ", dict.Select(c => c.Key).ToList()));
}

However, as noted in answers to questions about Dictionary: "a Dictionary does return items in the same order (assuming that you don't trigger a resize of the hashtable)". Again, I understand that the current ordering behaviour should not be relied upon but it isn't clear in what situations (e.g. when using .NET Framework, .NET Standard, .NET Core) the ordering actually differs between executions. My question is:

Why does an ImmutableDictionary (in .NET Framework 4.7.2) return items in the same order between executions but an ImmutableDictionary (in .NET Core 3.1) consistently return items in a different order?

anthls
  • 504
  • 4
  • 12
  • 4
    Do you ***really*** want to know why? without digging through the source and doing comparison, I guess the simplest answer is they are different code bases. Yet above and beyond the behaviour shouldn't be relied upon as its an implementation detail (as you have seen) – TheGeneral Jul 16 '20 at 06:07
  • 6
    Every day I go into the shop and see the chocolate bars ordered alphabetically. I ask the shopkeeper does he promise to do that every day? He says no. Does it make sense to go the next day and ask why they aren't in order today? No - because he **explicitly said** he didn't promise that. **The whole point of saying order is not deterministic is to avoid you relying on it**. The danger of answering the question you are asking is you will rely on the implementation detail _which may change_. – mjwills Jul 16 '20 at 06:24
  • 2
    `a Dictionary does return items in the same order (assuming that you don't trigger a resize of the hashtable)"` It is more accurate to say `a Dictionary does currently return items in the same order (assuming that you don't trigger a resize of the hashtable) in the current implementations and runtimes, but that may change in future".` – mjwills Jul 16 '20 at 06:27
  • I have added the clarification that the ImmutableDictionary was using `String` for the keys. Thanks to @ilyatchivilev for pointing out the issue with the .NET Core hash function. – anthls Jul 16 '20 at 07:24
  • Thank you @mjwills - I hope I was clear in saying that I changed the code to ensure it does not rely on the undefined ordering (clearly best practice). However, I ask the question because others may have the same issue and I am now curious: does anyone know which versions/platforms consistently return the same ordering between executions? – anthls Jul 16 '20 at 07:53
  • @mjwills I want to know because I spent a long time trying to figure out why the tests were failing. I didn't write the test cases that were intermittently failing nor the code that was sensitive to the ordering. I searched for help and found the answers I listed in the question. They indicated that, despite enumeration order being "undefined", the order of enumeration *is* deterministic. However, when a different platform (.NET Core 3.1) was targeted different ordering was observed between executions. If I had known that at the start of investigating the failing tests it would have helped. – anthls Jul 16 '20 at 08:02
  • 1
    `They indicated that, despite enumeration order being "undefined", the order of enumeration is deterministic.` You act as if the two statements are in contradiction to each other. `undefined` doesn't mean `random`. It means `I promise nothing`. You say the order is deterministic - the order is specifically **not** guaranteed to be deterministic **according to the contract**. **If you are relying on order then you are doing it wrong**. The contract specifically says "don't do that". – mjwills Jul 16 '20 at 08:10
  • 1
    `does anyone know which versions/platforms consistently return the same ordering between executions?` The short answer is this - **none of them promise to do that**. So any tests should be written based on the contract, not observed behaviour. `Why does an ImmutableDictionary (in .NET Framework 4.7.2) return items in the same order between executions but an ImmutableDictionary (in .NET Core 3.1) consistently return items in a different order?` Because the contract does not say it _can't_ do that. If it is unordered then, by definition, it can store / order / return itself however it likes. – mjwills Jul 16 '20 at 08:19

2 Answers2

5

Because the hash function for "string" in .NET Core is non-deterministic.

The issue here depends on the key type that you're using. If you're using string for the key type (I'm making an educated guess here that that's what you're using), in .NET Core you'll run into the issue that the hash code for the same string is different on each application execution.

You can read more about it here

In .NET Framework the same strings generated the same hash codes on each execution, so their order always remained the same during enumeration.

For your situation, you could try switching to a type where either you have a deterministic hash function either by the type itself (eg int) or supplying a type with a custom hash function.

There is a follow up question though in the original question - why is it that Dictionary<string,x> enumerates deterministically, but ImmutableDictionary<string,x> enumerates non deterministically, if both are keyed on strings, and strings generate different hashes on each application execution.

The answer here is how the enumerator works for each type. For the Dictionary<TKey,TValue> in Core, there are essentially two collections, the hashes, and the entries (see the diagrams in this article). The enumeration of Dictionary uses the entries, and by and large the entries appear in the order they were added, so it has nothing to do with the hashing function. The enumerator code you can see in the custom enumerator of KeyCollection of Dictionary here.

However for the ImmutableDictionary, the enumeration follows the hashes (see the HashBucket.Enumerator that is called in the ImmutableDictionary). So in Framework, where strings hashed consistently, everything was fine, the hashes retained their order. Now in Core though, using a string key, the hashes are different on each run, they evaluate to different positions, their order is hence different.

Hope that covers it.

Ilya Tchivilev
  • 822
  • 7
  • 10
  • Or you could not rely on undefined behavior. – Aluan Haddad Jul 16 '20 at 07:31
  • This is a great answer! Thank you. However, why does it not affect `Dictionary` too? That is, if I change from `ImmutableDictionary` to `Dictionary` (with some other slight modifications) the ordering is preserved between execution. – anthls Jul 16 '20 at 07:44
  • 2
    `In .NET Framework the same strings generated the same hash codes on each execution, so their order always remained the same during enumeration.` How did you come to the conclusion that the hash impacts ordering? Is that a guarantee? _I am interested to read more about this._ – mjwills Jul 16 '20 at 07:54
  • 1
    @mjwills You can run the example code in the original question. Multiple executions of that code when compiled for .NET result in the same output each time. In .NET Core it's a different order each time. On top of this, if you try something like "a".GetHashCode() in .NET Framework, multiple application runs will give you the same hashcode. Multiple executions of that in .NET Core will give different hash codes. And lastly, if you use a type that has a deterministic hash code for the key type, the order stays the same no matter which .NET you use. – Ilya Tchivilev Jul 16 '20 at 08:02
  • Gotcha - so you are basing this on observed behaviour rather than contract? – mjwills Jul 16 '20 at 08:13
  • 3
    @mjwillis This is based on behaviour. I agree that relying on the order of a Dictionary is wrong, it's undefined by contract. But the question still stands WHY is there an implementation difference between the frameworks, and what is causing these differences. It's more of a curiosity, what makes the Framework implementation deterministic and non-deterministic in Core. It's fun to work out what's going on inside. these things. – Ilya Tchivilev Jul 16 '20 at 08:27
  • Totally agree. It's a curiosity, fun and, potentially, we may help other people if they observe the same behaviour. This may lead to them checking if their (or their team's code) adheres to the contract of undefined enumeration ordering for `Dictionary` and `ImmutableDictionary` more quickly. – anthls Jul 16 '20 at 08:45
0

Lists are deterministic in order, but dictionaries are not. So solve the issue, you can make a hybrid of the two:

public class DeterministicThing<TKey, TValue> :
    IEnumerable<KeyValuePair<TKey, TValue>>
    where TKey : notnull
{
    private readonly ImmutableList<KeyValuePair<TKey, TValue>> kvpList;
    private readonly ImmutableDictionary<TKey, TValue> kvpDictionary;

    public DeterministicThing(IEnumerable<KeyValuePair<TKey, TValue>> kvps)
    {
        kvpList = kvps.ToImmutableList();
        kvpDictionary = kvps.ToImmutableDictionary(pair => pair.Key, pair => pair.Value);
    }

    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() => kvpList.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => kvpList.GetEnumerator();
    public TValue this[TKey key] => kvpDictionary[key];

    public int Count => kvpList.Count;
}
Christian Findlay
  • 6,770
  • 5
  • 51
  • 103