17

I have a concern about generic dictionaries using enums for keys.

As stated at the below page, using enums for keys will allocate memory: http://blogs.msdn.com/b/shawnhar/archive/2007/07/02/twin-paths-to-garbage-collector-nirvana.aspx

I've tested and confirmed the behavior, and it's causing problems in my project. For readability, I believe using enums for keys is very useful, and the optimal solution for me would be to write write a class implementing IDictionary<TKey, TValue>, which would use integers for keys internally. The reason is I don't want to change all my existing dictionaries to use integers for keys, and do implicit casting. This would be best performance wise, but it will give me lot of work initially and it will reduce the readability.

So I've tried a couple of approaches, including using GetHashCode (which unfortunately allocates memory) to build an internal Dictionary<int, TValue>.

So, to wrap it up in one question; can anyone think of a solution that I can use to keep the readability of Dictionary<SomeEnum, TValue>, while having the perfomance of a Dictionary<int, TValue>?

Any advice much appreciated.

Mark Jansen
  • 1,491
  • 12
  • 24
Magnus Andersson
  • 234
  • 1
  • 2
  • 10
  • 4
    That sounds like premature-optimization to me. What kind of application are you developing? – Tim Schmelter Oct 09 '14 at 14:20
  • 3
    I can't imagine how you could possibly do any better than an `enum` here. It will have identical performance implications to using an integer (assuming it's backed by one, which it will be by default). – Servy Oct 09 '14 at 14:20
  • Its mobile gaming, and it's causing lag spikes due to garbage collection. – Magnus Andersson Oct 09 '14 at 14:24
  • "It will have identical performance implications to using an integer". That is not true unfortunately, I've tested and confirmed. Using enums allocates memory when getting items from the dictionary. – Magnus Andersson Oct 09 '14 at 14:25
  • His statement about enums sounds like something you would say about Java, not .Net. .Net generics does not work like Java's. – Jeff Mercado Oct 09 '14 at 14:25
  • 2
    @MagnusAndersson An enum is just syntactic sugar for using an integer. It doesn't surprise me in the least that some amount of memory is used in getting the key from a dictionary. You're going to use some amount of memory to do anything, ever. Computers run on memory. You're simply not going to get better than what you have. If you do in fact have a measurable problem in your application, it's almost certainly not because you choose to use an `Enum` and not an integer as a key for a dictionary. – Servy Oct 09 '14 at 14:28
  • " GetHashCode (which unfortunately allocates memory) " - can you eloborate here? How does `GetHashCode` on an enum allocate memory? – D Stanley Oct 09 '14 at 14:30
  • @DStanley Well there's the stack frame for the method, and any locals (implicit and explicit) that it uses, to which I'm sure there are at least a few. – Servy Oct 09 '14 at 14:32
  • 3
    @Magnus, please show a simple program that demonstrates your contention that enums perform differently than ints. – Kirk Woll Oct 09 '14 at 14:34
  • @Servy Again, I've tested and confirmed that having an int for the key allocates 0 memory, while having an integer based enum allocates. – Magnus Andersson Oct 09 '14 at 14:36
  • I probably should have mentioned this is using the Mono framework, might be of interest... – Magnus Andersson Oct 09 '14 at 14:38
  • 1
    @Magnus, and *how* are you determining that "an int allocates 0 memory"? Please show us code that demonstrates this. – Kirk Woll Oct 09 '14 at 14:38
  • I'm using the internal memory profiler of the game engine we're using, which states that DefaultComparer.Equals() and DefaultComparer.GetHashCode() are both being called and allocates memory while getting an item from en enum-based dictionary. – Magnus Andersson Oct 09 '14 at 14:43
  • The corresponding call using an int-based dictionary reports 0 memory footprint. I'm not the one to guarantee the profiler is right, but I have no reason do doubt it. – Magnus Andersson Oct 09 '14 at 14:44
  • Show us your enum, your enum could be of type int or uint. So just cast your enum to int/uint on insert or access to your dictionary. problem solved. – user743414 Oct 09 '14 at 14:45
  • @MagnusAndersson Considering an `Enum` is nothing but compile time wrapping for an integer, you have *every* reason to doubt it. The issue is almost certainly with your testing, rather than with the code. – Servy Oct 09 '14 at 14:45
  • The problem with creating your own `IDictionary` where `S` is an enum type is that it's kind of difficult to [restrict a generic type to a enum](http://stackoverflow.com/questions/79126/create-generic-method-constraining-t-to-an-enum). If you have only a handful of enums to worry about, it might be worth your time to create a (bunch of) `MyEnumDictionary : IDictionary` for each enum. – Matt Burland Oct 09 '14 at 14:46
  • 1
    @MattBurland Yes, the fact that you're using Mono is extremely relevant. Looking at [`Comparer.cs`](https://github.com/mono/mono/blob/master/mcs/class/corlib/System.Collections.Generic/Comparer.cs), its implementation of `Comparer.Default` is quite a poor one. –  Oct 09 '14 at 14:50
  • @hvd Sorry for not stating that at the beginning. – Magnus Andersson Oct 09 '14 at 14:52
  • @Servy The lag spike caused by the garbage collection (caused by the dictionary look-up) is visible to the naked eye, and goes away when using int keys. Not sure how much more obvious it can be for me here, sorry you dont trust my judgement saying it's from that. – Magnus Andersson Oct 09 '14 at 14:55

4 Answers4

41

The problem is boxing. It's an act of turning value type into object, which might, or might not be unnecessary.

The way Dictionarycompares keys, is essentially, that it will use EqualComparer<T>.Default, and call GetHashCode() to find correct bucket, and Equals to compare if there's any value in the bucket that is equal tot he one we're looking for.

The good thing is this: .NET framework has good optimizations, which avoid boxing in the case of "Enum integers". See CreateComparer(). It's highly unlikely that you will see any difference here, between integers and enums, as keys.

To note here: this is not an easy act, in fact, if you dig in deep, you'll come to conclusion that quarter of this battle is implemented through CLR "hacks". As seen here:

   static internal int UnsafeEnumCast<T>(T val) where T : struct    
    {
        // should be return (int) val; but C# does not allow, runtime 
        // does this magically
        // See getILIntrinsicImplementation for how this happens.  
        throw new InvalidOperationException();
    }

It could be definitely easier if generics had Enum constraint, and perhaps even something a long of the lines UnsafeEnumCast<T>(T val) where T : Enum->Integer, but well... they don't.

You might be wondering, what exactly is going on in getILIntrinsicImplementation for that EnumCast? I wonder too. Not exactly sure as of this right moment how to check it. It's replaced on run-time with specific IL code I believe?!

MONO

Now, answer to your question: yes you're right. Enum as a key on Mono, will be slower in a tight loop. It's because Mono does boxing on Enums, as far I can see. You can check out EnumIntEqualityComparer, as you can see, it calls Array.UnsafeMov that basically casts a type of T into integer, through boxing: (int)(object) instance;. That's the "classical" limitation of generics, and there is no nice solution for this problem.

Solution 1

Implement an EqualityComparer<MyEnum> for your concrete Enum. This will avoid all the casting.

public struct MyEnumCOmparer : IEqualityComparer<MyEnum>
{
    public bool Equals(MyEnum x, MyEnum y)
    {
        return x == y;
    }

    public int GetHashCode(MyEnum obj)
    {
        // you need to do some thinking here,
        return (int)obj;
    }
}

All you need to do then, is pass it to your Dictionary:

new Dictionary<MyEnum, int>(new MyEnumComparer());

It works, it gives you the same performance as it is with integers, and avoids boxing issues. The problem is though, this is not generic and writing this for each Enum can feel stupid.

Solution 2

Writing a generic Enum comparer, and using few tricks that avoids unboxing. I wrote this with a little help from here,

// todo; check if your TEnum is enum && typeCode == TypeCode.Int
struct FastEnumIntEqualityComparer<TEnum> : IEqualityComparer<TEnum> 
    where TEnum : struct
{
    static class BoxAvoidance
    {
        static readonly Func<TEnum, int> _wrapper;

        public static int ToInt(TEnum enu)
        {
            return _wrapper(enu);
        }

        static BoxAvoidance()
        {
            var p = Expression.Parameter(typeof(TEnum), null);
            var c = Expression.ConvertChecked(p, typeof(int));

            _wrapper = Expression.Lambda<Func<TEnum, int>>(c, p).Compile();
        }
    }

    public bool Equals(TEnum firstEnum, TEnum secondEnum)
    {
        return BoxAvoidance.ToInt(firstEnum) == 
            BoxAvoidance.ToInt(secondEnum);
    }

    public int GetHashCode(TEnum firstEnum)
    {
        return BoxAvoidance.ToInt(firstEnum);
    }
}

Solution 3

Now, there's a little problem with the solution#2, as Expression.Compile() is not that famous on iOS(no runtime code generation), and some mono versions don't have ?? Expression.Compile ?? (not sure).

You can write simple IL code that will take care of the enum conversion, and compile it.

.assembly extern mscorlib
{
  .ver 0:0:0:0
}
.assembly 'enum2int'
{
  .hash algorithm 0x00008004
  .ver  0:0:0:0
}

.class public auto ansi beforefieldinit EnumInt32ToInt
    extends [mscorlib]System.Object
{
    .method public hidebysig static int32  Convert<valuetype 
        .ctor ([mscorlib]System.ValueType) TEnum>(!!TEnum 'value') cil managed
    {
      .maxstack  8
      IL_0000:  ldarg.0
      IL_000b:  ret
    }
} 

In order to compile it into an assembly, you have to call:

ilasm enum2int.il /dll where enum2int.il is the text file containing IL.

You can now reference the given assembly(enum2int.dll) and call the static method, as such:

struct FastEnumIntEqualityComparer<TEnum> : IEqualityComparer<TEnum> 
    where TEnum : struct
{
    int ToInt(TEnum en)
    {
        return EnumInt32ToInt.Convert(en);
    }

    public bool Equals(TEnum firstEnum, TEnum secondEnum)
    {
        return ToInt(firstEnum) == ToInt(secondEnum);
    }

    public int GetHashCode(TEnum firstEnum)
    {
        return ToInt(firstEnum);
    }
}

It might seem to be killer code, but it avoids boxing, and it should give you better berformance on Mono.

Community
  • 1
  • 1
Erti-Chris Eelmaa
  • 25,338
  • 6
  • 61
  • 78
  • Thanks, nice suggestion. I guess there is no way of having this done generally, without having to write a comparer for each enum? – Magnus Andersson Oct 09 '14 at 15:00
  • I updated the post. If mono does not support that, you can write one generic class, and use that, instead of writing class for each enum: http://referencesource.microsoft.com/#mscorlib/system/collections/generic/equalitycomparer.cs#0f1674ddefef561e#references – Erti-Chris Eelmaa Oct 09 '14 at 15:09
  • findings after testing your suggestion is that memory footprint is reduced by 2/3, but to get it to 0 I need to cast the obj to an integer in GetHashCode(MyEnum obj) before getting the hash code. unfortunately that removes the option of writing a generic comparer, as we cant constrain to enums of type int. any ideas? – Magnus Andersson Oct 10 '14 at 08:22
  • I don't have benchmark capabilities available right now, but I believe the solution#2 is as close as you can possibly get. – Erti-Chris Eelmaa Oct 10 '14 at 11:22
  • Testing of solution 2 results in an "ArgumentException: method return type is incompatible" from the CreateDelegate call. I'm still digesting whats going on in this comparer, you have an idea how to fix the issue? The enum I'm using is definitely of integer type, so I dont know why I get the error. – Magnus Andersson Oct 10 '14 at 14:17
  • Yeah, that's what I thought. The `CreateDelegate` seemed fishy to me, although it worked. I've updated the solution#2, which uses another method, and which imo is better. Let me know how it goes, and I might even try to get this change integrated to Mono :p – Erti-Chris Eelmaa Oct 10 '14 at 14:59
  • As is, solution #2 doesn't compile in my Mono version. I added an empty string to Expression.Parameter(typeof(TEnum)) to work around it, and tested. 0 memory allocation! – Magnus Andersson Oct 10 '14 at 15:06
  • The #2 solution above works very well using Monotouch for Android and PC standalone. However, due to limitations on iOS (no JIT compile) it does not run on iOS. Can anyone think of a similar solution that would actually work on iOS? – Magnus Andersson Oct 21 '14 at 08:32
  • @MagnusAndersson: Mono has what is known as AOT, but it currently doesn't work with generics, that's why it fails. http://www.mono-project.com/docs/advanced/aot/ To actually create a solution that should work theoretically on all the platforms, is to write code in IL and use ilasm. See solution#3 – Erti-Chris Eelmaa Oct 21 '14 at 09:47
  • Solution #3 did work on all platforms, extremely useful. – Magnus Andersson Oct 23 '14 at 14:54
  • Why does the .NET source need to do those unsafe casts in its comparers (UnsafeEnumCast(x)) if we can just do x == y for our custom comparer? – frontsidebus Jul 23 '16 at 08:39
  • Also the link to the mono source is dead - it shows 404 page on github. – frontsidebus Jul 23 '16 at 08:51
  • .Net core also handles this. See [ComparerHelpers.cs](https://github.com/dotnet/coreclr/blob/84092471c678da00bb59d86fb5fcd0890a524472/src/mscorlib/src/System/Collections/Generic/ComparerHelpers.cs#L91). – zwcloud Apr 14 '17 at 07:09
  • @Erti-ChrisEelmaa I have upvoted this answer for general reasons, but I do not see the reason for struct implementations of IEqualityComparer as it is handed to dictionary in constructor and boxing has to happen... – ipavlu Nov 15 '17 at 01:11
  • @ipavlu boxing can happen in constructor, that is fine. The idea of my post is that in order to speed up things, boxing should not happen when .Equals() method is called by dictionary internally. It does not happen on the custom struct. – Erti-Chris Eelmaa Nov 16 '17 at 15:30
  • @Erti-ChrisEelmaa Exactly the reason, why I upvoted as the primary part of the answer is correct. I just do not agree with this implicit boxing of IEqualityComparer for Dictionary constructor. Making IEqualityComparer struct in the first place brings no advantage, just complications and hides something to the developer. Why the develper should spent time thinking that it was boxed and as result he/she can – ipavlu Nov 17 '17 at 23:50
  • @Erti-ChrisEelmaa Damned edit rules, 5min timer took over:). What if you make a high number of these dictionaries? If it is a class, you can not miss its construction. If it is a struct, you can easily miss it. Making things explicit is an easy way of life, especially when the code must be read by other developers and when the extravagant choice brings only confusion and possible problems during refactoring or development in large teams... – ipavlu Nov 18 '17 at 00:00
1

I ran into this same problem a while back and ended up incorporating it into a library I wrote of generic enum extension and helper methods (it's written in C++/CLI (compiled AnyCPU) because C# doesn't allow creation of type constraints for enum types). It's available under the Apache 2.0 license on NuGet and GitHub

You can implement it in a Dictionary by grabbing the IEqualityComparer from the static Enums type in the library:

var equalityComparer = Enums.EqualityComparer<MyEnum>();
var dictionary = new Dictionary<MyEnum, MyValueType>(equalityComparer);

The values are handled without boxing, using a technique similar to the UnsafeEnumCast mentioned in one of the answers already provided (covered to death in tests since it is unsafe). As a result, it's very fast (since that would be the only point of replacing an equality comparer in this case). A benchmarking app is included as well as recent results generated from my build PC.

mdip
  • 600
  • 4
  • 10
1

Enums as dictionary keys now have the same or better performance as int dictionary keys. I measured this using NUnit:

public class EnumSpeedTest
{
    const int Iterations = 10_000_000;

    [Test]
    public void WasteTimeInt()
    {
        Dictionary<int, int> dict = new Dictionary<int, int>();
        for (int i = 0; i < Iterations; i++)
            dict[i] = i;
        long sum = 0;
        for (int i = 0; i < Iterations; i++)
            sum += dict[i];
        Console.WriteLine(sum);
    }

    enum Enum { Zero = 0, One = 1, Two = 2, Three = 3 }

    [Test]
    public void WasteTimeEnum()
    {
        Dictionary<Enum, int> dict = new Dictionary<Enum, int>();
        for (int i = 0; i < Iterations; i++)
            dict[(Enum)i] = i;
        long sum = 0;
        for (int i = 0; i < Iterations; i++)
            sum += dict[(Enum)i];
        Console.WriteLine(sum);
    }
}

The time taken by these two tests on my Ryzen 5 PC in a .NET 5.0 Release build is consistently around 300ms, and the enum version is slightly faster on most runs.

Qwertie
  • 16,354
  • 20
  • 105
  • 148
0

In later .net versions (tested for .NET7), performance is the same as using int as key. Using an IEqualityComparer struct as the Dictionary's constructor argument even make performance worse. For reference, here is some code that shows some alternatives and corresponding performances. The code uses BenchmarkDotNet framwork.

public enum MyEnum { Zero = 0, One = 1, Two = 2, Three = 3, Four = 4, Five = 5, Six = 6, Seven = 7, Eight = 8, Nine = 9, Ten = 10 }

public class DictionaryBenchmark
{
    const int count = 100;


    [Benchmark]
    public void Int()
    {
        Dictionary<int, int> dict = new Dictionary<int, int>();
        dict[0] = 0;
        dict[1] = 1;
        dict[2] = 2;
        dict[3] = 3;
        dict[4] = 4;
        dict[5] = 5;
        dict[6] = 6;
        dict[7] = 7;
        dict[8] = 8;
        dict[9] = 9;
        dict[10] = 10;

        for (int i = 0; i < count; i++)
        {
            long sum = dict[0] +
                       dict[1] +
                       dict[2] +
                       dict[3] +
                       dict[4] +
                       dict[5] +
                       dict[6] +
                       dict[7] +
                       dict[8] +
                       dict[9] +
                       dict[10];
        }

    }


    [Benchmark]
    public void Enum()
    {
        Dictionary<MyEnum, int> dict = new Dictionary<MyEnum, int>();

        dict[MyEnum.Zero] = 0;
        dict[MyEnum.One] = 1;
        dict[MyEnum.Two] = 2;
        dict[MyEnum.Three] = 3;
        dict[MyEnum.Four] = 4;
        dict[MyEnum.Five] = 5;
        dict[MyEnum.Six] = 6;
        dict[MyEnum.Seven] = 7;
        dict[MyEnum.Eight] = 8;
        dict[MyEnum.Nine] = 9;
        dict[MyEnum.Ten] = 10;

        for (int i = 0; i < count; i++)
        {
            long sum = dict[MyEnum.Zero] +
                       dict[MyEnum.One] +
                       dict[MyEnum.Two] +
                       dict[MyEnum.Three] +
                       dict[MyEnum.Four] +
                       dict[MyEnum.Five] +
                       dict[MyEnum.Six] +
                       dict[MyEnum.Seven] +
                       dict[MyEnum.Eight] +
                       dict[MyEnum.Nine] +
                       dict[MyEnum.Ten];
        }

    }

    struct MyEnumComparer : IEqualityComparer<MyEnum>
    {
        public bool Equals(MyEnum x, MyEnum y)
        {
            return x == y;
        }

        public int GetHashCode(MyEnum obj)
        {
            return (int)obj;
        }
    }

    [Benchmark]
    public void EqualityComparer()
    {
        Dictionary<MyEnum, int> dict = new Dictionary<MyEnum, int>(new MyEnumComparer());

        dict[MyEnum.Zero] = 0;
        dict[MyEnum.One] = 1;
        dict[MyEnum.Two] = 2;
        dict[MyEnum.Three] = 3;
        dict[MyEnum.Four] = 4;
        dict[MyEnum.Five] = 5;
        dict[MyEnum.Six] = 6;
        dict[MyEnum.Seven] = 7;
        dict[MyEnum.Eight] = 8;
        dict[MyEnum.Nine] = 9;
        dict[MyEnum.Ten] = 10;

        for (int i = 0; i < count; i++)
        {
            long sum = dict[MyEnum.Zero] +
                       dict[MyEnum.One] +
                       dict[MyEnum.Two] +
                       dict[MyEnum.Three] +
                       dict[MyEnum.Four] +
                       dict[MyEnum.Five] +
                       dict[MyEnum.Six] +
                       dict[MyEnum.Seven] +
                       dict[MyEnum.Eight] +
                       dict[MyEnum.Nine] +
                       dict[MyEnum.Ten];
        }
    }
    [Benchmark]
    public void Switch()
    {
        // dummy code to make benchmark more fair
        Dictionary<MyEnum, int> dict = new Dictionary<MyEnum, int>();

        dict[MyEnum.Zero] = 0;
        dict[MyEnum.One] = 1;
        dict[MyEnum.Two] = 2;
        dict[MyEnum.Three] = 3;
        dict[MyEnum.Four] = 4;
        dict[MyEnum.Five] = 5;
        dict[MyEnum.Six] = 6;
        dict[MyEnum.Seven] = 7;
        dict[MyEnum.Eight] = 8;
        dict[MyEnum.Nine] = 9;
        dict[MyEnum.Ten] = 10;
        // end of dummy code

        for (int i = 0; i < count; i++)
        {
            long sum = GetIntFromEnum(MyEnum.Zero) +
                       GetIntFromEnum(MyEnum.One) +
                       GetIntFromEnum(MyEnum.Two) +
                       GetIntFromEnum(MyEnum.Three) +
                       GetIntFromEnum(MyEnum.Four) +
                       GetIntFromEnum(MyEnum.Five) +
                       GetIntFromEnum(MyEnum.Six) +
                       GetIntFromEnum(MyEnum.Seven) +
                       GetIntFromEnum(MyEnum.Eight) +
                       GetIntFromEnum(MyEnum.Nine) +
                       GetIntFromEnum(MyEnum.Ten);
        }

    }

    private int GetIntFromEnum(MyEnum fromMyEnum)
    {
        return fromMyEnum switch
        {
            MyEnum.Zero => 0,
            MyEnum.One => 1,
            MyEnum.Two => 2,
            MyEnum.Three => 3,
            MyEnum.Four => 4,
            MyEnum.Five => 5,
            MyEnum.Six => 6,
            MyEnum.Seven => 7,
            MyEnum.Eight => 8,
            MyEnum.Nine => 9,
            MyEnum.Ten => 10,
            _ => throw new ArgumentOutOfRangeException(nameof(fromMyEnum), fromMyEnum, null)
        };
    }
    

    [Benchmark]
    public void String()
    {
        Dictionary<string, int> dict = new Dictionary<string, int>();

        dict["Zero"] = 0;
        dict["One"] = 1;
        dict["Two"] = 2;
        dict["Three"] = 3;
        dict["Four"] = 4;
        dict["Five"] = 5;
        dict["Six"] = 6;
        dict["Seven"] = 7;
        dict["Eight"] = 8;
        dict["Nine"] = 9;
        dict["Ten"] = 10;

        for (int i = 0; i < count; i++)
        {
            long sum = dict["Zero"] +
                       dict["One"] +
                       dict["Two"] +
                       dict["Three"] +
                       dict["Four"] +
                       dict["Five"] +
                       dict["Six"] +
                       dict["Seven"] +
                       dict["Eight"] +
                       dict["Nine"] +
                       dict["Ten"];
        }
    }
}

Benchmarking results:

Method Mean Error StdDev
Int 2.385 us 0.0443 us 0.0455 us
Enum 2.502 us 0.0415 us 0.0388 us
EqualityComparer 7.701 us 0.0916 us 0.0765 us
Switch 2.072 us 0.0271 us 0.0253 us
String 6.765 us 0.1316 us 0.1293 us