4

Disclaimer: Maybe be micro-YAGNI-optimizing but hear me out ..

The problem is to implement a case-insensitive lookup table.

  • My old-skool way: While populating the dictionary, upper-case the key before inserting. Upper-case the key when someone gives you a key to lookup.
  • The new way (I learned about it today): Dictionary takes in an IComparer implementation, so I could pass in StringComparer.InvariantCultureIgnoreCase. I think it would delegate to String.Compare(x, y, SomeIgnoreCaseEnum)

The new way has an advantage in that I don't need to ensure that a .ToUpper() is performed at each of the n places where a lookup is done against the dictionary .

My question is which one is more efficient ? Just curious I guess...

Update: Note I don't need to know the original key that was inserted. Also the keys that are used are culture agnostic.

Gishu
  • 134,492
  • 47
  • 225
  • 308
  • 1
    Write a micro-benchmark of your own (run each type 100,000 times, timing it 3 times to account for variable conditions). – Oded Apr 22 '10 at 08:55
  • One point to keep in mind. If you need the keys in their original form at some point (e.g. for printing) the first option is not optimal. – Brian Rasmussen Apr 22 '10 at 09:07
  • @Brian - Yeah I forgot to put that in. My use-case is case-insensitive and culture-agnostic. – Gishu Apr 22 '10 at 09:15
  • 1
    Being culture-agnostic, I would use `StringComparer.OrdinalIgnoreCase`. – João Angelo Apr 22 '10 at 11:31
  • Also check out this article, which shows that using a Comparer when creating a Dictionary object boosts performance. http://www.dotnetperls.com/dictionary-stringcomparer – ingredient_15939 Jul 29 '12 at 15:33

4 Answers4

3

It's possible that upper-casing would be more efficient, as it could then do an ordinal comparison... but I very much doubt that this is a performance bottleneck for you. As ever, profile before you commit to a code change on the basis of performance.

Ultimately, specifying the string comparison:

  • Means you don't need to be as careful about how you use the dictionary
  • Means the original casing is maintained in the key, which can help diagnostically in some cases
  • Explicitly says up-front how you want the keys to be handled. You end up stating it just once, at the point of creation - which leads to clearer code IMO.
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
1

Check this entry. It's still valid today.

Excerpt: From MSDN's "New Recommendations for Using Strings in Microsoft .NET 2.0"

Community
  • 1
  • 1
ntziolis
  • 10,091
  • 1
  • 34
  • 50
  • The 4th and 5th DO on that page confuse me - what are they recommending ? What does 'linguistically relevant' mean ? – Gishu Apr 22 '10 at 09:35
  • They explain it in detail a little further down: http://msdn.microsoft.com/en-us/library/ms973919.aspx#stringsinnet20_topic5 – ntziolis Apr 22 '10 at 10:21
1

I don't know about the performance but I prefer the StringComparer option. With ToUpper you lose information (the original casing). True you might not need it, but one day you might and it doesn't feel like any more work to keep it (thus safe from the YAGNI principle).

I would also one day forget to call ToUpper and be in a world of hurt. But my unit tests would save me of course.

Mike Two
  • 44,935
  • 9
  • 80
  • 96
  • exactly. I have unit tests - so forgetting it in one place isn't gonna hurt for too long. Only till the next test run. – Gishu Apr 22 '10 at 09:14
0

I'd have accepted Oded's 'Go profile!' comment :) which was a Master-of-the-obvious suggestion.

From my test (source code here)

For 1M ContainsKey lookups,

  • ToUpper approach : 1123 msecs
  • Dictionary(OrdinalIgnoreCaseComparer) : 971 msecs

I find Dictionary with an injected Comparer option to perform better and less hassle. So no more ToUpper() for me.

Gishu
  • 134,492
  • 47
  • 225
  • 308