7

I have a question about generic collections in C#. If I need to store a collection of items, and I'm frequently going to need to check whether an item is in the collection, would it be faster to use Dictionary instead of List?

I've heard that checking if an item is in the collection is linear relative to the size for lists and constant relative to the size for dictionaries. Is using Dictionary and then setting Key and Value to the same object for each key-value pair something that other programmers frequently do in this situation?

Thanks for taking the time to read this.

Newell Clark
  • How many items are in the list? If you have 100, this would be premature optimization, and it doesn't matter. – Erik Philips May 15 '12 at 20:46
  • You're using `Dictionary` like a `HashSet`, which should technically be faster, but you should compare them using `Stopwatch` either way. – BeemerGuy May 15 '12 at 20:47
  • Duplicate. See http://stackoverflow.com/questions/1009107/what-net-collection-provides-the-fastest-search. – JamieSee May 15 '12 at 20:53

6 Answers

5

Yes, yes it is. That said, you probably want to use HashSet because you don't need both a key and a value, you just need a set of items.

It's also worth noting that Dictionary was added in .NET 2.0 and HashSet in .NET 3.5, so in between it was actually fairly common to use a Dictionary when you wanted a set, just because that was all you had (short of rolling your own). When I was forced to do this I just stuck null in the value, rather than using the item as both key and value, but the idea is the same.
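To make the workaround concrete, here is a minimal sketch of the Dictionary-as-set trick next to the HashSet equivalent. The `SetDemo` class, the `BuildDictionarySet` helper, and the sample names are made up for illustration and are not from any library.

```csharp
using System;
using System.Collections.Generic;

public static class SetDemo
{
    // Hypothetical helper: the keys act as the set, the values are never read.
    public static Dictionary<string, object> BuildDictionarySet(IEnumerable<string> items)
    {
        var set = new Dictionary<string, object>();
        foreach (var item in items)
        {
            // The indexer overwrites an existing key, so duplicates are harmless.
            set[item] = null;
        }
        return set;
    }

    public static void Main()
    {
        var names = new[] { "alice", "bob", "alice" };

        var dictSet = BuildDictionarySet(names);   // the old workaround
        var hashSet = new HashSet<string>(names);  // the purpose-built type

        Console.WriteLine(dictSet.ContainsKey("bob"));  // True
        Console.WriteLine(hashSet.Contains("bob"));     // True
        Console.WriteLine(hashSet.Contains("carol"));   // False
        Console.WriteLine(dictSet.Count);               // 2
        Console.WriteLine(hashSet.Count);               // 2
    }
}
```

Both versions answer the membership question the same way; the HashSet just says what you mean and skips the unused value slot.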

Servy
  • Thanks guys! This is just what I needed to know. I was pretty sure that there was a cleaner, neater way of doing this than using Dictionary. – Newell Clark May 15 '12 at 20:50
5

Just use HashSet<Foo> if what you're concerned with is fast containment tests.

A Dictionary<TKey, TValue> is for looking a value up based on a key.

A List<T> is for random access and dynamic growth properties.

A HashSet<T> is for modeling a set and providing fast containment tests.

You're not looking up a value based on a key. You're not worried about random access, but rather fast containment checks. The right concept here is a HashSet<T>.
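As a rough illustration of those three roles, here is a small sketch. The color data and the `CollectionRoles` class name are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionRoles
{
    public static void Main()
    {
        // List<T>: ordered, indexable, growable. Contains scans the elements.
        var list = new List<string> { "red", "green" };
        list.Add("blue");
        Console.WriteLine(list[2]);                 // blue
        Console.WriteLine(list.Contains("green"));  // True

        // Dictionary<TKey, TValue>: look a value up by its key.
        var hex = new Dictionary<string, string>
        {
            { "red", "#FF0000" },
            { "green", "#00FF00" }
        };
        Console.WriteLine(hex["red"]);              // #FF0000

        // HashSet<T>: membership only, no associated value.
        var known = new HashSet<string>(list);
        Console.WriteLine(known.Contains("green")); // True
        Console.WriteLine(known.Contains("pink"));  // False
    }
}
```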

jason
5

Assuming that there is only ever one copy of the item in the list, then the appropriate data structure is ISet<T>, specifically HashSet<T>.

That said, I've seen timings that indicate that a Dictionary<TKey, TValue> ContainsKey call is a wee bit faster than even HashSet<T>. Either way, both of them are going to be loads faster than a plain List<T> lookup.

Keep in mind that both of these (HashSet and Dictionary) rely on reasonably well-implemented Equals and GetHashCode implementations for T. List<T> relies only on Equals.
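For a custom type, that might look something like the sketch below. `Point` is a hypothetical class invented for this example, and the hash formula is just one simple choice; the only hard requirement is that objects that compare equal return the same hash code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value-like class used only for illustration.
public sealed class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Two points are equal when both coordinates match.
    public bool Equals(Point other)
    {
        return other != null && X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Point);
    }

    // Must agree with Equals: equal points always produce the same hash.
    public override int GetHashCode()
    {
        return (X * 397) ^ Y;
    }
}

public static class PointDemo
{
    public static void Main()
    {
        var set = new HashSet<Point> { new Point(1, 2) };
        var list = new List<Point> { new Point(1, 2) };

        // A different instance with the same coordinates is still found.
        Console.WriteLine(set.Contains(new Point(1, 2)));   // True
        Console.WriteLine(list.Contains(new Point(1, 2)));  // True
        Console.WriteLine(set.Contains(new Point(2, 1)));   // False
    }
}
```

Without the overrides, a class falls back to reference equality, and both Contains calls with a fresh `new Point(1, 2)` would return False.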

Chris Shain
3

A Dictionary or HashSet will use more memory, but provides (almost) O(1) lookup time.
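If you want to see the difference on your own machine, a crude Stopwatch comparison along these lines is one way to do it. The sizes are arbitrary and `Measure` is a hypothetical helper, so treat this as a sketch rather than a rigorous benchmark (no warm-up, a single run).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LookupTiming
{
    // Hypothetical helper: runs `probes` lookups and reports elapsed time.
    // Hits are counted so the lookups have an observable result.
    public static TimeSpan Measure(Func<int, bool> contains, int probes, out int hits)
    {
        hits = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < probes; i++)
        {
            // Even numbers only, so roughly half the probes miss.
            if (contains(i * 2)) hits++;
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int size = 20000;    // arbitrary collection size
        const int probes = 20000;  // arbitrary number of lookups

        List<int> list = Enumerable.Range(0, size).ToList();
        var set = new HashSet<int>(list);

        int listHits, setHits;
        TimeSpan listTime = Measure(x => list.Contains(x), probes, out listHits);
        TimeSpan setTime = Measure(x => set.Contains(x), probes, out setHits);

        Console.WriteLine("List<int>:    " + listTime.TotalMilliseconds + " ms, hits = " + listHits);
        Console.WriteLine("HashSet<int>: " + setTime.TotalMilliseconds + " ms, hits = " + setHits);
    }
}
```

The hit counts should match for both collections; the timings depend on your hardware and runtime, and for very small collections the gap may not be worth caring about.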

Dave Bish
2

You might want to look at HashSet, which is a collection of unique objects (uniqueness is decided by the object's Equals and GetHashCode, or by an IEqualityComparer<T> you pass to the constructor).

saluce
1

You mention using List<T>, which implies that ordering may be important. If that is the case, you may also want to look into the SortedSet<T> type.
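A quick sketch of what that looks like, with made-up values. Note that SortedSet<T> keeps its elements in sorted order (by the default comparer here), not in insertion order, so it only helps if sorted order is the ordering you care about.

```csharp
using System;
using System.Collections.Generic;

public static class SortedSetDemo
{
    public static void Main()
    {
        // Duplicates are ignored; elements are kept in ascending order.
        var set = new SortedSet<int> { 5, 1, 3, 1 };

        Console.WriteLine(string.Join(", ", set));  // 1, 3, 5
        Console.WriteLine(set.Contains(3));         // True
        Console.WriteLine(set.Min);                 // 1
        Console.WriteLine(set.Max);                 // 5
    }
}
```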

Andrew Hanlon