0

So I've found myself writing code along these lines lately.

Dictionary<string, byte> dict = new Dictionary<string, byte>();

foreach(string str in arbitraryStringCollection)
{
  if(!dict.ContainsKey(str))
  {
    ProcessString(str);
    dict[str] = 0;
  }
}

The example is overly generic, but the common goal I find myself shooting for is "Have I done this one already?".

I like using Dictionary for the fast key lookup, but since I never care about the value field, I can't help but feel it's slightly excessive, even if it's just a byte per entry.

Is there a better .NET tool out there that accomplishes this, something with the key lookup speed of a Dictionary but without the arbitrary and unnecessary values?

Nikolay Kostov
  • If you never need the value in a key-value pair, why don't you use a data structure like a `List` (duplicates) or `HashSet` (no duplicates)? – Jeroen Vannevel Mar 25 '14 at 08:37
  • Wouldn't something like [`HashSet`](http://msdn.microsoft.com/en-us/library/bb359438.aspx) work for you? – npinti Mar 25 '14 at 08:37
  • This is difficult to put into context, but I'd imagine there's something that allows the same value to be used as an input many times over. If you could fix that, you probably wouldn't even need additional checks like the one in your code sample. – Steven Liekens Mar 25 '14 at 08:47

2 Answers

4

You should use HashSet<T>.

HashSet<string> hashSet = new HashSet<string>();

foreach(string str in arbitraryStringCollection)
{
    if(!hashSet.Contains(str))
    {
        ProcessString(str);
        hashSet.Add(str);
    }
}

To make it shorter:

foreach(string str in arbitraryStringCollection)
{
    if(hashSet.Add(str)) ProcessString(str);
}
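
The shorter loop works because `HashSet<T>.Add` returns `true` when the element was added and `false` when it was already in the set. Below is a minimal, self-contained sketch of that behaviour. The class name `SeenDemo` and the method `ProcessOnce` are made up for illustration, and appending to a list stands in for the question's `ProcessString`:

```csharp
using System.Collections.Generic;

public static class SeenDemo
{
    // Hypothetical helper: returns each distinct string once,
    // in the order it was first encountered.
    public static List<string> ProcessOnce(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        var processed = new List<string>();

        foreach (string str in items)
        {
            // Add returns false when str is already in the set,
            // so the body runs only on the first occurrence.
            if (seen.Add(str))
            {
                processed.Add(str); // stand-in for ProcessString(str)
            }
        }

        return processed;
    }
}
```

Calling `SeenDemo.ProcessOnce(new[] { "a", "b", "a", "c", "b" })` returns a list holding "a", "b", "c". String comparison here is the default ordinal, case-sensitive one; passing `StringComparer.OrdinalIgnoreCase` to the `HashSet<string>` constructor would treat "A" and "a" as the same key.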
Ufuk Hacıoğulları
  • This, a `HashSet` is more performant than a `Dictionary`, and suits the requirements of OP better. – aevitas Mar 25 '14 at 08:46
  • The OP's question is about the amount of code they have to write - using HashSet doesn't help with that at all. – Justin Mar 25 '14 at 09:01
  • @Justin Here's a shorter version for you. I didn't downvote it but your answer assumes that OP won't need the values after the foreach loop. – Ufuk Hacıoğulları Mar 25 '14 at 09:13
0

There isn't a tool or library for that; however, you can refactor this code to be less verbose. For example, the code as written could be simplified using the LINQ Distinct method (which needs a using directive for System.Linq).

foreach (var str in arbitraryStringCollection.Distinct())
{
    ProcessString(str);
}
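
One thing to keep in mind with this approach: `Distinct` only filters duplicates within a single enumeration and keeps no record afterwards of what it has seen. The sketch below illustrates that under made-up names (`DistinctDemo`, `CountProcessed`), with a counter standing in for `ProcessString`:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Hypothetical helper: counts how many items would be processed
    // during one pass over the collection.
    public static int CountProcessed(IEnumerable<string> items)
    {
        int calls = 0;

        // Distinct() skips repeats within this enumeration only.
        foreach (var str in items.Distinct())
        {
            calls++; // stand-in for ProcessString(str)
        }

        return calls;
    }
}
```

For the input `{ "a", "b", "a" }` this returns 2, and calling it a second time with the same input returns 2 again, since nothing is remembered between passes. If the "already done" state has to outlive one loop, a set that you keep around is the better fit.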

You could further refactor it using some sort of ForEach extension method, or refactor the entire thing into an extension method.

Alternatively, if your requirements are slightly different (e.g. you want to keep dict for the lifetime of the application), then this could be refactored in a slightly different way, e.g.

HashSet<string> dict = new HashSet<string>();

foreach(string str in arbitraryStringCollection)
{
    dict.DoOnce(str, ProcessString);
}

// Reusable extension method
public static class ExtensionMethods
{
    public static void DoOnce<T>(this ISet<T> set, T value, Action<T> action)
    {
        if (!set.Contains(value))
        {
            action(value);
            set.Add(value);
        }
    }
}
Community
Justin