6

I have a collection of items called RegisteredItems. I do not care about the order of the items in RegisteredItems, only that they exist.

I perform two types of operations on RegisteredItems:

  • Find and return item by property.
  • Iterate over collection and have side-effect.

According to: When should I use the HashSet<T> type? Robert R. says,

"It's somewhat dangerous to iterate over a HashSet because doing so imposes an order on the items in the set. That order is not really a property of the set. You should not rely on it. If ordering of the items in a collection is important to you, that collection isn't a set."

There are some scenarios where my collection would contain 50-100 items. I realize this is not a large amount of items, but I was still hoping to reap the rewards of using a HashSet instead of List.

I have found myself looking at the following code and wondering what to do:

LayoutManager.Instance.RegisteredItems.ToList().ForEach( item => item.DoStuff() );

vs

foreach( var item in LayoutManager.Instance.RegisteredItems)
{
    item.DoStuff();
}

RegisteredItems used to return an IList<T>, but now it returns a HashSet. I felt that, if I was using HashSet for efficiency, it would be improper to cast it as a List. Yet, the above quote from Robert leaves me feeling uneasy about iterating over it, as well.

What's the right call in this scenario? Thanks

Community
  • 1
  • 1
Sean Anderson
  • 27,963
  • 30
  • 126
  • 237
  • 3
    I think he's just saying that if you're iterating over a set, you might care about the order of the iteration, and if you do care you shouldn't use a set. – George Duckett Nov 02 '11 at 16:51
  • 1
    ToList() also enumerates your collection. – H H Nov 02 '11 at 16:53
  • Okay, cool. Is one allowed to remove an item from a HashSet while iterating over it? Or is that an illegal operation? Just trying to understand how the whole ordering of a hash set works.. – Sean Anderson Nov 02 '11 at 16:54
  • That would cause an exception. – George Duckett Nov 02 '11 at 16:55
  • It's not dangerous in terms of efficiency, but in what you think it would "behave" like. Because of that a `HashSet` is not an `IList`, but only an `ICollection`. You can of course iterate over it, but it does not guarantee an order! But in your case that may be required if `RegisteredItems` should be an `IList` ... – ordag Nov 02 '11 at 16:56

3 Answers3

10

If you don't care about order, use a HashSet<>. The quote is about using HashSet<> being dangerous when you're worried about order. If you run this code multiple times, and the items are operated on in different order, will you care? If not, then you're fine. If yes, then don't use a HashSet<>. Arbitrarily converting to a List first doesn't really solve the problem.

And I'm not certain, but I suspect that .ToList() will iterate over the HashSet<> to do that, so, now you're walking the collection twice.

Don't prematurely optimize. If you only have 100 items, just use a HashSet<> and move on. If you start caring about order, change it to a List<> then and use it as a list everwhere.

i_am_jorf
  • 53,608
  • 15
  • 131
  • 222
  • I suspect that .ToList() does this, as well. :) I just thought it might (necessarily?) enumerate the HashSet so that when I call item.DoStuff() any side-effects won't affect the 'order' of the HashSet without throwing an IllegalOperationException? – Sean Anderson Nov 02 '11 at 16:53
  • 2
    Well, you can't modify a collection of any sort while you're iterating through it. That's bad. It wasn't clear that was your worry from your original question. You'll have to explain the side-effect in detail before one can offer advice on that. I suggest starting a new question phrased as "I'm iterating through a HashSet, performing an operation that has FOO side effect and I'm getting this exception. How do I do this correctly?" – i_am_jorf Nov 02 '11 at 16:54
  • Completely understand that it's bad, but, as a quick example. To remove items while iterating over them in a list, I use a for loop and set the index to the last item in the list and then go backwards through the list -- removing items as I go. This works perfecly fine and is standard. But, I can't access a hashset by index. Which leaves me wondering if I have to cast it to a List to be able to have this functionality. I don't actually have the need for this now, though, was just thinking about it. :) – Sean Anderson Nov 02 '11 at 16:58
  • You should instead build a list of items you want to delete as your making your way through the HashSet<>, then perform a second iteration of your to-delete list where you delete from the HashSet<>. That is standard. I frown upon doing the reverse iteration as it's just a non-standard sort of iteration that makes code maintenance harder and reduces readability for others; others will most likely disagree with me. – i_am_jorf Nov 02 '11 at 17:02
  • Heh. I spent a good 20-30 on SO yesterday discussing the merits of both :p They're probably both OK.. I think for a hashset your way is definitely better -- but for a list accessing by index could be argued as better. The second round of look-ups aren't trivial when working on a list (as opposed to the near-constant look-up time of the hash set) – Sean Anderson Nov 02 '11 at 17:08
2

If you really don't care about order and you know that you can't have duplicate in your hashset (and it's what you want), go ahead use hashset.

Toto
  • 149
  • 2
2

In the quoted question, I think he's saying that if you iterate over a Set, you can easily trick yourself into thinking that the items are in a certain order. For example, it'd be easy to treat the first iterated item differently, but you aren't guaranteed that will remain the first iterated item.

As long as you keep this in mind, and consider the Set unordered, iterating over it is fine.

David Yaw
  • 27,383
  • 4
  • 60
  • 93