86

Does the HashSet collection introduced in .NET 3.5 preserve insertion order when iterated using foreach?

The documentation states that the collection is not sorted, but it doesn't say anything about insertion order. A pre-release BCL blog entry states that it is unordered, but this article states that it is designed to preserve insertion order. My limited testing suggests that order is preserved, but that could be a coincidence.

Colonel Panic
Brian Rasmussen

7 Answers

93

This HashSet MSDN page specifically says:

A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

Michael Burr
  • 6
    HashSet implies it's based on a hash table. Hash table order depends primarily on the hashcodes of items in the set, not on insertion order. – Qwertie Oct 19 '09 at 18:44
  • Agree. See Jon Skeet's answer for a counterexample. Here is a related question asking about an implementation of such a HashTable, if you want insertion order guaranteed to be preserved. http://stackoverflow.com/questions/1552225/hashset-that-preserves-ordering/17853085#17853085 – George Mamaladze Jul 25 '13 at 08:48
  • @BrianRasmussen haha...just read that in MSDN and navigated here just in case...+1 for not wasting any more of my time – J.S. Orris Oct 27 '15 at 23:53
  • answer the question! – OuuGiii May 31 '19 at 14:33
  • "whose elements are in no particular order" – Michael Burr Jul 01 '19 at 13:51
52

I think the article claiming it preserves ordering is just plain wrong. For simple tests the insertion order may well be preserved due to the internal structure, but it's not guaranteed and won't always work that way. I'll try to come up with a counterexample.

EDIT: Here's the counterexample:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        var set = new HashSet<int>();

        set.Add(1);
        set.Add(2);
        set.Add(3);
        set.Remove(2);
        set.Add(4);


        foreach (int x in set)
        {
            Console.WriteLine(x);
        }
    }
}

This prints 1, 4, 3 despite 3 having been inserted before 4.

It's possible that if you never remove any items, it will preserve insertion order. I'm not sure, but I wouldn't be entirely surprised. However, I think it would be a very bad idea to rely on that:

  • It's not documented to work that way, and the documentation explicitly states that it's not sorted.
  • I haven't looked at the internal structures or source code (which I don't have, obviously) - I'd have to study them carefully before making any such claim in a firm manner.
  • The implementation could very easily change between versions of the framework. Relying on this would be like relying on the string.GetHashCode implementation not changing - which some people did back in the .NET 1.1 days, and then they got burned when the implementation did change in .NET 2.0...
Jon Skeet
  • That is my assumption as well. Unfortunately other articles are claiming the same (based on said article). It would be nice to have a definitive yes/no answer from a reliable source. – Brian Rasmussen Mar 18 '09 at 07:57
  • I'm somewhat appalled by the amount of misinformation about this despite the official documentation. Also found this page http://ezinearticles.com/?C-HashSet-Advantages&id=1761474 that was also high in google searches. What's worse about that one is that it specifically recognizes that there are 2 different types of set implementations: those that do and don't preserve order, but it specifically claims that in .NET a HashSet does preserve order. – Davy8 Apr 27 '10 at 23:39
  • `foreach` does not iterate in order. Always use 'for' and indexes. – Mihai Bratulescu Apr 11 '19 at 13:07
  • 1
    @MihaiBratulescu: It iterates in whatever order the calls to `MoveNext` return. For every ordered type I'm aware of, that will be the same order as using the index. Note that in the type in question (`HashSet`) there *is* no indexer. Could you give a concrete example where you believe it's better to use an index than a foreach loop? – Jon Skeet Apr 11 '19 at 13:12
7

The documentation states:

A HashSet<T> collection is not sorted and cannot contain duplicate elements. If order or element duplication is more important than performance for your application, consider using the List<T> class together with the Sort method.

Therefore it doesn't matter whether the current implementation actually preserves the order of elements. It is not documented as doing so, and even if it appears to now, this may change at any point in the future (even in a hotfix to the framework).

You should be programming against documented contracts, not implementation details.
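
One way to follow that advice, along the lines of the documentation's List plus Sort suggestion, is to impose an order explicitly whenever one is needed. The sketch below is only an illustration: `DeterministicOrder` and `ToSortedList` are hypothetical names, and it assumes the element type has a usable default comparer.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the BCL.
static class DeterministicOrder
{
    // Copies the set into a list and sorts it, so the resulting order
    // depends only on the elements and not on HashSet internals.
    public static List<T> ToSortedList<T>(HashSet<T> set)
    {
        var list = new List<T>(set); // enumeration order here is unspecified
        list.Sort();                 // sorts with Comparer<T>.Default
        return list;
    }
}
```

The sorted copy is a snapshot, so it has to be rebuilt after the set changes.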

Greg Beech
  • I agree, but I didn't think the quote above would be sufficient to get the message across. I was pretty sure that the set would be unordered; I was just looking for some clear documentation. – Brian Rasmussen Mar 18 '09 at 08:08
4

Reading the source code for HashSet.AddIfNotPresent, you can see that insertion order is preserved as long as there haven't been any deletions.

Thus new HashSet<string> { "Tom", "Dick", "Harry" } preserves order, but if you then remove Dick and add Rick, the order will be ["Tom", "Rick", "Harry"].
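
The scenario can be reproduced with a small sketch. `SlotReuseDemo` and `RemoveThenAdd` are hypothetical names, and the enumeration order it returns is an implementation detail rather than documented behaviour.

```csharp
using System.Collections.Generic;

// Hypothetical demo class, not part of the BCL.
static class SlotReuseDemo
{
    // Returns the elements in whatever order the set enumerates them
    // after one removal followed by one addition.
    public static List<string> RemoveThenAdd()
    {
        var names = new HashSet<string> { "Tom", "Dick", "Harry" };
        names.Remove("Dick"); // removes an element that was inserted in the middle
        names.Add("Rick");    // inserted last, but not necessarily enumerated last
        return new List<string>(names);
    }
}
```

On the implementation described above, the result is the Tom, Rick, Harry order, but nothing in the documented contract guarantees it.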

Colonel Panic
  • Agreed. So long as you never remove an item, they'll enumerate in insertion order. This can be a useful property, and is not something that will ever be removed from the class, even if not documented. The team just wouldn't risk breaking applications that have come to depend upon this behaviour. – Drew Noakes Dec 16 '19 at 11:46
3

There is specifically a SortedSet<T> collection in .NET 4.

This gives you a sorted order, though it is unlikely to match insertion order. Since you can supply a custom IComparer, you could theoretically make it order the elements however you like.
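
As a rough sketch of what that looks like, the helper below enumerates a SortedSet<int> with the default comparer and with a custom one. `SortedSetDemo` and its methods are hypothetical names, and `Comparer<int>.Create` assumes .NET 4.5 or later.

```csharp
using System.Collections.Generic;

// Hypothetical demo class, not part of the BCL.
static class SortedSetDemo
{
    // Default comparer: elements enumerate in ascending order,
    // regardless of the order in which they were added.
    public static List<int> Ascending()
    {
        var set = new SortedSet<int> { 3, 1, 2 };
        return new List<int>(set); // 1, 2, 3
    }

    // Custom comparer: the same elements enumerate in descending order.
    public static List<int> Descending()
    {
        var descending = Comparer<int>.Create((a, b) => b.CompareTo(a));
        var set = new SortedSet<int>(descending) { 3, 1, 2 };
        return new List<int>(set); // 3, 2, 1
    }
}
```

Neither variant recovers insertion order, because the comparer only sees the elements and not when they were added.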

Chris Marisic
2

No, a hash set won't preserve insertion order, at least not predictably. You could use a LinkedHashSet (Java), or an equivalent. A LinkedHashSet will preserve order.

If you want order, you shouldn't even be using a set in the first place... it's not made for ordered elements, except in exceptional cases.

EDIT: sounds like I'm preaching :-/ Sorry.

Sudhir Jonathan
0

You could use a HashSet, which does not preserve order, alongside a List, which does preserve insertion/deletion order, to obtain a List in which every item is both unique and in insertion/deletion order.

You would initialize the HashSet and List together, and clear them together. When adding a value, you would use the HashSet to test efficiently whether that value has already been added. If it hasn't, add the value to both the HashSet and the List.

If an item is to be removed, you would first use the HashSet to test efficiently whether that item exists. If it does, remove it from both the HashSet and the List.

From the first deletion on, the HashSet will still hold unique items, but once any more items are added or removed it will no longer enumerate in insertion/deletion order. However, because the HashSet ensures that each item is added to or removed from the List only once, in the order in which it was encountered, the List will be both unique and in insertion/deletion order.
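
The steps above can be sketched as a small wrapper class. This is a minimal illustration rather than a complete collection: `InsertionOrderedSet<T>` is a hypothetical name, it assumes the default equality comparer for both inner collections, and removal costs O(n) because the List has to be scanned.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical wrapper, not a BCL type: a HashSet for uniqueness checks
// paired with a List that records insertion order.
public sealed class InsertionOrderedSet<T> : IEnumerable<T>
{
    private readonly HashSet<T> _seen = new HashSet<T>();
    private readonly List<T> _ordered = new List<T>();

    public int Count { get { return _ordered.Count; } }

    // Adds the item to both collections only if it is not already present.
    public bool Add(T item)
    {
        if (!_seen.Add(item)) return false; // HashSet<T>.Add returns false for duplicates
        _ordered.Add(item);
        return true;
    }

    // Removes the item from both collections only if it is present.
    public bool Remove(T item)
    {
        if (!_seen.Remove(item)) return false;
        _ordered.Remove(item); // linear scan of the list
        return true;
    }

    public bool Contains(T item) { return _seen.Contains(item); }

    // Clears both collections together so they never drift apart.
    public void Clear()
    {
        _seen.Clear();
        _ordered.Clear();
    }

    // Enumeration always goes through the List, so it follows insertion order.
    public IEnumerator<T> GetEnumerator() { return _ordered.GetEnumerator(); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

With this wrapper, adding 1, 2, 3, removing 2, and then adding 4 enumerates as 1, 3, 4, because enumeration never touches the HashSet.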