4

In the internal source there is such a constructor public HashSetEqualityComparer(IEqualityComparer<T> comparer) but it's internal so I can't use it.

By default, HashSet<T>.CreateSetComparer() just uses the parameterless constructor which will apply EqualityComparer<T>.Default.

Is there a way to get a HashSetEqualityComparer<T> with a IEqualityComparer<T> of choice, without copying out the code from the source?

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
David S.
  • 5,965
  • 2
  • 40
  • 77
  • 2
    Through the constructor of the hashset? – Johnny Feb 17 '18 at 14:44
  • @Lepijohnny what do you mean? I am not trying to construct a hashset, I'm trying to construct a `HashSetEqualityComparer` – David S. Feb 17 '18 at 14:47
  • @DavidS. `new HashSet(MyEqualityComparer)` – Eser Feb 17 '18 at 14:47
  • That class is internal and by the way doesnt have anything special it at the end uses the IEquilityComparer. For what reason you need that class? – Johnny Feb 17 '18 at 14:50
  • @Lepijohnny I feel you both are misunderstanding.. yes it is special, it makes possible stuff like `Dictionary,Tv>` which compares the contents of hashsets on lookup. Basically something like `HashSet.CreateSetComparer(IEqualityComparer comparer)` would be what I need. Note it is a comparer for the sets themselves, not the elements – David S. Feb 17 '18 at 14:54
  • @Lepijohnny is correct, create your own class that implements `IEqualityComparer<>` and pass that to the constructor of the `HashSet` – Ric Feb 17 '18 at 14:54
  • It seems your best bet would be to copy the source code, fortunately it is very small and simple. – Evk Feb 17 '18 at 14:59
  • @DavidS *I feel you both are misunderstanding.. yes it is special, it makes possible stuff like Dictionary,Tv> which compares the contents of hashsets on lookup*... whats stopping you from specifying a custom `IEqualityComparer>` in the dictionary constructor? – InBetween Feb 17 '18 at 14:59
  • @InBetween Nothing but I need to create a `IEqualityComparer>` first, but the only way I know is to use `HashSet.CreateSetComparer()`, where I can't specify `IEqualityComparer` – David S. Feb 17 '18 at 15:04
  • @DavidS. You have the reference code of `HashSetEqualityComparer`, implementing yours that takes a custom `IEqualityComparer` takes about 1 minute. – InBetween Feb 17 '18 at 15:05
  • @InBetween @Evk yes I know it's easy to make one with the reference code, and with this question I was hoping that the framework already provided this, seems like a hole in the framework that `CreateSetComparer` doesn't have an overload for a comparer of `T`. – David S. Feb 17 '18 at 15:08
  • Ah now is clear it was misunderstood. Thanks InBetween... – Johnny Feb 17 '18 at 15:11
  • 1
    Such a method will never be useful. Do keep in mind that it compares the entirely collections. The hashsets themselves already define how their elements must be compared. – Hans Passant Feb 17 '18 at 15:29
  • @HansPassant Yes it's useful, consider having two hashsets, each with a single element for example sake: `set1 = new HashSet(StringComparer.OrdinalIgnoreCase) {"a"}` and `set2 = new HashSet(StringComparer.OrdinalIgnoreCase) {"A"}`. Now `set1.SetEquals(set2)` will return `true` because of the ignorecase comparer. BUT create a `var x = new HashSet>(HashSet.CreateSetComparer()) {set1}`, and then run `x.Contains(set2)`, it will return `false` because it's usign default string comparer. Now the only way to get this to return true would be to have what I was talking about. – David S. Feb 17 '18 at 15:50
  • That is just a bug, use the set1.Comparer property to give it the correct comparer. – Hans Passant Feb 17 '18 at 16:00
  • @HansPassant To give what the correct comparer? Once again please note I need `IEqualityComparer>`. – David S. Feb 17 '18 at 16:16

4 Answers4

6

I think best solution is using SetEquals. It does the job you need and exactly in the same way that HashSetEqualityComparer does but it will account for any custom comparers defined in the sets its comparing.

So, in your specific scenario where you want to use a HashSet<T> as a key of a dictionary, you need to implement an IEqualityComparer<HashSet<T>> that makes use of SetEquals and "borrows" the reference source of HashSetEqualityComparer.GetHashCode():

public class CustomHashSetEqualityComparer<T>
    : IEqualityComparer<HashSet<T>>
{
    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        if (ReferenceEquals(x, null))
            return false;

        return x.SetEquals(y);
    }

    public int GetHashCode(HashSet<T> set)
    {
        int hashCode = 0;

        if (set != null)
        {
            foreach (T t in set)
            {
                hashCode = hashCode ^ 
                    (set.Comparer.GetHashCode(t) & 0x7FFFFFFF);
            }
        }

        return hashCode;
    }
}

But yes, its a small pain that there is not way to directly create a SetEqualityComparer that leverages custom comparers but this unfortunate behavior is due, IMHO, more to a bug of the existing implementation than a lack of the needed overload; there is no reason why CreateSetComparer() can't return an IEqualityComparer that actually uses the comparers of the sets its comparing as the code above demonstrates.

If I had a voice in it, CreateSetComparer() wouldn't be static method at all. It would then be obvious, or at least predictable, that whatever comparer was returned would be created with the current set's comparer.

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
InBetween
  • 32,319
  • 3
  • 50
  • 90
  • The `x.SetEquals(y)` throws an exception in case the `y` is `null`. Also I am not sure what's the purpose of the `& 0x7FFFFFFF`. Negative hashcode values are perfectly valid. – Theodor Zoulias Jul 16 '23 at 18:08
2

I agree @InBetween, using SetEquals is the best way. Even if add the constructor still can not achieve what you want.

Please see this code: http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1360

Here is I try to do:

class HashSetEqualityComparerWrapper<T> : IEqualityComparer<HashSet<T>>
{
    static private Type HashSetEqualityComparerType = HashSet<T>.CreateSetComparer().GetType();
    private IEqualityComparer<HashSet<T>> _comparer;

    public HashSetEqualityComparerWrapper()
    {
        _comparer = HashSet<T>.CreateSetComparer();
    }
    public HashSetEqualityComparerWrapper(IEqualityComparer<T> comparer)
    {
        _comparer = HashSet<T>.CreateSetComparer();
        if (comparer != null)
        {
            FieldInfo m_comparer_field = HashSetEqualityComparerType.GetField("m_comparer", BindingFlags.NonPublic | BindingFlags.Instance);
            m_comparer_field.SetValue(_comparer, comparer);
        }
    }

    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        return _comparer.Equals(x, y);
    }
    public int GetHashCode(HashSet<T> obj)
    {
        return _comparer.GetHashCode(obj);
    }
}

UPDATE

I took 5 mins to implement another version form HashSetEqualityComparer<T> source code. And rewrite the bool Equals(HashSet<T> x, HashSet<T> y) method. It is not complex. All code just copy and paste from source, I just revise a bit.

class CustomHashSetEqualityComparer<T> : IEqualityComparer<HashSet<T>>
{
    private IEqualityComparer<T> m_comparer;

    public CustomHashSetEqualityComparer()
    {
        m_comparer = EqualityComparer<T>.Default;
    }

    public CustomHashSetEqualityComparer(IEqualityComparer<T> comparer)
    {
        if (comparer == null)
        {
            m_comparer = EqualityComparer<T>.Default;
        }
        else
        {
            m_comparer = comparer;
        }
    }

    // using m_comparer to keep equals properties in tact; don't want to choose one of the comparers
    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        // http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1360
        // handle null cases first
        if (x == null)
        {
            return (y == null);
        }
        else if (y == null)
        {
            // set1 != null
            return false;
        }

        // all comparers are the same; this is faster
        if (AreEqualityComparersEqual(x, y))
        {
            if (x.Count != y.Count)
            {
                return false;
            }
        }
        // n^2 search because items are hashed according to their respective ECs
        foreach (T set2Item in y)
        {
            bool found = false;
            foreach (T set1Item in x)
            {
                if (m_comparer.Equals(set2Item, set1Item))
                {
                    found = true;
                    break;
                }
            }
            if (!found)
            {
                return false;
            }
        }
        return true;
    }

    public int GetHashCode(HashSet<T> obj)
    {
        int hashCode = 0;
        if (obj != null)
        {
            foreach (T t in obj)
            {
                hashCode = hashCode ^ (m_comparer.GetHashCode(t) & 0x7FFFFFFF);
            }
        } // else returns hashcode of 0 for null hashsets
        return hashCode;
    }

    // Equals method for the comparer itself. 
    public override bool Equals(Object obj)
    {
        CustomHashSetEqualityComparer<T> comparer = obj as CustomHashSetEqualityComparer<T>;
        if (comparer == null)
        {
            return false;
        }
        return (this.m_comparer == comparer.m_comparer);
    }

    public override int GetHashCode()
    {
        return m_comparer.GetHashCode();
    }

    static private bool AreEqualityComparersEqual(HashSet<T> set1, HashSet<T> set2)
    {
        return set1.Comparer.Equals(set2.Comparer);
    }
}
Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
Wilhelm Liao
  • 829
  • 5
  • 12
2

Avoid this class if you use custom comparers. It uses its own equality comparer to perform GetHashCode, but when performing Equals(Set1, Set2) if Set1 and Set2 have the same equality comparer, the the HashSetEqualityComparer will use the comparer of the sets. HashsetEqualityComparer will only use its own comparer for equals if Set1 and Set2 have different comparers

It gets worse. It calls HashSet.HashSetEquals, which has a bug in it (See https://referencesource.microsoft.com/#system.core/System/Collections/Generic/HashSet.cs line 1489, which is missing a if (set1.Count != set2.Count) return false before performing the subset check.

The bug is illustrated by the following program:

class Program
{
    private class MyEqualityComparer : EqualityComparer<int>
    {
        public override bool Equals(int x, int y)
        {
            return x == y;
        }

        public override int GetHashCode(int obj)
        {
            return obj.GetHashCode();
        }
    }

    static void Main(string[] args)
    {
        var comparer = HashSet<int>.CreateSetComparer();
        var set1 = new HashSet<int>(new MyEqualityComparer()) { 1 };
        var set2 = new HashSet<int> { 1, 2 };

        Console.WriteLine(comparer.Equals(set1, set2));
        Console.WriteLine(comparer.Equals(set2, set1)); //True!

        Console.ReadKey();
    }
}

Regarding other answers to this question (I don't have the rep to comment):

  • Wilhelm Liao: His answer also contains the bug because it's copied from the reference source
  • InBetween: The solution is not symmetric. CustomHashSetEqualityComparer.Equals(A, B) does not always equals CustomHashSetEqualityComparer.Equals(B, A). I would be scared of that.

I think a robust implementation should throw an exception if it encounters a set which has a different comparer to its own. It could always use its own comparer and ignore the set comparer, but that would give strange and unintuitive behaviour.

Barry Smith
  • 131
  • 8
-1

Additional to the original solution, we can simplify GetHashCode with HashCode.Combine function:

public int GetHashCode(HashSet<T> set)
{
    int hashCode = 0;
    foreach (var item in set)
    {
        hashCode ^= HashCode.Combine(item);
    }

    return hashCode;
}
Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
Yerkon
  • 4,548
  • 1
  • 18
  • 30
  • The [`HashCode.Combine`](https://learn.microsoft.com/en-us/dotnet/api/system.hashcode.combine#system-hashcode-combine-1(-0)) with a single argument diffuses the hash code across a large range. It improves the quality of the hash code when the underlying data type is simple, for example, an integer value. So your suggestion can produce a higher quality hashcode, but unfortunately you bypassed the `set.Comparer`, so your suggestion is flawed. It can produce different hashcodes for hashsets that are equal according to their non-default comparer. Which is against the rules. – Theodor Zoulias Jul 16 '23 at 18:24