3

My code is like this:

public class CaseAccentInsensitiveEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0;
    }

    public int GetHashCode(string obj)
    {
        // not sure what to put here
    }
}

I know the role of GetHashCode in this context; what I'm missing is how to produce the InvariantCulture, IgnoreNonSpace and IgnoreCase version of obj so that I can return its hash code.

I could remove the diacritics and the case from obj myself and then return its hash code, but I wonder if there's a better alternative.

Andre Pena
  • Are you going to use this with a dictionary, `HashSet`, or anything else that uses hash tables? – Rawling Oct 22 '12 at 19:14
  • What is the purpose of this class? Could you provide an example of syntax this would allow you to avoid? – paparazzo Oct 22 '12 at 20:00

2 Answers

5

Returning 0 from GetHashCode() works (as pointed out by @Michael Perrenoud) because a Dictionary or HashSet calls Equals() only when GetHashCode() returns the same value for both objects.
The rule is that GetHashCode() must return the same value for any two objects that are equal.
The drawback is that HashSet (or Dictionary) performance degrades to that of a List: every item lands in the same bucket, so finding an item means calling Equals() against each stored item in turn.
A faster approach is to convert the string to an accent-insensitive, case-insensitive form and return that string's hash code.

Code to remove accents (diacritics), taken from this post:

static string RemoveDiacritics(string text)
{
    return string.Concat(
        text.Normalize(NormalizationForm.FormD)
        .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) !=
                                        UnicodeCategory.NonSpacingMark)
    ).Normalize(NormalizationForm.FormC);
}

Comparer code:

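using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public class CaseAccentInsensitiveEqualityComparer : IEqualityComparer<string>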
{
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0;
    }

    public int GetHashCode(string obj)
    {
        return obj != null ? RemoveDiacritics(obj).ToUpperInvariant().GetHashCode() : 0;
    }

    private string RemoveDiacritics(string text)
    {
        return string.Concat(
            text.Normalize(NormalizationForm.FormD)
            .Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) !=
                                          UnicodeCategory.NonSpacingMark)
          ).Normalize(NormalizationForm.FormC);
    }
}
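
One caveat with stripping diacritics by hand is that a linguistic string.Compare may treat some strings as equal that the stripped, upper-cased forms hash differently (strings that differ only in ignorable characters such as a soft hyphen are one possible case). A minimal sketch of an alternative, assuming a runtime where `CompareInfo.GetHashCode(string, CompareOptions)` is available (.NET Core 2.0+ or .NET Framework 4.7.1+), is to take the hash from the same CompareInfo and options that Equals uses. The class name `CompareInfoEqualityComparer` is made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical variant of the comparer above: both Equals and GetHashCode
// go through the same CompareInfo with the same CompareOptions, so the two
// methods cannot disagree about which strings count as equal.
public sealed class CompareInfoEqualityComparer : IEqualityComparer<string>
{
    private const CompareOptions Options =
        CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;

    private static readonly CompareInfo Invariant =
        CultureInfo.InvariantCulture.CompareInfo;

    public bool Equals(string x, string y)
    {
        // CompareInfo.Compare accepts null arguments.
        return Invariant.Compare(x, y, Options) == 0;
    }

    public int GetHashCode(string obj)
    {
        // CompareInfo.GetHashCode throws on null, so map null to 0 here.
        return obj != null ? Invariant.GetHashCode(obj, Options) : 0;
    }
}
```

It would be used the same way, for example `new HashSet<string>(new CompareInfoEqualityComparer())`. Whether accents are actually ignored depends on the runtime's globalization data; under the invariant globalization mode the comparison falls back to ordinal behavior.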
Mitsugui
-1

Ah, excuse me, I had my methods mixed up. When I implemented something like this before, I just returned the hash code of the object itself, `return obj.GetHashCode();`, so that it would always enter the Equals method.


Okay, after much confusion I believe I've got myself straight. I found that always returning zero forces the comparer to fall through to the Equals method. I'm looking for the code where I implemented this so I can prove it and put it up here.


Here's the code to prove it.

class MyArrayComparer : EqualityComparer<object[]>
{
    public override bool Equals(object[] x, object[] y)
    {
        if (x.Length != y.Length) { return false; }
        for (int i = 0; i < x.Length; i++)
        {
            if (!x[i].Equals(y[i]))
            {
                return false;
            }
        }
        return true;
    }

    public override int GetHashCode(object[] obj)
    {
        return 0;
    }
}
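
Returning a constant keeps the comparer correct but puts every key in one bucket. As a hedged sketch of the other option, the hash can be built from the same elements that Equals inspects, so equal arrays still hash alike while unequal ones usually spread out. The class name `MyArrayHashingComparer` and the 17/31 constants are choices made for this illustration, and it assumes each element's own Equals and GetHashCode are consistent with each other.

```csharp
using System.Collections.Generic;

// Hypothetical variant of MyArrayComparer with a content-based hash.
class MyArrayHashingComparer : EqualityComparer<object[]>
{
    public override bool Equals(object[] x, object[] y)
    {
        if (ReferenceEquals(x, y)) { return true; }
        if (x == null || y == null || x.Length != y.Length) { return false; }
        for (int i = 0; i < x.Length; i++)
        {
            // object.Equals(a, b) tolerates null elements.
            if (!object.Equals(x[i], y[i])) { return false; }
        }
        return true;
    }

    public override int GetHashCode(object[] obj)
    {
        if (obj == null) { return 0; }
        unchecked // let the arithmetic wrap instead of throwing on overflow
        {
            int hash = 17;
            foreach (object item in obj)
            {
                hash = hash * 31 + (item != null ? item.GetHashCode() : 0);
            }
            return hash;
        }
    }
}
```

Two arrays with equal elements in the same order produce the same hash, so the rule that equal objects share a hash code still holds, and Equals only runs for keys that land in the same bucket.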
Mike Perrenoud
  • I'm not the downvoter, but I think that when you pass an IEqualityComparer to a Dictionary or `HashSet`, the collection uses the IEqualityComparer to compute the hash for its items (it can't use the usual GetHashCode, since the objects are not being compared in the normal way). Returning zero will confuse these dictionaries. – Andre Pena Oct 22 '12 at 18:49
  • @AndréPena, I'm getting myself confused. Return zero, always, and that will force it to use the `Equals` method. – Mike Perrenoud Oct 22 '12 at 18:50