Possible Duplicate:
how to recognize similar words with difference in spelling

I am trying to get a comparison of these three strings to return true: 'voest', 'vost' and 'vöst' (German culture), because they are the same word. (Strictly speaking, only 'oe' and 'ö' are equivalent, but a CI database collation, for example, treats all three as equal, which is correct here because 'vost' is a mistyped 'voest'.)

string.Compare(..) / string.Equals(..) never reports the strings as equal, no matter what arguments I pass to the method.

How can I make string.Compare() / Equals(..) treat them as equal?

theSpyCry
  • And the question is: how to make string.Compare return true, or how can I build a new method that returns true for the example shown above? – Gustav Klimt Nov 26 '12 at 09:55
  • Linguistics is very complicated. Let the `String` class do its job (working with string contents represented by bytes) and manage the meanings of your words in your own logic. This is just a suggestion and my opinion, of course. – Leri Nov 26 '12 at 09:56
  • I do not think it is possible to get string.Compare to report 'voest' and 'vost' as equal, ever. – Mike de Klerk Nov 26 '12 at 09:58
  • Maybe you can find an answer here: http://stackoverflow.com/questions/44288/differences-in-string-compare-methods-in-c-sharp Although I don't think it's possible, because o, oe and ö aren't the same chars, even across different cultures. – jAC Nov 26 '12 at 09:58
  • @Mike and how would you get voest and vöst to be treated as the same? – theSpyCry Nov 26 '12 at 09:58
  • I would check for 'ö' in your string, replace it with 'oe' and 'o', then check both strings with the .Equals() method. – jAC Nov 26 '12 at 10:01
  • @PaN1C_Showt1Me Create a custom class that runs whenever string.Compare reports a difference and compares the strings again with your own logic. I can imagine that 'oe' does not always equal 'o', e.g. in different words, so this could be a lot of work. – Mike de Klerk Nov 26 '12 at 10:02

2 Answers

You could create a custom comparer which ignores umlauts:

using System.Collections.Generic;
using System.Linq;

class IgnoreUmlautComparer : IEqualityComparer<string>
{
    // Maps each umlaut to its base vowel.
    static readonly Dictionary<char, char> umlautReplacer = new Dictionary<char, char>()
    {
        {'ä','a'}, {'Ä','A'},
        {'ö','o'}, {'Ö','O'},
        {'ü','u'}, {'Ü','U'},
    };
    // Maps the ASCII transliterations (ae, oe, ue) to the same base vowel.
    // Note: this also folds words where "oe" is not an umlaut, e.g. "poet" becomes "pot".
    static readonly Dictionary<string, string> pseudoUmlautReplacer = new Dictionary<string, string>()
    {
        {"ae","a"}, {"Ae","A"},
        {"oe","o"}, {"Oe","O"},
        {"ue","u"}, {"Ue","U"},
    };

    // Returns s with umlauts and pseudo-umlauts reduced to the plain vowel.
    private static string ignoreUmlaut(string s)
    {
        char value;
        string replaced = new string(s.Select(c => umlautReplacer.TryGetValue(c, out value) ? value : c).ToArray());
        foreach (var kv in pseudoUmlautReplacer)
            replaced = replaced.Replace(kv.Key, kv.Value);
        return replaced;
    }

    public bool Equals(string x, string y)
    {
        return ignoreUmlaut(x) == ignoreUmlaut(y);
    }

    public int GetHashCode(string obj)
    {
        // Hash the normalized form so that strings considered equal share a hash code.
        return ignoreUmlaut(obj).GetHashCode();
    }
}

Now you can use this comparer with Enumerable methods like Distinct:

string[] allStrings = new[]{"voest","vost","vöst"};
bool allEqual = allStrings.Distinct(new IgnoreUmlautComparer()).Count() == 1;
// --> true
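
As a rough sketch of the same idea without a per-character dictionary, the comparer below decomposes each string with Unicode normalization and drops the combining marks before folding ae/oe/ue. The names FoldingComparer, Fold and FoldingDemo are made up for this illustration and are not part of the class above or of .NET. It also lower-cases the input, which is an extra assumption, and the digraph folding is just as lossy as before (for example "poet" becomes "pot").

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical stand-in comparer: strips any diacritic via Unicode
// decomposition, then folds the digraphs ae/oe/ue to a/o/u.
sealed class FoldingComparer : IEqualityComparer<string>
{
    public static string Fold(string s)
    {
        var sb = new StringBuilder(s.Length);
        // FormD splits 'ö' into 'o' followed by U+0308 (combining diaeresis).
        foreach (char c in s.Normalize(NormalizationForm.FormD))
        {
            // Keep everything except the combining marks; lower-case for
            // a case-insensitive match (an assumption of this sketch).
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(char.ToLowerInvariant(c));
        }
        // Lossy by design: every "oe" is folded, umlaut or not.
        return sb.ToString().Replace("ae", "a").Replace("oe", "o").Replace("ue", "u");
    }

    public bool Equals(string x, string y)
    {
        if (x == null || y == null) return x == null && y == null;
        return Fold(x) == Fold(y);
    }

    public int GetHashCode(string obj)
    {
        // Hash the folded form so equal strings land in the same bucket.
        return obj == null ? 0 : Fold(obj).GetHashCode();
    }
}

static class FoldingDemo
{
    // Counts how many distinct words remain once spelling variants are merged.
    public static int CountDistinct(IEnumerable<string> words)
    {
        return new HashSet<string>(words, new FoldingComparer()).Count;
    }
}
```

Any API that accepts an IEqualityComparer<string>, such as HashSet, Dictionary or GroupBy, can take such a comparer in the same way as Distinct does.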
Tim Schmelter

You could try the CompareOptions.IgnoreNonSpace option when comparing. It won't solve voest vs. vost, but it will help with vost vs. vöst.

// Requires: using System.Globalization;
int a = new CultureInfo("de-DE").CompareInfo.Compare("vost", "vöst", CompareOptions.IgnoreNonSpace);
// a == 0; strings are equal.
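
To cover voest vs. vost as well, one option is to fold the digraphs first and then let IgnoreNonSpace deal with the diacritics. This is a minimal sketch, assuming that collapsing ae/oe/ue is acceptable for your data. GermanLooseCompare, FoldDigraphs and LooseEquals are hypothetical names, and adding IgnoreCase is my own assumption. The IgnoreNonSpace result for 'ö' is taken from the snippet above rather than guaranteed for every runtime, so test it on your target platform.

```csharp
using System;
using System.Globalization;

static class GermanLooseCompare
{
    static readonly CompareInfo German = new CultureInfo("de-DE").CompareInfo;

    // Hypothetical helper: collapses the ASCII transliterations ae/oe/ue
    // to the base vowel (lossy, e.g. "poet" becomes "pot").
    static string FoldDigraphs(string s)
    {
        return s.Replace("ae", "a").Replace("oe", "o").Replace("ue", "u")
                .Replace("Ae", "A").Replace("Oe", "O").Replace("Ue", "U");
    }

    // True when both strings compare as equal after digraph folding,
    // ignoring case and nonspacing (diacritic) differences.
    public static bool LooseEquals(string a, string b)
    {
        const CompareOptions options =
            CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;
        return German.Compare(FoldDigraphs(a), FoldDigraphs(b), options) == 0;
    }
}
```

With this helper, GermanLooseCompare.LooseEquals("voest", "vost") compares "vost" with "vost" after folding, so the only step that still depends on the culture-aware comparison is the 'ö' vs. 'o' case.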
Dmitrii Dovgopolyi
  • Did you actually test this, and does it work for you? As I posted here, it does not work for me: http://stackoverflow.com/questions/29845211/culture-aware-string-comparison-for-umlaute/ – silent Apr 24 '15 at 11:10