I was trying to compare two chars ignoring case and came up with the following disturbing code while playing with the now-famous Turkish i:

char lowerCase = 'ı'; // U+0131 
char upperCase = 'I'; // regular upper i

// Displays True comparing the chars
Trace.WriteLine(char.ToUpper(lowerCase, CultureInfo.CurrentCulture) == char.ToUpper(upperCase, CultureInfo.CurrentCulture));

// Displays False comparing the strings
Trace.WriteLine(lowerCase.ToString().Equals(upperCase.ToString(), StringComparison.CurrentCultureIgnoreCase));

Both calls use my culture (French) and seem to do the same thing, but the result is not what I expected (either both True or both False).

When I switch to the Turkish culture with:

Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

The result changes and both return True.

Am I missing something about culture or char/string comparison?
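
For reference, the two checks can be run side by side under explicitly named cultures instead of changing the thread's culture. This is only a minimal sketch: the class and method names are made up for illustration, and `string.Compare` with `CompareOptions.IgnoreCase` is used here as an assumed stand-in for `Equals(..., StringComparison.CurrentCultureIgnoreCase)`. No expected output is listed, because the result may depend on the runtime and on the globalization library underneath it (NLS or ICU).

```csharp
using System;
using System.Globalization;

static class TurkishIRepro
{
    // Runs the char-based and the string-based check under one named culture.
    static void Compare(string cultureName)
    {
        var culture = new CultureInfo(cultureName);
        char dotlessI = '\u0131'; // ı, LATIN SMALL LETTER DOTLESS I
        char upperI = 'I';        // regular upper-case i

        // Check 1: uppercase both chars ourselves, then compare the results.
        bool charsEqual =
            char.ToUpper(dotlessI, culture) == char.ToUpper(upperI, culture);

        // Check 2: let the culture's comparer decide, ignoring case.
        bool stringsEqual = string.Compare(
            dotlessI.ToString(), upperI.ToString(),
            culture, CompareOptions.IgnoreCase) == 0;

        Console.WriteLine($"{cultureName}: chars={charsEqual}, strings={stringsEqual}");
    }

    static void Main()
    {
        Compare("fr-FR");
        Compare("tr-TR");
    }
}
```

Passing the culture explicitly keeps the two runs independent of whatever `Thread.CurrentThread.CurrentCulture` happens to be.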

Edit

Reading this question, I now understand that the two are different. It has nothing to do with comparing chars versus strings, but with whether or not I handle the casing myself. What I don't understand is why they are different. Isn't string.Equals with CurrentCultureIgnoreCase supposed to compare upper-case versions of the strings, as I do? What is happening behind the scenes that I can't see?

Ted
  • As far as I remember, in Turkish, there are 2 different "i". – Graffito Jan 16 '17 at 21:57
  • The answer to [String comparison - strA.ToLower()==strB.ToLower() or strA.Equals(strB,StringComparisonType)?](http://stackoverflow.com/q/1660192/1324033) may help you – Sayse Jan 16 '17 at 21:57
  • Possible duplicate of [String comparison - strA.ToLower()==strB.ToLower() or strA.Equals(strB,StringComparisonType)?](http://stackoverflow.com/questions/1660192/string-comparison-stra-tolower-strb-tolower-or-stra-equalsstrb-stringcom) – Heretic Monkey Jan 16 '17 at 21:58
  • The linked question isn't a duplicate of this one. The OP is asking why, *under French locale rules*, uppercasing the characters and comparing them results in equality, but comparing those characters as strings while ignoring case does not. The question is not why this does work in Turkish, but why the results could be different between character and string comparison in *any* culture, which is a little more subtle. None of the answers to the linked question attempt to answer that, beyond the Skeetmeister saying that "case comparisons aren't as simple as you might expect". (Which is true.) – Jeroen Mostert Jan 17 '17 at 10:01
  • Consider the following: *lowercasing* `I` results in `i` in French, not `ı`, so if the case-insensitive string comparison yielded `True`, this would be incorrect if you compared *lowercase* versions of the strings. You're complaining it's incorrect with a comparison after *uppercasing* the strings, but in truth, a case-insensitive string comparison doesn't necessarily do either. (I haven't delved closely into the algorithms, or the rules Unicode specifies.) – Jeroen Mostert Jan 17 '17 at 10:06

2 Answers

I am guessing that the case-insensitive comparison might be working on lower-case letters.

With this assumption, the first comparison is 'ı'.ToUpper* = 'I' vs 'I'.ToUpper = 'I' => true. Second comparison is "ı".ToLower = "ı" vs "I".ToLower = "i" => false.

To test this, you should see the same surprising result if you put an upper-case comparison, a lower-case comparison, and a case-insensitive comparison side by side, first using only strings and then only chars. You could analyze further by starting with 'ı' and 'I' and then comparing against 'i' and 'İ'.

*: This is the crucial bit: I am assuming that, since French does not have the letter 'ı', its upper-case version is taken to be 'I', as in Turkish.
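
The test suggested above could be sketched roughly as follows, using strings only. The class name, the `Report` helper, and its labels are hypothetical, and the sketch assumes that `TextInfo.ToUpper`/`ToLower` followed by an ordinal comparison is a fair way to model "comparing upper-case (or lower-case) versions". It does not claim to show what the framework does internally, and no output is listed because it may vary by runtime.

```csharp
using System;
using System.Globalization;

static class CasingExperiment
{
    // Prints three verdicts for one pair of strings under one culture.
    static void Report(CultureInfo culture, string label, string a, string b)
    {
        TextInfo text = culture.TextInfo;

        // Equal after uppercasing both sides with the culture's rules?
        bool viaUpper = string.Equals(text.ToUpper(a), text.ToUpper(b), StringComparison.Ordinal);

        // Equal after lowercasing both sides with the culture's rules?
        bool viaLower = string.Equals(text.ToLower(a), text.ToLower(b), StringComparison.Ordinal);

        // Equal according to the culture's own case-insensitive comparison?
        bool ignoreCase = culture.CompareInfo.Compare(a, b, CompareOptions.IgnoreCase) == 0;

        Console.WriteLine(
            $"{culture.Name} {label}: upper={viaUpper}, lower={viaLower}, ignoreCase={ignoreCase}");
    }

    static void Main()
    {
        foreach (string name in new[] { "fr-FR", "tr-TR" })
        {
            var culture = new CultureInfo(name);
            Report(culture, "U+0131 vs I", "\u0131", "I"); // dotless ı against I
            Report(culture, "i vs U+0130", "i", "\u0130"); // i against dotted İ
        }
    }
}
```

If the guess in this answer holds, the `ignoreCase` column should track the `lower` column rather than the `upper` one; if it tracks neither consistently, the comparison is probably not a simple upper- or lower-casing at all.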

Can Bud

There is no explicit ToUpper or ToLower done inside that Equals check.

My guess is that a check is done whether there are casing rules that can change one into the other and vice versa.

Both Turkish and French can uppercase an "ı" into an "I". But only Turkish can perform the opposite conversion, so only there are those characters considered "equal ignoring case".

Hans Kesting