This happens because lowercasing İ ("latin capital letter i with dot above", U+0130) in an English locale turns it into two characters: "latin small letter i" (U+0069) and "combining dot above" (U+0307).
This explains why it starts with i, but doesn't end with i (it ends with a combining diacritic mark instead).
In a Turkish locale, lowercase İ simply becomes "latin small letter i", in accordance with Turkish linguistic rules, and your code would therefore work.
Here's a test program to help figure out what's going on:
class Test {
    public static void main(String[] args) {
        // Lowercase using the JVM's default locale (taken from LC_ALL in the runs below).
        char[] foo = args[0].toLowerCase().toCharArray();
        System.out.print("Lowercase " + args[0] + " has " + foo.length + " chars: ");
        // Print each UTF-16 code unit in hex.
        for (int i = 0; i < foo.length; i++) {
            System.out.print("0x" + Integer.toHexString(foo[i]) + " ");
        }
        System.out.println();
    }
}
Here's what we get when we run it on a system configured for English:
$ LC_ALL=en_US.utf8 java Test "İ"
Lowercase İ has 2 chars: 0x69 0x307
Here's what we get when we run it on a system configured for Turkish:
$ LC_ALL=tr_TR.utf8 java Test "İ"
Lowercase İ has 1 chars: 0x69
This is even the specific example used by the API docs for String.toLowerCase(Locale), which is the method you can use to get the lowercase version in a specific locale, rather than the system default locale.
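If you would rather not depend on how the machine is configured, you can pass the locale explicitly. The following is only a minimal sketch: the class name LocaleLowercase and the hex helper are made up for illustration, and Locale.ROOT is used here as a stand-in for "no language-specific rules", which is an assumption about what you want rather than the only reasonable choice.

```java
import java.util.Locale;

class LocaleLowercase {
    // Hypothetical helper: formats each UTF-16 code unit of s as hex, separated by spaces.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append("0x").append(Integer.toHexString(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dottedI = "\u0130"; // İ, latin capital letter i with dot above

        // Locale-neutral lowercasing: i followed by combining dot above.
        String neutral = dottedI.toLowerCase(Locale.ROOT);
        System.out.println("ROOT:    " + hex(neutral) + " endsWith(\"i\"): " + neutral.endsWith("i"));

        // Turkish lowercasing: a single plain i.
        String turkish = dottedI.toLowerCase(Locale.forLanguageTag("tr"));
        System.out.println("Turkish: " + hex(turkish) + " endsWith(\"i\"): " + turkish.endsWith("i"));
    }
}
```

With an explicit locale the result no longer changes with LC_ALL, so the same input behaves the same way on every machine.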