14

Can someone explain this behavior?

" ".EndsWith(((char)9917).ToString()) // returns true

StartsWith behaves the same way.

  • 1
    For everyone who's wondering: This character is the Soccer Ball. http://www.fileformat.info/info/unicode/char/26bd/index.htm – usr Feb 09 '14 at 19:45
  • 1
    According to the culture info, that character is equivalent to an empty string. `string.Equals(((char)9917).ToString(), "", StringComparison.CurrentCulture)` also returns `true`. – Lasse V. Karlsen Feb 09 '14 at 19:49
  • Please check this post http://stackoverflow.com/questions/11467424/somestring-indexofsomestring-returns-1-instead-of-0-under-net-4 also – andrewpey Feb 09 '14 at 19:55
  • 2
    Also returns true for 9918 and 9919, but not 9916 or 9920. – Blorgbeard Feb 09 '14 at 19:57
  • 1
    Funny: `" ".Contains(((char)9917).ToString()) //false` :-) – dognose Feb 09 '14 at 20:04
  • @usr And it says "Unicode 5.2.0", which quite possibly means it's too new for Windows to recognise. –  Feb 09 '14 at 20:05
  • 1
    "a".EndsWith(((char)9917).ToString()) also returns true. As @LasseV.Karlsen points out that character equals an empty string so it is the same as 'string.EndsWith("")'. – Doug Feb 09 '14 at 20:05
  • @hvd does that mean they are still adding these stupid characters?! I always thought they were a practical joke from the early days of Unicode. – usr Feb 09 '14 at 20:07
  • @usr It seems so. The other characters mentioned here are Unicode 5.1 characters, and Windows 7 supports that, which explains why they work. –  Feb 09 '14 at 20:08
  • 1
    I'm literally waiting for a ManInABlackHatThatRidesAThreeleggedHorseWithOneShoe Unicode character. (btw: have fun: https://plus.google.com/109925364564856140495/posts) – quetzalcoatl Feb 09 '14 at 20:17

2 Answers

3

.NET Framework 4 on Windows 7 includes support for Unicode 5.1:

The culture-sensitive sorting and casing rules used in string comparison depend on the version of the .NET Framework. In the .NET Framework 4, sorting, casing, normalization, and Unicode character information is synchronized with Windows 7 and conforms to the Unicode 5.1 standard.

The character you're using was added in Unicode 5.2, so it is unlikely to behave correctly in any function other than those that compare characters purely by their numeric values.

You should see different behaviour (but I cannot test it right now) on Windows 8, and .NET 4.5: according to the documentation, in that case, Unicode 6.0 is supported. According to Thomas Levesque in the comments, contrary to the documentation, this has not been changed in later versions.
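Since the result depends on the runtime and Windows version, one way to find out how your own environment treats a character is to compare it against the empty string with a culture-sensitive comparison. The following is only a minimal sketch: the class and method names (`IgnorableCheck`, `IsIgnoredByCulture`) are made up for illustration, and no output is shown because it differs between platforms.

```csharp
using System;
using System.Globalization;

public static class IgnorableCheck
{
    // Hypothetical helper: returns true when a non-empty string compares
    // equal to "" under the given culture's comparison rules, i.e. the
    // culture-sensitive comparison effectively ignores it.
    public static bool IsIgnoredByCulture(string text, CultureInfo culture)
    {
        if (string.IsNullOrEmpty(text))
        {
            return false; // an empty string is trivially equal to ""
        }
        return culture.CompareInfo.Compare(text, string.Empty) == 0;
    }

    public static void Main()
    {
        // Code points mentioned in the comments above.
        foreach (int codePoint in new[] { 9916, 9917, 9918, 9919, 9920 })
        {
            string s = ((char)codePoint).ToString();
            // Result is platform dependent, so it is printed rather than assumed.
            Console.WriteLine("U+{0:X4} ignored: {1}", codePoint,
                IsIgnoredByCulture(s, CultureInfo.CurrentCulture));
        }
    }
}
```

If the helper reports `true` for a character, `EndsWith` and `StartsWith` without a `StringComparison` argument will treat it like an empty string, which matches the behaviour in the question.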

2

As mentioned in the comments, the `EndsWith` method uses the current culture if no `StringComparison` value is provided.

You can get it working by using an ordinal comparison:

" ".EndsWith(((char)9917).ToString(), StringComparison.Ordinal); //false

(An ordinal comparison ultimately compares the numeric values of the UTF-16 code units to determine equality, without applying any culture-specific rules.)
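To make the difference concrete, here is a small sketch that puts the two comparison modes side by side. The wrapper names (`OrdinalVsCulture`, `EndsWithOrdinal`, `EndsWithInvariant`) are assumptions introduced for this example only. The ordinal results are fixed; the culture-sensitive result is deliberately left without an expected value because it varies with the .NET and operating system version.

```csharp
using System;

public static class OrdinalVsCulture
{
    // Ordinal: compares UTF-16 code unit values, no culture data involved.
    public static bool EndsWithOrdinal(string text, string suffix)
    {
        return text.EndsWith(suffix, StringComparison.Ordinal);
    }

    // Culture-sensitive (invariant culture): uses linguistic comparison rules,
    // so characters unknown to the sorting tables may be ignored.
    public static bool EndsWithInvariant(string text, string suffix)
    {
        return text.EndsWith(suffix, StringComparison.InvariantCulture);
    }

    public static void Main()
    {
        string ball = ((char)9917).ToString(); // U+26BD

        Console.WriteLine(EndsWithOrdinal(" ", ball));        // False
        Console.WriteLine(EndsWithOrdinal("goal" + ball, ball)); // True
        Console.WriteLine(EndsWithOrdinal(" ", ""));          // True: every string ends with ""

        // Platform dependent: True where the character is treated as ignorable.
        Console.WriteLine(EndsWithInvariant(" ", ball));
    }
}
```

Passing the `StringComparison` argument explicitly also documents the intent at the call site, so a reader does not have to remember which default each overload uses.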

dognose
  • Beware that an ordinal comparison is almost surely wrong if the strings aren't normalised, and even if they are normalised, the comparison may not be what's intended (or, in other cases, it may well be). –  Feb 09 '14 at 20:18