66

I have two strings.

One is "\""

and the other is "\""

I think they are the same.

However, String.Compare says they are different.

This is very strange.

Here's my code:

string b = "\"";
string c = "\"";

if (string.Compare(b, c) == 0)
{
    Console.WriteLine("Good");
}

if (c.StartsWith("\""))
{
    Console.WriteLine("C");
}

if (b.StartsWith("\""))
{
    Console.WriteLine("B");
}

I expected it to print "Good", "C", and "B".

However, it only prints "B".

In my debugger, c[0] is 65279 '' and c[1] is 34 '"', while b[0] is '"'.

But I don't know what 65279 '' is.

Is it an empty character?

Ronan Boiteau
장선민
  • Where does your string come from? You're probably reading it wrong. – SLaks Jul 22 '11 at 02:13
  • 4
    It very commonly appears as the first character in a utf-16 encoded text file. Use StreamReader, not FileStream. – Hans Passant Jul 22 '11 at 04:19
  • This is very likely related to this excellent answer/explanation here (TL;DR use `StreamReader` if the `string` was loaded from a `Stream`, use `Encoding.GetString()` if it was loaded from `Encoding.GetBytes()`; do not mix the two): https://stackoverflow.com/a/11701560/7293142 – CajunCoding Aug 05 '22 at 23:38
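
The comments above suggest the stray character comes from decoding a file's bytes without consuming its byte-order mark. A minimal sketch of that difference, assuming the input is UTF-8 with a BOM; the class name `BomReadDemo`, its method names, and the sample bytes are hypothetical and chosen only for illustration:

```csharp
using System;
using System.IO;
using System.Text;

public static class BomReadDemo
{
    // Assumed sample input: UTF-8 BOM (EF BB BF) followed by one double quote (0x22).
    public static readonly byte[] Sample = { 0xEF, 0xBB, 0xBF, 0x22 };

    // Decodes the raw bytes directly. GetString does not skip the BOM,
    // so the result starts with U+FEFF (65279).
    public static string DecodeRaw(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Reads the same bytes through a StreamReader, which detects and
    // consumes a leading BOM by default.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        Console.WriteLine(DecodeRaw(Sample).Length);        // 2: U+FEFF plus the quote
        Console.WriteLine(DecodeWithReader(Sample).Length); // 1: just the quote
    }
}
```

If the string in the question was built along the lines of `DecodeRaw`, switching to a reader-based path would likely make the extra character disappear without any manual trimming.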

5 Answers

97

It's U+FEFF (decimal 65279), a zero-width no-break space.
It's more commonly used as a byte-order mark (BOM).

SLaks
  • 9
    How can I remove that char when I cannot be sure whether the string starts with '' or not? – 장선민 Jul 22 '11 at 02:07
  • Thank you so much! I was hitting the wall till I found your solution! – Adrian Grzywaczewski Nov 02 '17 at 21:15
  • 1
    Great response, which led me to keep looking and find this excellent answer/explanation here (TL;DR use `StreamReader` if the `string` was loaded from a `Stream`, use `Encoding.GetString()` if it was loaded from `Encoding.GetBytes()`; do not mix the two): https://stackoverflow.com/a/11701560/7293142 – CajunCoding Aug 05 '22 at 23:36
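
For the follow-up question about removing the character when it may or may not be present, one possible approach is to compare the first char against the escape `'\uFEFF'` (65279) and drop it only on a match. This is a sketch, not part of the answer above; `BomHelper` and `StripLeadingBom` are hypothetical names:

```csharp
using System;

public static class BomHelper
{
    // U+FEFF written as an escape, so no invisible character has to be
    // pasted into the source file.
    private const char Bom = '\uFEFF';

    // Returns the text without a single leading BOM; any other input
    // (including null or empty) is returned unchanged.
    public static string StripLeadingBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == Bom)
        {
            return text.Substring(1);
        }
        return text;
    }

    public static void Main()
    {
        string c = "\uFEFF\"";               // same content the debugger showed for c
        string cleaned = StripLeadingBom(c);

        Console.WriteLine(cleaned.Length);   // 1
        // Ordinal comparison looks at the raw char values only.
        Console.WriteLine(string.Equals(cleaned, "\"", StringComparison.Ordinal)); // True
    }
}
```

An empty literal `''` does not compile in C#, so the escape form is the practical way to name this character in a comparison. The char comparison `text[0] == Bom` is an exact code-unit check and does not depend on the current culture.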
9

If you are using Notepad++, try converting the file to UTF-8 (no BOM), and make sure all the files in your project use the same encoding.

kurdtpage
5

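You can remove it with Trim, which strips the listed characters from both ends of the string:

// c is the string from the question; U+200B is a zero-width space.
c = c.Trim(new char[] { '\uFEFF', '\u200B' });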
Victor
4

If you are reading from a file that you have opened in Notepad, Notepad may have added the BOM; it is one of several programs notorious for doing so.

Dan Witkowski
  • How can I remove that char when I cannot be sure whether the string starts with '' or not? – 장선민 Jul 22 '11 at 02:05
  • Notepad and other programs are saving UTF8 files, which is a valid and common format. The BOM only bothers you when you read the file with the wrong encoding. – SLaks Jul 22 '11 at 02:14
  • I want to check whether '' is present by testing c[0] == '', but that does not compile. – 장선민 Jul 22 '11 at 02:31
-1

It is a byte order mark (BOM). A BOM is a special marker at the beginning of a file that indicates the byte order of the text data in the file.

We can remove the BOM in JavaScript using the following code:

    function removeBOM(jsonString) {
        if (jsonString.charCodeAt(0) === 0xfeff) {
            jsonString = jsonString.slice(1);
        }
        return jsonString;
    }
Sandeep