I am trying to understand Unicode encoding behaviour and came across the following.

I write a string to a file using Encoding.Unicode:

new StreamWriter(fileName, false, Encoding.Unicode);

I then read from the same file, but intentionally use ASCII:

new StreamReader(fileName, Encoding.ASCII);

When I read the string back with ReadLine, to my surprise it returns the same Unicode string.

I expected the result to contain ? or other characters and to be double the length of the original string.

What is happening here?

Code snippet

string test = "سشصضطظع"; // an arbitrary set of Arabic characters
StreamWriter writer = new StreamWriter(fileName, false, Encoding.Unicode);
writer.Write(test);
writer.Flush();
writer.Close();
StreamReader reader = new StreamReader(fileName, Encoding.ASCII);
string ss = reader.ReadLine();
reader.Close();
// I expect ss to be an ASCII string with double the length of test

If I instead call `new StreamReader(fileName, Encoding.ASCII, false)`, it gives the expected result.

Thanks

HAN
  • The input is already Unicode Arabic characters copied from Character Map. I found that the behaviour is due to the parameter that I am passing. Thanks – HAN May 19 '17 at 13:38
  • Please read [this answer](http://stackoverflow.com/a/700221/2846483). You will note that `Unicode` actually isn't an encoding while `ASCII` is. – dymanoid May 19 '17 at 13:40
  • @dymanoid in .NET, "Encoding.Unicode" _is_ an encoding, namely [UTF-16 little-endian](https://msdn.microsoft.com/en-us/library/system.text.encoding.unicode(v=vs.110).aspx). Not saying I agree with the naming. – CodeCaster May 19 '17 at 13:41
  • @HAN Please write your answer in an answer post and remove it from the question. – Tom Blodget May 19 '17 at 16:15
  • "When I read the string using ReadLine to my surprise it is giving back the same unicode string." I don't believe it really is (unless it's empty). Please provide a [mcve]. – Jon Skeet May 30 '17 at 11:58
  • Not sure where my answer has gone. The code in the question is updated. The parameter detectEncodingFromByteOrderMarks is responsible for the behavior. – HAN May 31 '17 at 06:06

1 Answer

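StreamWriter writes a byte order mark (BOM) at the start of the file, and the StreamReader(fileName, encoding) constructor has BOM detection enabled by default, so the reader switches to the detected encoding and ignores the one passed in. To force the supplied encoding, set the detectEncodingFromByteOrderMarks parameter to false when creating the StreamReader object.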
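
To see the difference side by side, here is a minimal sketch. The class name BomDetectionDemo and the temporary file from Path.GetTempFileName are my own assumptions for illustration and are not taken from the question. It writes the string once, then reads it twice, once with BOM detection left at its default and once with it turned off.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class; the name and the temp file are illustrative assumptions.
class BomDetectionDemo
{
    static void Main()
    {
        string fileName = Path.GetTempFileName(); // scratch file, deleted at the end
        string test = "سشصضطظع";

        // Encoding.Unicode is UTF-16 little-endian; StreamWriter emits its BOM first.
        using (var writer = new StreamWriter(fileName, false, Encoding.Unicode))
        {
            writer.Write(test);
        }

        // BOM detection is on by default, so the ASCII argument is overridden.
        using (var reader = new StreamReader(fileName, Encoding.ASCII))
        {
            string detected = reader.ReadLine();
            Console.WriteLine("Detection on,  same as original: " + (detected == test));
            Console.WriteLine("Encoding actually used: " + reader.CurrentEncoding.WebName);
        }

        // BOM detection off: every byte, including the BOM, is decoded as ASCII.
        using (var reader = new StreamReader(fileName, Encoding.ASCII, false))
        {
            string raw = reader.ReadLine();
            Console.WriteLine("Detection off, same as original: " + (raw == test));
            Console.WriteLine("Lengths (original vs raw): " + test.Length + " vs " + raw.Length);
        }

        File.Delete(fileName);
    }
}
```

With detection off, the raw string should come out longer than the original, because each UTF-16 code unit is read as two separate ASCII bytes and the BOM bytes are decoded as well.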

HAN