
Why and how does string.Substring treat "\u0002" as a single character? I understand that "\u0002" is the control character STX.

  1. \u introduces a Unicode escape sequence.
  2. Character and string processing in C# uses Unicode encoding. The char type represents a UTF-16 code unit, and the string type represents a sequence of UTF-16 code units.

The code checks whether the prefix and suffix are correct. The data length does not matter.

The prefix is STX and the suffix is ETX; both are added to the data string.

How can I do this (code below) explicitly and unambiguously?

    string stx = "\u0002";
    string etx = "\u0003";
    string ReceivedData = stx + "1122334455" + etx;
    
    string prefix = ReceivedData.Substring(0, 1);
    string suffix = ReceivedData.Substring(ReceivedData.Length - 1, 1);
marc_s
Artur
  • I'm not sure I understand the question. What is a "one sign"? It's not interpreted as character `1`, it's interpreted as [control character `STX` (start of text)](https://www.fileformat.info/info/unicode/char/0002/index.htm). – CherryDT Apr 17 '21 at 13:52
  • Your string is `␂1122334455␃` at the end (here with the invisible control characters replaced by visible equivalent representations) - an STX character followed by 10 characters `1122334455` followed by an ETX character, 12 characters in total. – CherryDT Apr 17 '21 at 13:57
  • Ah I see you edited your question. `\u0002` is just a representation for a single STX character, just like `\u0041` would be another representation for the character `A`, or `\n` would be a representation for a newline `␊` (which you could also write as `\u000a`). In memory it's not literally `\ ` `u` `0` `0` `0` `2`, it's just one character `␂`. – CherryDT Apr 17 '21 at 14:08
  • If you're looking for a reference, `\u` is an *escape sequence.* See [here](https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/). – Robert Harvey Apr 17 '21 at 14:21
  • So by adding \u to the string, I tell the compiler to interpret what follows as a Unicode character? More or less :) Thanks – Artur Apr 17 '21 at 14:22
  • https://csharpindepth.com/articles/Strings, see "Literals" – CherryDT Apr 17 '21 at 14:22
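
The point made in the comments above can be checked directly. This is a minimal sketch, assuming a .NET console program with top-level statements (C# 9 or later); the variable names are only illustrative.

```csharp
using System;

// "\u0002" is an escape sequence: the compiler stores a single
// UTF-16 code unit in the string, not the six characters typed in source.
string stx = "\u0002";

Console.WriteLine(stx.Length);        // 1
Console.WriteLine((int)stx[0]);       // 2
Console.WriteLine("\u0041" == "A");   // True

// A verbatim string (@"...") does not process escape sequences,
// so this one really does contain six characters.
string notEscaped = @"\u0002";
Console.WriteLine(notEscaped.Length); // 6
```

Because the escaped string has length 1, `Substring(0, 1)` on the received data returns exactly the STX character.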

1 Answer


Are you wondering how UTF-16 and Unicode work? This topic may help: What is Unicode, UTF-8, UTF-16?

The code snippet looks reasonable: the variables are explicitly named, and \u introduces a Unicode escape sequence, so "\u0002" and "\u0003" are each a single character.

    string stx = "\u0002";
    string etx = "\u0003";

    string prefix = ReceivedData.Substring(0, 1);
    string suffix = ReceivedData.Substring(ReceivedData.Length - 1, 1);
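
If the goal is to make the check as explicit as possible, one option is to compare single `char` values instead of one-character substrings. The sketch below is only a suggestion, not something from the question: `Stx`, `Etx`, and the helper `IsFramed` are hypothetical names, and it assumes a .NET console program with top-level statements (C# 9 or later).

```csharp
using System;

const char Stx = '\u0002'; // STX, start of text
const char Etx = '\u0003'; // ETX, end of text

// Hypothetical helper: true when the data starts with STX and ends with ETX.
// The length check ensures the prefix and suffix are two separate characters.
bool IsFramed(string data) =>
    data.Length >= 2 && data[0] == Stx && data[data.Length - 1] == Etx;

string receivedData = Stx + "1122334455" + Etx;

Console.WriteLine(IsFramed(receivedData)); // True
Console.WriteLine(IsFramed("1122334455")); // False
```

Indexing with `data[0]` reads one UTF-16 code unit, which is the same unit that `Substring(0, 1)` returns, but it avoids allocating a new string and makes the comparison against a named constant obvious.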
Jeff