
I'm working with a large text file filled with data. The data blocks in it are split by a symbol (or a pair of similar symbols) that looks rather strange. I need to find out what symbol this is so that I can properly (!) use it to split the data blocks when I read the file. Could you assist me with that?

Here is how the pair of symbols looks in the Stack Overflow "Ask Question" editing field:

Below are some pictures showing how differently the symbol renders from place to place:

In original data file

enter image description here

In the Brackets editor (it looks the same with every available encoding)

enter image description here

In Brave Browser search bar

enter image description here

In Visual Studio 2019

enter image description here

In the Stack Overflow editing field (it looks different while I type than it does in the posted question)

enter image description here

In some places it is converted to one of the following

enter image description here

When I read the symbol in C# with Encoding.UTF8, the console gives the following result: enter image description here

But with Encoding.Unicode, the console gives an endless stream of something like this:

enter image description here

What exactly do I have to write to make my C# code recognize and react to those symbols?

Outlaw

1 Answer


I used this Unicode character finder to find out what the characters are.

In order, they are:

U+0003 : END OF TEXT [ETX]

U+0001 : START OF HEADING [SOH]
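
As a minimal sketch of how these two code points could be used as a block separator in Python: the names `SEPARATOR`, `split_blocks`, and `read_blocks` are hypothetical, and the sketch assumes the pair always appears in the order ETX then SOH and that the file is UTF-8 encoded, neither of which is confirmed by the question.

```python
# U+0003 (ETX) followed by U+0001 (SOH), written as escape sequences
# because the characters themselves are invisible in most editors.
# Assumption: the separator is always this exact pair, in this order.
SEPARATOR = "\x03\x01"


def split_blocks(text, separator=SEPARATOR):
    """Return the non-empty data blocks found between separators."""
    return [block for block in text.split(separator) if block]


def read_blocks(path, encoding="utf-8"):
    """Read a whole file and split it into data blocks.

    Assumption: the file is UTF-8; pass another encoding if it is not.
    """
    # newline="" leaves line endings inside the blocks untouched.
    with open(path, encoding=encoding, newline="") as handle:
        return split_blocks(handle.read())


sample = "first block\x03\x01second block\x03\x01third block"
print(split_blocks(sample))  # ['first block', 'second block', 'third block']
```

In C#, the same pair can be written as the string literal `"\u0003\u0001"` and passed to `string.Split` or `string.Contains`.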

sntrenter
  • Just noticed you can't see the characters in the text field above; it looks like you have to go into edit mode to copy/paste or even just see them. I think these are some of the few "zero width" characters in Unicode. – sntrenter Oct 30 '20 at 15:29
  • Oh, thank you! Could you please give me a hint on how to actually use it? For example, I read a line from the file, and if it contains the symbols, I break. How do I write them in my code (Python or C#)? – Outlaw Oct 30 '20 at 15:32
  • Python can print Unicode characters natively. This post details how to do it: https://stackoverflow.com/questions/10569438/how-to-print-unicode-character-in-python – sntrenter Oct 30 '20 at 15:44
  • I got some results with C#; I'm going to add them to the question... – Outlaw Oct 30 '20 at 15:54
  • https://stackoverflow.com/questions/3162116/outputting-a-unicode-character-in-c-sharp – sntrenter Oct 30 '20 at 16:06