I read strings from a text file, and among them is this one: "15121 ♥☺020 000000/=n531☻".

I use the .Contains() method to spot the ♥, ☺ and ☻ symbols in the string, but it doesn't find them. I also tried \u2665, \u263a and \u263b as arguments to .Contains(), but none of them were found either. I then copied the string from the console output window and pasted it into my code to compare the symbols one by one.

string s = "15121 ♥☺020 000000/=n531☻"; // the same with "15121 \u2665\u263a020 000000/=n531\u263b"
for (int j = 0; j < s.Length; j++)
{
    Console.WriteLine($"{line[j]} == {s[j]}: {line[j].Equals(s[j])}");
}

This is what I got:

(screenshot of the comparison output)

What may be wrong and how do I recognize those symbols?

UPDATE: The input file I read strings from is an ordinary text file, and the strings inside look like this (txt opened in Notepad): (screenshot of the file opened in Notepad)

As you can see, there is THE string among the others.

I don't specify any encoding when reading the txt. To show how I read the file and what line is, here is my code:

    string[] lines = File.ReadAllLines(path + target_file_name);
    var list = new List<string>(lines);
    for (int i = 0; i < list.Count; i++)
    {
        string line = list[i];
        if (line.Length > 21)
        {
            Console.WriteLine(line);
            if (line.Contains("/=n"))  // used just to catch THE string
            {
                // Dump the UTF-16 bytes of the line read from the file
                var line_b = Encoding.Unicode.GetBytes(line);
                Console.WriteLine($"{line_b} : line = {line}");
                foreach (byte m in line_b)
                {
                    Console.Write(m + " ");
                }

                // Dump the UTF-16 bytes of the literal pasted from the console
                string s = "AAXX 15121 ♥☺020 000000/=n531☻";
                Console.WriteLine();
                var s_b = Encoding.Unicode.GetBytes(s);
                Console.WriteLine($"{s_b} : s = {s}");
                foreach (byte n in s_b)
                {
                    Console.Write(n + " ");
                }
                Console.WriteLine();

                // Compare char by char; stop at the shorter string to stay in range
                for (int j = 0; j < Math.Min(line.Length, s.Length); j++)
                {
                    Console.WriteLine($"{line[j]} == {s[j]}: {line[j].Equals(s[j])}");
                }
            }
        }
    }

Reading all the lines from the txt and converting them to a List is a must for me. Thus, line is one string line from the initial txt file.

I have dumped the bytes of the input text string with var b = Encoding.Unicode.GetBytes(line); and compared them with my literal, as @pm100 suggested. Here is the result; I'm not quite sure what it tells me: (screenshot of the byte dump)
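
A byte dump like this is easier to read when each char is printed as its UTF-16 code unit instead of its glyph. The following is only a minimal sketch under an assumption that the unpublished file cannot confirm: that the symbols are the control characters U+0001, U+0002 and U+0003, which a Windows console using code page 437 draws as ☺, ☻ and ♥. If so, the file would not contain U+263A, U+263B or U+2665 at all. The names CharDump, DumpCodeUnits and ContainsControlChars are hypothetical helpers, not part of the code above.

```csharp
using System;
using System.Linq;

public static class CharDump
{
    // Returns each UTF-16 code unit of the text as "U+XXXX", space-separated.
    public static string DumpCodeUnits(string text) =>
        string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));

    // True if the text contains any control character (for example U+0001..U+0003).
    public static bool ContainsControlChars(string text) =>
        text.Any(char.IsControl);

    public static void Main()
    {
        // Assumption: what the file might really hold (control characters).
        string fromFile = "15121 \u0003\u0001020 000000/=n531\u0002";
        // The literal pasted from the console (real symbol code points).
        string literal = "15121 \u2665\u263a020 000000/=n531\u263b";

        Console.WriteLine(DumpCodeUnits(fromFile));
        Console.WriteLine(DumpCodeUnits(literal));

        // Searching for the control character itself, not for the glyph.
        Console.WriteLine(fromFile.IndexOf('\u0003') >= 0);
        Console.WriteLine(ContainsControlChars(literal));
    }
}
```

If the dump of line shows values such as U+0003 where the console draws ♥, then searching for '\u0003' (rather than '\u2665') would be the thing to try.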

I'm sorry, I'm not willing to publish my code, and I may not understand some of your suggestions, as I'm not very proficient in C# (or coding in general). So I would appreciate any further help as it is, if possible.

Outlaw
  • How are you reading the text file, and are you specifying an encoding for the data? – stuartd Mar 19 '22 at 18:14
  • Dump the bytes of the input text string `var b = Encoding.Unicode.GetBytes(line);` and compare with your literal – pm100 Mar 19 '22 at 18:18
  • Your code does not compile so the issue is not reproducible. What does `line` contain? Now we only can have educated guesses. Like [this one](https://stackoverflow.com/q/64833645/5114784) (`Contains` and `IndexOf` might have the same problem). – György Kőszeg Mar 19 '22 at 18:19
  • Try to dump out the codes of both sides of your comparisons; not all codes show different glyphs. – TaW Mar 19 '22 at 20:11
  • Please upload the file for us to see too. I took the line from your question and pasted it as the `line` variable but it's identical to the line in `s` - https://dotnetfiddle.net/5Vd1lB – Caius Jard Mar 20 '22 at 06:07
  • *I copied the string (from console output window) and pasted it into my code to compare symbols one by one* - it's generally a terrible idea, if you're trying to preserve data integrity, to copy data that has been printed. To explain it better, would you expect to be able to open an exe in notepad, Ctrl A, Ctrl C, go to a new notepad, Ctrl V, save as my2.exe and it be able to run? – Caius Jard Mar 20 '22 at 06:11
  • You can upload the original file to e.g. github – Caius Jard Mar 20 '22 at 06:13
  • Thank you all for your suggestions, I've updated the question with some of the answers. – Outlaw Mar 20 '22 at 10:41
  • One problem is that ❤️ is not a single char, but two – Hans Kesting Mar 20 '22 at 12:36

0 Answers