
We have a console application that reads a fixed-length file, maps some of the data fields, and writes a pipe-delimited file.

The fixed-length file contains some Portuguese characters. When we read that field and write it out to the pipe-delimited file, the text gets mangled.

See example below:

Input (fixed-length file):

ATUAL.DE PU POS. BANC. POSIT C.PROPRIA LONGO PZO DISPONIVEL PARA VENDA CDB PàS P SNA

Output (pipe-delimited file):

ATUAL.DE PU POS. BANC. POSIT C.PROPRIA LONGO PZO D|ISPONIVEL PARA VENDA CDB P�S P SNA

Below is how I am writing it to the file. Do I need to specify the encoding here?

    using (StreamWriter writer = new StreamWriter(transOutput))
    {
        writer.WriteLine(line);
    }
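
To clarify what I mean by "use the encoding": I assume it would be something like passing an `Encoding` to the `StreamWriter` constructor, as below. UTF-8 without a BOM here is only a guess at what the consumer of the pipe-delimited file expects:

    // Same write as above, but with an explicit encoding on the writer.
    // new UTF8Encoding(false) = UTF-8 without a byte-order mark (an assumption on my part).
    using (StreamWriter writer = new StreamWriter(transOutput, false, new System.Text.UTF8Encoding(false)))
    {
        writer.WriteLine(line);
    }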

Is there any way to handle these characters correctly?

user565992
  • *Is there any way to handle these characters.* yes, by reading them in the same encoding as you write them – Selvin Dec 12 '19 at 15:36
  • The encoding is wrong. Use ISO-8859-1. See : https://dev.mysql.com/doc/dev/connector-net/8.0/html/T_MySql_Data_MySqlClient_MySqlDataAdapter.htm – jdweng Dec 12 '19 at 15:38
  • Why does it have +1? There is no code; there is no chance to answer other than guessing that he used the wrong encoding – Selvin Dec 12 '19 at 15:41
  • @jdweng No. Use UTF-8 instead of an obsolete Mickey Mouse encoding. And the link you provided is irrelevant to the question, which has nothing to do with databases, let alone a specific one (MySQL). – Konrad Rudolph Dec 12 '19 at 15:47
  • I pasted the wrong page (https://community.atlassian.com/t5/Jira-questions/File-encoding-for-Portuguese-Brazilian/qaq-p/680609). You need to use an encoding that prints Portuguese characters correctly. UTF-8 will not do. – jdweng Dec 12 '19 at 15:52
  • And how are you reading this file? – Selvin Dec 12 '19 at 15:54
  • @selvin - using (var reader = new TextFieldParser(fileToProcess)) { – user565992 Dec 12 '19 at 16:08
  • OK, but what encoding is `TextFieldParser` using? And what is it? Is it `Microsoft.VisualBasic.FileIO.TextFieldParser`? If so, then it should work OK, because both `StreamWriter` and `Microsoft.VisualBasic.FileIO.TextFieldParser` use UTF-8 as the default encoding – Selvin Dec 12 '19 at 16:09
  • 1
    @jdweng Where do you get this from? Of course UTF-8 will do, UTF-8 can represent *all* Unicode characters. By contrast, ISO-8859-1 was already obsolete 20 years ago, and is usually implemented incorrectly. It should **never** be used nowadays. – Konrad Rudolph Dec 12 '19 at 16:10
  • If you are converting a byte array to a string, you need to use the correct encoding, otherwise the characters will be mapped wrong. – jdweng Dec 12 '19 at 16:20
  • @jdweng - while reading the file in step 1, or when writing it to a new file? I tried using that other encoding; it is not helping – user565992 Dec 12 '19 at 18:29
  • Neither. The encoding is important when converting from a byte array to a string (or the opposite). Once it is a string, it can only be fixed by going back to a byte array and then to a string again. Character values 0x00 to 0x7F are always the same. The encoding takes the values 0x80 to 0xFF and maps them to two-byte Unicode characters. So once a value in the 0x80 to 0xFF range is mapped to the wrong character, it can't easily be fixed. – jdweng Dec 12 '19 at 19:56
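
Putting the comments together (read the file with the encoding it was actually written in, then write with an explicit encoding), a minimal sketch might look like the following. The ISO-8859-1 guess for the input encoding and the field widths are assumptions; `fileToProcess` and `transOutput` are the variables already mentioned above:

    using System.IO;
    using System.Text;
    using Microsoft.VisualBasic.FileIO;

    // Read the fixed-length file with the encoding it was actually produced in.
    // ISO-8859-1 is an assumption; it could just as well be Windows-1252 or UTF-8.
    using (var reader = new TextFieldParser(fileToProcess, Encoding.GetEncoding("ISO-8859-1")))
    {
        reader.TextFieldType = FieldType.FixedWidth;
        reader.SetFieldWidths(20, 40, 10); // hypothetical field widths

        // Write the pipe-delimited output with an explicit encoding as well.
        using (var writer = new StreamWriter(transOutput, false, new UTF8Encoding(false)))
        {
            while (!reader.EndOfData)
            {
                string[] fields = reader.ReadFields();
                writer.WriteLine(string.Join("|", fields));
            }
        }
    }

If the downstream system expects a single-byte code page instead of UTF-8, the same `Encoding.GetEncoding(...)` call can be passed to the `StreamWriter`; the important part is that the reader's encoding matches the input file and the writer's encoding matches whatever consumes the output.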

0 Answers