asp.net core running on docker not encoding Latin characters correctly

Question

ASP.NET Core 2.0 Web API
running in a Docker container using the official Microsoft Docker image (microsoft/aspnetcore)

Code:

 [HttpGet]
 [Route("test")]
 public IActionResult Get()
 {
    return Ok("Sedán");
 }

Problem:

The word Sedán gets encoded to Sedï¿½n when running in Docker. On Windows it gets encoded to SedÃ¡n, which is correct.

score 0 · Answer 1 · answered Mar 04 '21 at 17:08

I know this post is 3 years old, but this could help future developers who run into this kind of problem.

After a little research, I found out that strings in .NET are encoded as UTF-16.

"It depends where the string 'came from'. A .NET string is Unicode (UTF-16). The only way it could be different if you, say, read the data from a database into a byte array.".

So my suspicion is that if the environment decodes a string's bytes with an encoding other than the one that produced them, characters such as á come out garbled. For example, taking the UTF-16 bytes from Encoding.Unicode and decoding them with Encoding.Default gives a messy string:

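using System;
using System.Text;

public static class Program
{
    public static void Main()
    {
        string testString = "Sedán";
        Console.WriteLine(Utf16ToUnicode(testString));
    }

    public static string Utf16ToUnicode(string utf16String)
    {
        // Encoding.Unicode is UTF-16 LE, so this yields the string's UTF-16 bytes
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(utf16String);
        // Source and target encodings are identical, so Convert returns the same bytes
        byte[] unicodeBytes = Encoding.Convert(Encoding.Unicode, Encoding.Unicode, utf16Bytes);

        // Decoding UTF-16 bytes with Encoding.Default (UTF-8 on .NET Core and later)
        // is the mismatch that garbles the text
        return Encoding.Default.GetString(unicodeBytes);
    }
}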

The output: Sed�n
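To isolate what goes wrong, here is a minimal sketch that contrasts a mismatched decode with a matching one. The class and method names (EncodingMismatchDemo, DecodeUtf16BytesAsUtf8, RoundTripUtf16) are made up for illustration, and the literal uses the \u00E1 escape so the result does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Encode as UTF-16 LE (Encoding.Unicode), then decode with a different encoding (UTF-8).
    public static string DecodeUtf16BytesAsUtf8(string text)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);
        return Encoding.UTF8.GetString(utf16Bytes);
    }

    // Encode and decode with the same encoding: the text survives unchanged.
    public static string RoundTripUtf16(string text)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);
        return Encoding.Unicode.GetString(utf16Bytes);
    }

    public static void Main()
    {
        string original = "Sed\u00E1n"; // "Sedán"

        // The lone 0xE1 byte is not valid UTF-8, so the decoder substitutes U+FFFD.
        Console.WriteLine(DecodeUtf16BytesAsUtf8(original).Contains("\uFFFD")); // True

        // Matching encodings round-trip cleanly.
        Console.WriteLine(RoundTripUtf16(original) == original); // True
    }
}
```

The point of the sketch is that the replacement character comes from the decoder not matching the bytes, not from UTF-16 being unable to represent á.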

I had a similar problem. My Docker container was running a Debian 10 image which, according to this article, has no default locale set. I don't know what other implications this has, but in my case the replacement character (�) appeared when I tried to render a currency unit held in a UTF-16 string. To solve it, I used the .NET resource manager to obtain the value as UTF-8. (Note: I could have converted the UTF-16 string to UTF-8 programmatically, along the lines of the code example above, but that is a costly operation.)
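For reference, a programmatic conversion to UTF-8 could look roughly like the sketch below. It assumes the text is written to the console, and the euro sign is only a stand-in for the currency unit, which the answer does not name. Utf8ConversionDemo and ToUtf8Bytes are hypothetical names.

```csharp
using System;
using System.Text;

public static class Utf8ConversionDemo
{
    // Convert a .NET (UTF-16) string into its UTF-8 byte representation.
    public static byte[] ToUtf8Bytes(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    public static void Main()
    {
        // Assumption: output goes to the console. Asking for UTF-8 explicitly
        // avoids relying on whatever locale the container image provides.
        Console.OutputEncoding = Encoding.UTF8;

        string currency = "\u20AC"; // euro sign, used here only as an example

        // The euro sign takes three bytes in UTF-8.
        Console.WriteLine(BitConverter.ToString(ToUtf8Bytes(currency))); // E2-82-AC
        Console.WriteLine("10 " + currency);
    }
}
```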

score 0 · Answer 2 · answered Nov 17 '22 at 21:31

I had the same issue with ASP.NET on .NET 6 in Docker (Alpine, Ubuntu). The app returned � for non-English characters when the string was set in a C# source file, e.g. var str = "Sedán";

The fix is to save the C# file containing the string with a different encoding. Mine was saved as Windows-1250, and changing it to UTF-8 fixed the issue.

For VS 2022, go to File -> Save [File] As, click the caret next to the Save button, then choose Save with Encoding.
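A small sketch of why the file encoding matters, under the assumption that the build on Linux ends up reading the source file as UTF-8. In Windows-1250, á is the single byte 0xE1, which is not valid UTF-8 on its own. The names below (SourceFileEncodingDemo and its members) are hypothetical, and ISO-8859-1 stands in for a client that misreads the UTF-8 response as a single-byte encoding.

```csharp
using System;
using System.Text;

public static class SourceFileEncodingDemo
{
    // "Sedán" as a Windows-1250 file stores it: 0xE1 is the single byte for 'á'.
    public static readonly byte[] LegacyBytes = { 0x53, 0x65, 0x64, 0xE1, 0x6E };

    // What a reader that assumes UTF-8 makes of those bytes.
    public static string ReadAsUtf8()
    {
        return Encoding.UTF8.GetString(LegacyBytes);
    }

    // Encode a string as UTF-8, then display the bytes as ISO-8859-1,
    // the way a client that ignores the charset might.
    public static string Utf8ViewedAsLatin1(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8);
    }

    public static void Main()
    {
        string damaged = ReadAsUtf8();

        // 0xE1 cannot be decoded, so it becomes U+FFFD.
        Console.WriteLine(damaged == "Sed\uFFFDn"); // True

        // U+FFFD is EF BF BD in UTF-8, which shows as three Latin-1 characters.
        Console.WriteLine(Utf8ViewedAsLatin1(damaged) == "Sed\u00EF\u00BF\u00BDn"); // True

        // An intact "Sedán" is C3 A1 in UTF-8, which shows as two Latin-1 characters.
        Console.WriteLine(Utf8ViewedAsLatin1("Sed\u00E1n") == "Sed\u00C3\u00A1n"); // True
    }
}
```

If that assumption holds, the two garbled forms in the question fall out directly: a string already damaged at compile time produces the three-character form, while an intact string produces the two-character form.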
