I am building an app that downloads a CSV file in plain text from an e-mail server and writes it to the local file system. I am developing this app in C# using .NET Core 3.1.
The problem is that I don't know the encoding of the files I receive, so I decided to use the StreamReader class to convert the bytes downloaded from the e-mail to a string.
Here is the code:
    foreach (var data in loadedData)
    {
        if (IsValidData(data))
        {
            logger.Info($"Writing data from: {data.FileName}");
            using var stream = new MemoryStream(data.FileContent);
            using var reader = new StreamReader(stream, true);
            var csvData = new CSVData
            {
                FileName = data.FileName,
                FileContent = reader.ReadToEnd(),
            };
            dataWriter.WriteData(csvData);
            logger.Info($"Writing data from: {data.FileName} was successfully written");
        }
        else
        {
            logger.Warn($"Invalid format: {data.FileName}");
        }
    }
To write the data to the actual files, I am using:
    public void WriteData(CSVData data)
    {
        logger.Debug($"Writing received file: {data.FileName}");
        var outputDir = config.GetReceivedFilesPath();
        string fileName = this.GetOutputPath(data.FileName, outputDir);
        Directory.CreateDirectory(outputDir);
        using var writer = new StreamWriter(fileName, false, Encoding.UTF8);
        writer.Write(data.FileContent);
        logger.Debug($"The received data was successfully written to: {data.FileName}");
    }
The problem is that some of the files I receive are encoded in UTF-16 (I believe this is the encoding being used, because there is a \0 after each char), but the StreamReader interprets these files as UTF-8: after reading, the reader.CurrentEncoding property returns UTF-8.
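To illustrate, here is a minimal sketch that I believe reproduces the behaviour. It assumes the problem files are UTF-16 LE without a byte order mark, which I have not confirmed for the real files. The EncodingRepro class and the "a,b" sample are hypothetical and only mirror the StreamReader call from the loop above.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class EncodingRepro
{
    // Decodes the bytes the same way as the loop above:
    // a StreamReader with BOM detection enabled and UTF-8 as the fallback.
    public static (string Text, string EncodingName) Decode(byte[] bytes)
    {
        using var stream = new MemoryStream(bytes);
        using var reader = new StreamReader(stream, true);
        string text = reader.ReadToEnd();
        // CurrentEncoding only reflects the detection result after the first read.
        return (text, reader.CurrentEncoding.WebName);
    }

    public static void Main()
    {
        // Hypothetical sample: "a,b" as UTF-16 LE *without* a BOM.
        byte[] noBom = Encoding.Unicode.GetBytes("a,b");
        var (text1, name1) = Decode(noBom);
        // Prints "utf-8: 6 chars, contains NUL: True"
        Console.WriteLine($"{name1}: {text1.Length} chars, contains NUL: {text1.Contains('\0')}");

        // The same sample with the UTF-16 LE BOM (FF FE) prepended.
        byte[] withBom = Encoding.Unicode.GetPreamble().Concat(noBom).ToArray();
        var (text2, name2) = Decode(withBom);
        // Prints "utf-16: 3 chars, contains NUL: False"
        Console.WriteLine($"{name2}: {text2.Length} chars, contains NUL: {text2.Contains('\0')}");
    }
}
```

In this sketch the reader only switches away from UTF-8 when the bytes start with a byte order mark. Without one, every \0 byte is decoded as a separate NUL character, which matches what I see in my strings.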
The end result is that instead of writing my files as UTF-8, my app appears to write them as UTF-16, even though I explicitly specified UTF-8 as the output encoding.
What am I doing wrong?