1

I'm trying to implement a simple data import to my MVC application from .csv file.

To remove the need for user to save .csv file template in his computer and from going through the trouble of getting it in the first place, I created a button on my form that generates the template.

I generate a string from my model object in a writer class:

this.Writer.GetCswTemplate() returns string from viewModel (i.e. column1;colum2\r\n)

As per How to GetBytes() in C# with UTF8 encoding with BOM?

I force excel to open .csv file with UTF-8 encoding:

var templateResult = Encoding.UTF8.GetBytes(this.Writer.GetCswTemplate());
var preamble = Encoding.UTF8.GetPreamble();
var templateBytes = preamble.Concat(templateResult).ToArray();

to send the generated template to the user I use MVC File() helper:

return this.File(templateBytes, "application/csv", "filename.csv");

It works great, it generates the template, returns it to the user, opens it with Excel and shows all the special characters in it. If I open the generated file in Notepad++ I can see that it's encoding is UTF-8.

The problem occurs when a user fills the generated template and saves it inside Excel. For some reason Excel decides to change file encoding to ANSI.

Is there any way for me to prevent that? Did I miss something (Add some kind of header or something)?

Interestingly if I generate template with UTF-8 (without BOM), modify said file in excel and save it, Excel does not change its encoding to ANSI. The problem then is that Excel does not recognize the special characters inside template.

Community
  • 1
  • 1
Vilius K.
  • 68
  • 3
  • 8
  • This question doesn't have anything to do with neither C# nor asp.net MVC. – ataravati Dec 14 '15 at 20:29
  • I do not agree that it's a duplicate question. I still believe there has to be a way to force excel to not change encoding, since it doesn't change it when it's set as UTF-8 without BOM. If i won't find a way to do that i will probably go the way that Chriss's suggested on the answer comment and will validate the encoding of files user import, or go away from csv complete and just use excel it self. If that happens i will accept the question as duplicate. – Vilius K. Dec 16 '15 at 09:55

1 Answers1

1

UTF-8 is an encoding set that can contain any Unicode character. Unfortunately, not all applications can encode files in UTF-8 by default, and Microsoft Excel is one of them.

Instead of Unicode, Excel encodes CSV files using ANSI

One would either need to ask the user to open the file in notepad and save in the correct format (to much work!) or work out some detection/conversion logic.

Johnathan Enslin
  • 255
  • 4
  • 13
  • Yeah i figured as much, still doesn't explain why excel does not change the encoding of csv file encoded with UTF-8 without BOM. – Vilius K. Dec 14 '15 at 14:06
  • 1
    A general rule of thumb is always assume the end-user will absolutely destroy whatever you give them. If you intend to allow something that's been in a user's hands to be uploaded back to your site, then you should take all the necessary steps to ensure that it's encoded properly, formatted properly, etc. Shifting this responsibility to the client-side is never a solution. – Chris Pratt Dec 14 '15 at 16:08
  • I found lot of option to save CSV as UTF-8 from different application here. https://help.surveygizmo.com/help/encode-an-excel-file-to-utf-8-or-utf-16 – mujuonly Aug 02 '16 at 11:39