32

I use the function below to export an array to a CSV file in JavaScript, but the Chinese characters are garbled when the file is opened with Microsoft Excel 2013 on Windows 7.

When I open the exported file in Notepad, it displays correctly.

function arrayToCSVConvertor(arrData, reportTitle) {
    var CSV='';
    arrData.forEach(function(infoArray, index){
        var dataString = infoArray.join(",");
        dataString= dataString.split('\n').join(';');
        CSV += dataString+ "\n";
    });

    if (CSV == '') {
        alert("Invalid data");
        return;
    }

    //create a link and click, remove
    var link = document.createElement("a");
    link.id="lnkDwnldLnk";

    //this part will append the anchor tag and remove it after automatic click
    document.body.appendChild(link);

    var csv = CSV;

    var blob = new Blob([csv], { type: "text/csv;charset=UTF-8" }); // I also tried charset=GBK here, and it does not work either
    var csvUrl = URL.createObjectURL(blob);

    var filename = reportTitle+'.csv';

    if(navigator.msSaveBlob){//IE 10
        return navigator.msSaveBlob(blob, filename);
    }else{
        $("#lnkDwnldLnk")
            .attr({
                'download': filename,
                'href': csvUrl
            });
        $('#lnkDwnldLnk')[0].click();
        document.body.removeChild(link);
    }
}
JaskeyLam

4 Answers

74

Problem solved by adding a BOM at the start of the CSV string:

var csv = "\ufeff"+CSV;
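A minimal sketch of the same idea, wrapped in a helper so the BOM is not added twice if the string already starts with one. The name `withBom` and the sample data are assumptions made for illustration, not part of the original answer:

```javascript
// Hypothetical helper: prepend U+FEFF unless the text already starts with it.
function withBom(text) {
    var BOM = "\ufeff";
    return text.charAt(0) === BOM ? text : BOM + text;
}

// Sample data (assumed) standing in for the CSV string built by the question's loop.
var CSV = "姓名,城市\n张三,北京\n";
var csv = withBom(CSV);

// csv can now be passed to the Blob constructor as in the question:
// new Blob([csv], { type: "text/csv;charset=UTF-8" })
```

Because `Blob` encodes string parts as UTF-8, the leading `\ufeff` ends up in the file as the UTF-8 BOM.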
JaskeyLam
25

This is my solution:

var blob = new Blob(["\uFEFF"+csv], {
    type: 'text/csv; charset=utf-8'
});
yAnTar
Santy SC
0

According to RFC 2781, the byte order mark (BOM) is the character U+FEFF; in UTF-16 little-endian encoding (UTF-16LE) it is serialized as the bytes 0xFF 0xFE. While adding the BOM may resolve the issue on Windows, the problem can still occur when the generated CSV file is opened with Excel on macOS.

A solution for writing a multibyte CSV file that works across different OS platforms (Windows, Linux, macOS) applies these three rules:

  1. Separate the fields with a tab character instead of a comma
  2. Encode the content as UTF-16LE
  3. Prefix the content with the UTF-16LE BOM, which is U+FEFF serialized as the bytes 0xFF 0xFE

More detailed elaboration, sample code, and use cases can be seen in this article
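The three rules above can be sketched in JavaScript roughly as follows. This is an illustration under stated assumptions, not the code from the linked article: the helper name `encodeUtf16LeTsv` and the sample rows are made up here, and fields that themselves contain tabs or newlines are not handled.

```javascript
// Hypothetical helper: encode rows as tab-separated UTF-16LE bytes with a leading BOM.
function encodeUtf16LeTsv(rows) {
    // Rule 1: join fields with a tab. Rule 3: start the text with U+FEFF.
    var text = "\ufeff" + rows.map(function (row) {
        return row.join("\t");
    }).join("\n") + "\n";

    // Rule 2: JavaScript strings are sequences of UTF-16 code units,
    // so each unit is written out as two bytes, low byte first.
    var bytes = new Uint8Array(text.length * 2);
    for (var i = 0; i < text.length; i++) {
        var unit = text.charCodeAt(i);
        bytes[2 * i] = unit & 0xff;   // low byte first = little endian
        bytes[2 * i + 1] = unit >> 8; // high byte second
    }
    return bytes;
}

// Sample rows (assumed) for demonstration.
var bytes16 = encodeUtf16LeTsv([["姓名", "城市"], ["张三", "北京"]]);
// bytes16 starts with 0xFF 0xFE, the UTF-16LE serialization of U+FEFF.
```

In a browser, the resulting `Uint8Array` can be passed to the `Blob` constructor in place of the string, since typed arrays are written to the blob byte for byte.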

mikaelfs
0
var csv = "\ufeff"+CSV;

Explanation of this code:

The BOM character (represented by "\ufeff" in JavaScript) is a special Unicode character that indicates the byte order and the encoding scheme of the text.

Some software applications require the BOM character to be present in UTF-8 encoded files to recognize the file as a UTF-8 encoded text file. For example, Microsoft Excel may not recognize a UTF-8 encoded CSV file without a BOM character, and may display the characters incorrectly.

Therefore, adding the BOM character to the CSV data string ensures that the resulting file is recognized as a UTF-8 encoded text file by most software applications, including Excel.

Meshu Deb Nath