3

I have an array in a JavaScript file that contains Hungarian first names with special letters such as ö, ő, ó, á and í.

var names = ["Aba", "Abád", "Abbás", "Abdiás", "Abdon", "Ábel", "Abelárd"];

(The above is just a shortened array; the full array has around 3000 items.)

When I output the contents of "names" in an HTML document, the non-ASCII characters come out garbled.

If I define the array directly in the UTF-8 encoded HTML file where it is output, the list displays correctly. If I define the array in a separate JavaScript file instead, the content is garbled. See the screenshot: http://screencast.com/t/YJ83K9Mgm

Notepad++ reports that the JavaScript file is ANSI encoded.

QUESTION: How can I store the name array (or, in general, code containing these special letters) so that browsers display it properly?

(I am actually using MS Visual Studio Express 2012 for coding. I could not find where to set the encoding of individual files.)

HERE IS THE SIMPLIFIED CODE WITH THE ARRAY DEFINED IN THE HTML HEADER:

<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="utf-8" />
    <title>Name List Trial</title>
    <script src="nevekdata.js"></script>
    <script>
        // These are just a few Hungarian first names; the whole array is around 1400 long
        // and includes letters like ö, ő, ó, á, etc.
        // !!!!!!!!
        // If I define the "names" array here, the list is written out in the browser with the
        // special letters shown correctly.
        var names = ["Aba", "Abád", "Abbás", "Abdiás", "Abdon", "Abdullah", "Ábel", "Abelárd"];
        // If I put it into the JavaScript file "nevekdata.js", I get garbled characters instead of the correct letters.
        function writeOutNames() {
            outputnames.innerHTML = names.toString();
        }
    </script>
</head>
<body>
    <button onclick="writeOutNames()">Write Out names</button>
    <p></p>
    <p id="outputnames"></p>

</body>
</html>
Thomas Dickey
Zoltan

2 Answers

5

As you said yourself, the file is saved as ANSI, but you serve it as UTF-8. This causes the browser to decode your ANSI-encoded file as if it were UTF-8.

The charset parameters and headers are only a hint to the browser about which encoding your files are in; they do nothing to the "physical" bytes of the file. For all of this to work, you need the charset parameter and headers AND the file itself must be physically encoded as UTF-8 bytes.
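To make the mismatch concrete, here is a minimal sketch using the standard `TextEncoder` and `TextDecoder` APIs (available in current browsers and in Node.js). The variable names are illustrative, and the "ANSI" bytes assume a single-byte Windows code page in which á is stored as the one byte 0xE1, which holds for both Windows-1250 and Windows-1252.

```javascript
// "Abád" as a single-byte (ANSI-style) editor stores it: á is the single byte 0xE1.
const ansiBytes = Uint8Array.from([0x41, 0x62, 0xe1, 0x64]);

// The same text encoded as UTF-8: á becomes the two bytes 0xC3 0xA1.
// The \u00E1 escape keeps this snippet independent of its own file encoding.
const utf8Bytes = new TextEncoder().encode("Ab\u00E1d");

// This is what the browser does when it is told the file is UTF-8.
const decoder = new TextDecoder("utf-8");
const misread = decoder.decode(ansiBytes); // 0xE1 alone is not valid UTF-8
const correct = decoder.decode(utf8Bytes);

console.log(JSON.stringify(misread)); // the á is replaced by U+FFFD
console.log(JSON.stringify(correct));
```

The lone 0xE1 byte is not a valid UTF-8 sequence, so the decoder substitutes the replacement character U+FFFD, which is the kind of garbled output shown in the question.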

You need to encode the file as UTF-8. In Notepad++, save the file as UTF-8 without BOM.
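If there are many files, the same conversion could be scripted. The following is only a sketch: `reencodeToUtf8` is a hypothetical helper, and the source encoding is an assumption you must supply yourself, since "ANSI" just means the system code page (on a Hungarian Windows setup that is probably `windows-1250`, but check). It also assumes a Node.js build with full ICU so that `TextDecoder` knows the legacy encoding.

```javascript
// Re-encode bytes from a legacy single-byte encoding into UTF-8 bytes.
// sourceEncoding is an assumption supplied by the caller, e.g. "windows-1250".
function reencodeToUtf8(bytes, sourceEncoding) {
  // Decode the legacy bytes into a JavaScript string...
  const text = new TextDecoder(sourceEncoding).decode(bytes);
  // ...then encode that string as UTF-8 (TextEncoder always emits UTF-8, no BOM).
  return new TextEncoder().encode(text);
}

// Hypothetical usage in Node.js (file name taken from the question, code page assumed):
// const fs = require("fs");
// const converted = reencodeToUtf8(fs.readFileSync("nevekdata.js"), "windows-1250");
// fs.writeFileSync("nevekdata.js", converted);
```

Run a conversion like this only once per file: applying it to a file that is already UTF-8 would decode the UTF-8 bytes as the legacy code page and garble the text again.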

Esailija
0

HTML-escape the special characters.

nevekdata.js

var names = ["Aba", "Ab&#225;d", "Abb&#225;s", "Abdi&#225;s", "Abdon", "Abdullah", "&#193;bel", "Abel&#225;rd"];
function writeOutNames() {
    outputnames.innerHTML = names.toString();
}
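For a long list, the entities need not be typed by hand. A minimal sketch of such an escaper follows; `toHtmlEntities` is a hypothetical helper name, and it only replaces non-ASCII characters with decimal numeric references (it does not escape `&`, `<` or `>`).

```javascript
// Replace every non-ASCII character with a decimal numeric character reference.
function toHtmlEntities(text) {
  let out = "";
  // for...of iterates by code point, so characters outside the BMP stay intact.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    out += codePoint > 0x7f ? "&#" + codePoint + ";" : ch;
  }
  return out;
}

// \u escapes are used here so the snippet itself stays pure ASCII.
console.log(toHtmlEntities("Ab\u00E1d")); // Ab&#225;d
console.log(toHtmlEntities("\u00C1bel")); // &#193;bel
```

The output is plain ASCII, so it survives whatever encoding the .js file is saved in, at the cost of being harder to read and edit.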
Tad
  • That works for a short project, but here I am going to have lots of data imported from utf-8 files. – Zoltan Dec 11 '12 at 17:09
  • Write a program that does the escaping – Tad Dec 11 '12 at 17:10
  • So far I have pulled data from an IIS database (I am using C# for app logic and data handling) and have not had any problems with code pages. – Zoltan Dec 11 '12 at 17:12
  • Then use C# to HTML-escape it – Tad Dec 11 '12 at 17:18
  • See the answer to this SO question for how to do it in C# http://stackoverflow.com/questions/1144535/c-sharp-htmlencode-from-class-library – Tad Dec 11 '12 at 17:19
  • Well, I just found one solution: I converted the .js file to UTF-8 and the output is fine. Now I just need to find out how to set up VS 2012 so that it creates UTF-8 files for me right away. I actually have a reason to keep this data in .js files. Of course I could escape the characters in code, but why do that if it is not necessary? – Zoltan Dec 11 '12 at 17:24
  • They have probably put this JavaScript ANSI setting somewhere in the registry. Anybody is welcome to help me find the place that must be edited... I am not well versed in registry editing. – Zoltan Dec 11 '12 at 17:35