I have to filter out characters in a form, so I implemented a filtering algorithm that works quite well and uses different filters (variables) for different contexts. I also need to make extensive use of accented letters.

Example:

gFilterALPHA1="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'-–àâäéèêëîïôöùüûÀÂÄÉÈÊËÎÏÔÖÙÛÜæÆœŒçÇ ";

Strangely enough, the letters é (e acute) and è (e grave) are recognized as such, while others such as à (a grave) are not. I found that the solution is to use octal literals, for instance \340 or \371 for a grave or u grave respectively.
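For reference, a minimal sketch of what those octal values stand for, and of an ASCII-only alternative that may be easier to maintain. The helper `filterFromCodePoints` and the variable `gFilterDemo` are hypothetical names used only for illustration; they are not part of the original script.

```javascript
// Octal 340 and 371 are the code points of "à" and "ù" (224 and 249 in decimal).
const aGrave = String.fromCharCode(parseInt("340", 8));
const uGrave = String.fromCharCode(parseInt("371", 8));

// The same characters written as Unicode escapes, which keep the file pure ASCII.
const sameAGrave = "\u00E0";
const sameUGrave = "\u00F9";

console.log(aGrave === sameAGrave); // true
console.log(uGrave === sameUGrave); // true

// Hypothetical helper: append characters given as hex code points to a base
// string, so that each accented letter can be checked or added on its own line.
function filterFromCodePoints(base, hexCodes) {
    return base + hexCodes.map(function (hex) {
        return String.fromCharCode(parseInt(hex, 16));
    }).join("");
}

const gFilterDemo = filterFromCodePoints("abc", [
    "E0", // à
    "E9", // é
    "F9"  // ù
]);
console.log(gFilterDemo.length); // 6
```

Because the source then contains only ASCII characters, the filter no longer depends on how the .js file is saved or served.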

Q1. Any clue about why é (e acute) is successfully parsed straightforwardly while other accented letters are not?

Q2. Since writing a long string of octal literals is both cumbersome and error-prone when one wants to check or add values, does anyone have a better idea or know of a workaround?

Thanks.

OK, here is the code that thg435 thought it useful to take a look at.

function jFiltre_Champ(event, NomDuFiltre)
{
    var LeChamp = event.target.value; // value is a string (indexed like an array)
    var LeFiltre, Msg, Longueur, leCar, i, j; // declared locally to avoid implicit globals
    switch (NomDuFiltre)
    {
    case "NUM1":
        LeFiltre = gFiltreNUM1;
        Msg = gMessageNUM1;
        break;
    case "ALPHA1":
        LeFiltre = gFiltreALPHA1;
        Msg = gMessageALPHA1;
        break;
    case "DATE1":
        LeFiltre = gFiltreDATE1;
        Msg = gMessageDATE1;
        break;
    case "ALPHANUM1":
        LeFiltre = gFiltreALPHANUM1;
        Msg = gMessageALPHANUM1;
        break;
    case "ALPHANUM2":
        LeFiltre = gFiltreALPHANUM2;
        Msg = gMessageALPHANUM2;
        break;
    default:
        return; // unknown filter name: nothing to check
    }
    Longueur = LeFiltre.length;
    for (i = 0; i < LeChamp.length; i++)
    {
        leCar = LeChamp.charAt(i);
        // look for the current character in the filter
        for (j = 0; j < Longueur; j++)
        {
            if (leCar == LeFiltre.charAt(j)) break;
        }
        if (j == Longueur)
        {
            // character not found in the filter: warn and truncate the input
            alert(Msg);
            /* See the documentation for how the slice method works */
            document.getElementById(event.target.id).value = event.target.value.slice(0, i);
            break;
        }
    }
}

Here is an English-style version (regarding comment 2 below):

function jform_input_filter(event, filterName)
{
    var current_input = event.target.value; // the value is a string (indexed like an array)
    var current_filter, Msg, length, current_char, i, j; // declared locally to avoid implicit globals
    switch (filterName)
    {
    case "NUM1":
        current_filter = gFilterNUM1;
        Msg = gMessageNUM1;
        break;
    case "ALPHA1":
        current_filter = gFilterALPHA1;
        Msg = gMessageALPHA1;
        break;
    case "DATE1":
        current_filter = gFilterDATE1;
        Msg = gMessageDATE1;
        break;
    case "ALPHANUM1":
        current_filter = gFilterALPHANUM1;
        Msg = gMessageALPHANUM1;
        break;
    case "ALPHANUM2":
        current_filter = gFilterALPHANUM2;
        Msg = gMessageALPHANUM2;
        break;
    default:
        return; // unknown filter name: nothing to check
    }
    length = current_filter.length;
    for (i = 0; i < current_input.length; i++)
    {
        current_char = current_input.charAt(i);
        // look for the current character in the filter
        for (j = 0; j < length; j++)
        {
            if (current_char == current_filter.charAt(j)) break;
        }
        if (j == length)
        {
            // character not found in the filter: warn and truncate the input
            alert(Msg);
            /* See the documentation for how the slice method works */
            document.getElementById(event.target.id).value = event.target.value.slice(0, i);
            break;
        }
    }
}

Comments:

  1. Personally, I do not think this code is needed to answer the original question;
  2. variables and comments are in French, which may make the code difficult to read for some (sorry about that);
  3. this function is attached to an 'onchange' event in an HTML form;
  4. 'g' variables (e.g. gFiltreALPHANUM2) are global variables defined elsewhere in the same .js file, so that they are accessible to the function.
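
The character-by-character check described above can also be sketched more compactly with `indexOf`. This is only an illustrative variant, not the function used in the form: `firstDisallowedIndex` and `keepAllowedPrefix` are hypothetical names, and the filter is passed in as a parameter instead of being read from a 'g' variable.

```javascript
// Return the index of the first character of `input` that does not appear
// in `allowed`, or -1 when every character is allowed.
function firstDisallowedIndex(input, allowed) {
    for (let i = 0; i < input.length; i++) {
        if (allowed.indexOf(input.charAt(i)) === -1) {
            return i;
        }
    }
    return -1;
}

// Keep only the part of `input` that precedes the first disallowed character,
// which mirrors the slice(0, i) truncation in the original function.
function keepAllowedPrefix(input, allowed) {
    const bad = firstDisallowedIndex(input, allowed);
    return bad === -1 ? input : input.slice(0, bad);
}

console.log(keepAllowedPrefix("abc1de", "abcde")); // "abc"
```

In an 'onchange' handler, the result of `keepAllowedPrefix(event.target.value, someFilter)` could be written back to the field, with the alert shown only when the value was actually shortened.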
Colin Brock
Brice Coustillas
  • http://stackoverflow.com/questions/280712/javascript-unicode – mplungjan Dec 24 '12 at 06:54
  • http://stackoverflow.com/questions/3939266/javascript-function-to-remove-diacritics – mplungjan Dec 24 '12 at 06:56
  • Can you show us your code? Also, accept some answers to your questions. – georg Dec 24 '12 at 06:57
  • Hello, thg435. What do you mean by 'Also, accept some answers to your questions'? – Brice Coustillas Dec 24 '12 at 08:18
  • Hello, thg435. I was utterly unaware of this "accept answers" feature. I am certainly willing to accept useful answers and am always grateful to those who try to help me, even when they do not hit the bull's eye. – Brice Coustillas Dec 25 '12 at 11:15
  • @thg435. I am ready to post the piece of code, although I am not sure how to proceed at this stage. I am working on it. – Brice Coustillas Dec 25 '12 at 11:17
  • It sounds like your .js file has the wrong encoding. – Bergi Dec 25 '12 at 13:48
  • @Bergi, thanks. I am not sure what you mean exactly, though I know what encoding is. Can you be more specific and possibly explain how this relates to Q1 in my post? – Brice Coustillas Dec 25 '12 at 16:08
  • If the file is sent with one encoding but received as if it had another (possibly due to wrong MIME-type headers or something), only characters from some ranges (mostly including ASCII, but maybe others like é) are correctly interpreted while others are not. Use UTF8 for everything and it should work. – Bergi Dec 26 '12 at 14:42
  • Sidenote: I recommend using English identifier names. – usr Dec 26 '12 at 18:04

1 Answer

Bergi is probably right: your file is most likely saved or delivered with the wrong encoding. UTF-8 is a well-supported encoding for the Unicode character set. To test this idea, you can temporarily adjust your script to output the a-with-grave-accent (à) into the page, either in a field or as a text node. Use the verbatim character in your string literal, not its octal escape code. If it comes out garbled, then the character did not reach the browser intact and you have an encoding problem.
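
As a complement to looking at the page, one possible way to run this test is to print the code points the browser actually sees. The sketch below assumes nothing about your script; `codePoints` is a hypothetical helper name.

```javascript
// List the code points of a string in U+XXXX notation, so that a character
// that was mangled by a wrong encoding becomes visible.
function codePoints(str) {
    return Array.from(str).map(function (ch) {
        return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    });
}

// An intact a-with-grave-accent is a single code point, U+00E0.
console.log(codePoints("\u00E0"));

// The UTF-8 bytes of that letter (C3 A0) misread as ISO-8859-1 show up as
// two characters instead: U+00C3 followed by U+00A0.
console.log(codePoints("\u00C3\u00A0"));
```

In the real script, you would call `codePoints` on a literal containing the verbatim character: a single `U+00E0` means the character arrived intact, anything else points to an encoding mismatch.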

If the encoding problem is confirmed, you'll need to save your file correctly, or adjust the response character encoding, which depends on your particular web server. You can find the current encoding as delivered by your web server by using Fiddler and inspecting the Content-Type response header. If the web server already thinks your file is in the right encoding (preferably, as indicated, UTF-8), then check your text editor to make sure it saves the JavaScript file in the same exact encoding.
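
If you prefer to check the header from script instead of Fiddler, a minimal sketch of reading the charset out of a Content-Type value could look like this. `charsetOf` is a hypothetical helper, and the path in the commented usage is a placeholder, not a real file of yours.

```javascript
// Extract the charset parameter from a Content-Type header value,
// or return null when the header does not declare one.
function charsetOf(contentType) {
    const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
    return match ? match[1].toLowerCase() : null;
}

console.log(charsetOf("application/javascript; charset=UTF-8")); // "utf-8"
console.log(charsetOf("text/javascript"));                       // null

// Possible usage in a browser that supports fetch (placeholder path):
// fetch("/js/filters.js").then(function (response) {
//     console.log(charsetOf(response.headers.get("Content-Type")));
// });
```

A `null` result means the server declares no charset for the file, so the browser has to guess or fall back on the encoding of the including page.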

I'm writing this as an answer because I don't think I can comment directly on the question.

Mihai Danila