7

I want a regular expression that allows:

  1. English text without special characters.
  2. French text without special characters.

It should always reject special characters such as @, #, and % in both languages.

I tried the following code:

if (this.value.match(/[^a-zA-Z0-9 ]/g)) {
    this.value = this.value.replace(/[^a-zA-Z0-9 ]/g, '');
}

It works fine with English text, but when I enter French text such as éléphant, it treats the accented characters as special characters and deletes them, so éléphant becomes lphant.

Is there any way to allow French characters in the regular expression?

Thanks a lot in advance.

  • possible duplicate of [Matching accented characters with Javascript regexes](http://stackoverflow.com/questions/5436824/matching-accented-characters-with-javascript-regexes) – Cristian Lupascu Oct 29 '13 at 07:38
  • [a nice resource for this](http://kourge.net/projects/regexp-unicode-block) – Wrikken Oct 29 '13 at 07:52

3 Answers

13

Quick solution:

/[^a-zA-Z0-9 àâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]/

Reference: List of french characters

Hope this helps
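
As a minimal sketch of how this character class could replace the one in the question: the function name `stripDisallowed` and the constant `DISALLOWED` are made up for illustration, and the `g` flag is added so that every disallowed character is removed, not just the first.

```javascript
// Anything NOT in this set is treated as a special character.
// The set is the one from the answer: ASCII letters, digits, space,
// plus the listed French accented letters.
const DISALLOWED = /[^a-zA-Z0-9 àâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]/g;

// Hypothetical helper: returns the text with disallowed characters removed.
function stripDisallowed(text) {
  return text.replace(DISALLOWED, '');
}

console.log(stripDisallowed('éléphant @#% 42'));
```

In the question's handler this would become `this.value = stripDisallowed(this.value);`. The separate `match` check is not needed, since `replace` leaves the string unchanged when nothing matches.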

mortb
5

A simpler solution:

/[^a-zA-Z0-9 À-ÿ]/g

(or)

/[^\w À-ÿ]/g       // Note: \w also allows "_"

Either regular expression will work in your case. Note that the range À-ÿ (U+00C0 to U+00FF) also lets × and ÷ through, and does not cover œ, Œ, or Ÿ.
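
As an alternative to listing accented letters or code point ranges by hand, engines that support ES2018 Unicode property escapes can match "any letter" directly. This is a different technique from the one in the answer, shown here only as a sketch; the names `NOT_LETTER_DIGIT_SPACE` and `keepLettersDigitsSpaces` are invented for the example. Be aware that `\p{L}` accepts letters from every script, not only English and French.

```javascript
// \p{L} = any Unicode letter, \p{N} = any Unicode number.
// The "u" flag is required for \p{...} to work; "g" removes every match.
const NOT_LETTER_DIGIT_SPACE = /[^\p{L}\p{N} ]/gu;

// Hypothetical helper: keeps letters, numbers, and spaces; drops the rest.
function keepLettersDigitsSpaces(text) {
  return text.replace(NOT_LETTER_DIGIT_SPACE, '');
}

console.log(keepLettersDigitsSpaces('cœur_#'));   // "cœur"
console.log(keepLettersDigitsSpaces('a×b'));      // "ab"
```

Unlike a Latin-1 range, this keeps œ and rejects the × sign, at the cost of being much more permissive about which alphabets are accepted.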

Sam G
0

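I would suggest normalizing the string before replacing characters. Note that this strips the accents rather than keeping them, so éléphante becomes elephante.

This example uses Java's Normalizer, but the same idea may help you in JavaScript:

    import java.text.Normalizer;

    String string = "éléphante";

    // NFD splits each accented letter into a base letter plus a combining mark
    string = Normalizer.normalize(string, Normalizer.Form.NFD);

    // Drop the non-ASCII combining marks, leaving the base letters
    string = string.replaceAll("[^\\p{ASCII}]", "");

    // Remove anything that is not a letter, digit, or space; prints "elephante"
    System.out.println(string.replaceAll("[^a-zA-Z0-9 ]", ""));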
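
A rough JavaScript counterpart of the same idea, assuming an environment with `String.prototype.normalize` (ES2015 or later). The helper names `stripAccents` and `toPlainAscii` are made up for this sketch, and it removes the combining marks in U+0300 to U+036F rather than all non-ASCII characters, which is a slightly narrower filter than the Java version.

```javascript
// Hypothetical helper: decompose accented letters (NFD), then drop the
// combining diacritical marks (U+0300 to U+036F) that the decomposition exposes.
function stripAccents(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical helper: strip accents first, then apply the question's
// original ASCII-only filter.
function toPlainAscii(text) {
  return stripAccents(text).replace(/[^a-zA-Z0-9 ]/g, '');
}

console.log(toPlainAscii('éléphante @#%'));
```

This changes the text (é becomes e), so it suits cases such as search keys or slugs better than input fields where the accents must be preserved. Ligatures such as œ have no canonical decomposition, so they are still removed by the final filter.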
Andrés Oviedo