I have finally figured out how to validate Russian text entered in my tag with:

    mask = jQuery.extend({unitprmask:/^\d+(\.\d{1,1})?$/,expressmask:/^\d+(\.\d{1,1})?$/,qtymask:/^\d+(\.\d{1,1})?$/,yourdescdmask:/^[а-яА-Я\p{Cyrillic}0-9\s\-]{1,10}$/,URLmask:/^(https?|ftp):\/\/(.*)[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/});

But I want to allow users to write in mixed languages: Russian + Persian + Chinese + English. How can I make that happen using regex? What should I add to the code above? I tried adding a-zA-Z to validate English letters, but it did not work. I am not able to add other languages; the one I wrote

  yourdescdmask:/^[а-яА-Я\p{Cyrillic}0-9\s\-]{1,10}$/

will allow Russian text only.

Could you please help me out here?

Thanks

Kissa Mia
  • Wouldn't it be easier to have a different regex for each language? – Liath Jan 29 '14 at 14:57
  • No, it wouldn't be. There are tons of languages with tons of different alphabets and punctuation, but I don't know much about Unicode support in regex. – bigblind Jan 29 '14 at 15:00
  • What do you want to allow? Maybe [Unicode Graphical Characters](http://en.wikipedia.org/wiki/Graphic_character#Unicode) is a good starting point. Some written languages are based on letters and combining marks, but not all. – Mike Samuel Jan 29 '14 at 15:03
  • Btw, `\p{Cyrillic}` doesn't have the same meaning in JavaScript as in Java -- it's equivalent to `[Cilpry\{\}]`. – Mike Samuel Jan 29 '14 at 15:04
  • @gpgekko I have tried all the suggestions there and none worked for me, and not only the suggestions on Stack Overflow: I have googled it too, still no luck. I am not so lazy as to just post and ask for help; I have tried many ways to fix my problem since 8am and now it's 11pm, dear. But thanks anyway, please aster the question and respect others who are trying to help – Kissa Mia Jan 29 '14 at 15:08
  • @Liath Yes, I thought so, as I just need English / Russian / Persian / Chinese, but how do I add other languages to what I have now? Do you have any idea? Appreciated – Kissa Mia Jan 29 '14 at 15:09
  • You may have more success if you give us the text you're trying to validate and the regex you want modifying... as it stands this is quite vague – Liath Jan 29 '14 at 15:11
  • @Liath there is no fixed text here, it's a form validator, yourdescdmask will validate – Kissa Mia Jan 29 '14 at 15:15
  • I have edited the question; I hope I succeeded in making myself clear :) – Kissa Mia Jan 29 '14 at 15:27

1 Answer

As of the 5th edition of ECMA-262, JavaScript does not support Unicode Script syntax such as `\p{Cyrillic}`.
If you don't believe me, you may check the ECMA-262 specification.
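To make the point concrete, here is a small sketch of what `\p{Cyrillic}` actually does in a plain (non-`u`) regex. It also shows, as a side note rather than part of the original answer, that engines implementing ECMAScript 2018 or later accept `\p{Script=...}` when the `u` flag is set. The variable names `legacy` and `modern` are only illustrative.

```javascript
// Without the `u` flag, `\p` is just an escaped "p" and the braces are literal,
// so this class matches only the characters p { C y r i l c }.
var legacy = /^[\p{Cyrillic}]+$/;
console.log(legacy.test("Cyrillic")); // true: every letter is in the literal class
console.log(legacy.test("привет"));   // false: no Cyrillic letter is matched

// Assumption: the engine supports ES2018 Unicode property escapes.
// The RegExp constructor is used so that older engines fail at run time
// (with a SyntaxError here) instead of refusing to parse the whole script.
var modern = new RegExp("^[\\p{Script=Cyrillic}]+$", "u");
console.log(modern.test("привет"));   // true on ES2018+ engines
```

If the validator has to run in older browsers, the property-escape form is not an option and explicit ranges remain the portable choice.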

The only way to do a strict multilingual validation in JavaScript is to list ranges of characters, e.g. `\uxxxx-\uyyyy`.
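A minimal sketch of such a range-based mask for the four languages in the question follows. The block boundaries are assumptions about what should be allowed, not a definitive list: `\u0400-\u04FF` is the Cyrillic block, `\u0600-\u06FF` is the Arabic block (which contains the Persian letters), and `\u4E00-\u9FFF` is the CJK Unified Ideographs block. The `{1,10}` length limit and the digits, whitespace and hyphen are carried over from the original `yourdescdmask`.

```javascript
// Hypothetical replacement for yourdescdmask; adjust the ranges to taste.
var yourdescdmask = new RegExp(
  "^[" +
    "A-Za-z" +          // English letters
    "\\u0400-\\u04FF" + // Cyrillic block (Russian)
    "\\u0600-\\u06FF" + // Arabic block (Persian)
    "\\u4E00-\\u9FFF" + // CJK Unified Ideographs (Chinese)
    "0-9\\s\\-" +       // digits, whitespace, hyphen, as in the original mask
  "]{1,10}$"
);

console.log(yourdescdmask.test("Привет"));    // true
console.log(yourdescdmask.test("سلام"));      // true
console.log(yourdescdmask.test("你好"));       // true
console.log(yourdescdmask.test("Hi Привет")); // true: mixed scripts are allowed
console.log(yourdescdmask.test("héllo"));     // false: é is outside every listed range
```

Whole blocks are used here for brevity, so some symbols and rarely used letters inside those blocks are also accepted; narrower ranges can be substituted if that is too permissive.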

In engines that do support it, the Unicode Script syntax is simply implemented from the script data file published by the Unicode Consortium.

Using the Unicode Script syntax directly is convenient and makes the code clean. However, you don't know exactly what the character class matches. Listing the ranges yourself is a chance to look at the list of characters and filter out whatever you don't want.

nhahtdh