
This question asks how to test whether any character in a string is Arabic; fair enough, the solution works. I want to know how to figure out whether ALL characters in a string (minus whitespace) are Arabic.

The solution there says:

According to Wikipedia, Arabic characters fall in the Unicode range 0600 - 06FF. So you can use a regular expression to test if the string contains any character in this range:

var arabic = /[\u0600-\u06FF]/;
var string = 'عربية‎'; // some Arabic string from Wikipedia

alert(arabic.test(string)); // displays true

How can I get var arabic = /[\u0600-\u06FF]/; to match all characters and not just check for one character?

I am looking for a regex that I can test against a string, where the return value indicates whether or not all the characters in the string are Arabic characters.

var arabic = /[\u0600-\u06FF]/; <-- This is a regex which, when tested against a string, indicates whether any (one or more) of the characters in that string are Arabic.

The solution I came up with is ^[\u0600-\u06FF]{1,}$. I don't know if it is the best one, as I'm not regex savvy, but hopefully it helps someone who needs it.
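To make the difference concrete, here is a minimal sketch comparing the unanchored "any" check with an anchored "all" check. The names hasArabic and allArabic are made up for illustration, and adding \s to the character class is an assumption about what "minus whitespace" should mean; it is not part of the pattern above.

```javascript
// Unanchored: true if at least one character is in the Arabic block.
const hasArabic = /[\u0600-\u06FF]/;

// Anchored with ^...$ and a + quantifier: true only if every character
// is in the Arabic block or is whitespace (\s is an assumption here).
const allArabic = /^[\u0600-\u06FF\s]+$/;

console.log(hasArabic.test("محسن mohsen")); // true, one Arabic character is enough
console.log(allArabic.test("محسن mohsen")); // false, Latin letters are present
console.log(allArabic.test("محسن"));        // true
console.log(allArabic.test(""));            // false, + needs at least one character
```

Note that with \s in the class, a string made only of whitespace would also pass.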

baudsp
John Lexus

1 Answer

You can use RegExp.prototype.test() to check for the opposite (any character that is NOT allowed) and then negate the result.

Remember

Remember that Arabic text is not limited to the Arabic characters in [\u0600-\u06FF], because Arabic text also uses the ASCII digits (0-9) and punctuation marks such as / \ ) ( ! , etc.

The Arabic Unicode block [\u0600-\u06FF] (see "Arabic (Unicode block)" on Wikipedia) does not contain these punctuation marks and symbols.

So also keep in mind that characters from the following Basic Latin ranges appear in Arabic text:

\u0020-\u0040

\u005B-\u0060

\u007B-\u007E

Remember, even the space character (U+0020) belongs to the Basic Latin block.
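To see exactly which characters those three ranges let through, here is a small sketch that expands each range into its characters. The helper name expand is hypothetical and only serves this illustration.

```javascript
// The three Basic Latin ranges listed above, as inclusive [start, end] code points.
const ranges = [[0x0020, 0x0040], [0x005B, 0x0060], [0x007B, 0x007E]];

// Hypothetical helper: build a string holding every character in one range.
const expand = ([start, end]) => {
  let out = "";
  for (let cp = start; cp <= end; cp++) out += String.fromCharCode(cp);
  return out;
};

// Print each range as a quoted string so the leading space stays visible.
for (const range of ranges) {
  console.log(JSON.stringify(expand(range)));
}
```

The first range covers the space, the ASCII punctuation up to @, and the digits 0-9. The gaps between the ranges (\u0041-\u005A and \u0061-\u007A) are exactly the Latin letters A-Z and a-z, which is why they end up excluded.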

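Here is a short function, isItAllArabic, that returns true or false.

If you want to figure out whether ALL characters in a string (minus the space character) are Arabic, with no Latin digits or punctuation allowed, you can use this stricter form of the function:

// Strict variant: only the Arabic block and the space character are allowed.
// It has the same name as the version below, so keep only one of the two.
const isItAllArabic = s => !/[^\u0600-\u06FF ]/.test(s);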

//======================================================
// Test whether a string contains only Arabic characters.
// Latin punctuation marks and digits are also allowed,
// since they are commonly used in Arabic text and some
// of them do not exist in the Arabic Unicode block.
//
// Output: Return true/false
//======================================================
const isItAllArabic = s => !/[^\u0600-\u06FF\u0020-\u0040\u005B-\u0060\u007B-\u007E]/.test(s);

//======================================================

console.log(isItAllArabic("محسن"));              // true: all Arabic text
console.log(isItAllArabic("(محسن)"));            // true: parentheses are allowed
console.log(isItAllArabic("محسن/محمد! وعلي"));  // true: punctuation and symbols are allowed
console.log(isItAllArabic("محسن 123"));          // true: digits are allowed
console.log(isItAllArabic("محسن mohsen"));       // false: contains Latin letters
console.log(isItAllArabic("mohsen"));            // false: Latin letters only
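One thing to be aware of: because the function only checks that no disallowed character is present, an empty string or a string such as "123" also returns true. If that matters, a possible variation is to additionally require at least one character from the Arabic block. This is only a sketch; the names disallowed, arabicChar and isArabicText are made up for illustration.

```javascript
// Any character outside the Arabic block and the allowed Basic Latin ranges.
const disallowed = /[^\u0600-\u06FF\u0020-\u0040\u005B-\u0060\u007B-\u007E]/;

// At least one character from the Arabic block.
const arabicChar = /[\u0600-\u06FF]/;

// Hypothetical variant: needs some Arabic and nothing disallowed.
const isArabicText = s => arabicChar.test(s) && !disallowed.test(s);

console.log(isArabicText("محسن 123"));    // true
console.log(isArabicText("123"));         // false, no Arabic character at all
console.log(isArabicText(""));            // false
console.log(isArabicText("محسن mohsen")); // false, contains Latin letters
```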
Nimantha
Mohsen Alyafei