
I am trying to determine whether a given string contains more than 4 consecutive Arabic (Hindi) numerals. To be specific, the Arabic (Hindi) numerals are:

١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩

which are Unicode code points U+0661 to U+0669.

I tried:

if (preg_match("/\b(?:(?:١|٢|٣|٤|٥|٦|٧|٨|٩)\b\s*?){4}/", $str, $matches) > 0) 
        return true;

But it doesn't work at all (always returns false).

Toto
Sherif Buzz

2 Answers


You can try the following regular expression. \p{N} matches any kind of numeric character in any script.

preg_match('~(?:\p{N}\s?){4,}~u', $str, $matches)

If you just want to match those specific characters, you could use the following instead.

preg_match('~(?:[\x{0660}-\x{0669}]\s?){4,}~u', $str, $matches)
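
As a minimal sketch of how the second pattern might be wrapped for reuse: the helper name hasArabicIndicRun and its $min parameter are hypothetical and not part of the original answer. It assumes the input string is UTF-8, which the u modifier requires.

```php
<?php
// Hypothetical helper (name and signature are illustrative only).
// Returns true when $str contains at least $min Arabic-Indic digits in a row,
// each optionally followed by one whitespace character, as in the pattern above.
function hasArabicIndicRun(string $str, int $min = 4): bool
{
    // The u modifier makes PCRE treat pattern and subject as UTF-8,
    // so \x{0660}-\x{0669} is a range of code points, not of bytes.
    $pattern = '~(?:[\x{0660}-\x{0669}]\s?){' . $min . ',}~u';

    // preg_match returns 1 on a match, 0 on no match, and false on error
    // (for example, when $str is not valid UTF-8).
    return preg_match($pattern, $str) === 1;
}

var_dump(hasArabicIndicRun('abc ١٢٣٤ def')); // bool(true): four digits
var_dump(hasArabicIndicRun('Omar١٢٣'));      // bool(false): only three digits
var_dump(hasArabicIndicRun('١٢٣٤٥', 5));     // bool(true): "more than 4" means at least 5
```

Comparing against 1 with === keeps an error result (false) from being mistaken for "no match" by loose comparison elsewhere in the code.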
Community
hwnd

Use a character class and quantify it. See this regex:

/[١٢٣٤٥٦٧٨٩]{4,}/

Your characters are not word characters, so each \b would require a word character directly in front of or behind the digit for the match to succeed. Remove the \b assertions.

Here is a regex demo.

As a note, if you are matching more than 4 characters, use {5,} instead.

Unihedron
  • This works fine in the regex demo, but using it in preg_match does not work properly. "Omar١٢٣" returns a positive match. – Sherif Buzz Sep 16 '14 at 14:52
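
A likely explanation for the behaviour reported in the comment above is the missing u modifier. Without it, preg_match treats both the pattern and the subject as raw bytes, and each of these digits occupies two bytes in UTF-8, so {4,} is already satisfied by two digits. The sketch below assumes the source file and the input are UTF-8 encoded.

```php
<?php
$str = 'Omar١٢٣'; // only three Arabic-Indic digits

// Each digit is two bytes in UTF-8.
var_dump(strlen('١')); // int(2)

// Without u: the character class is a set of bytes, and the three digits
// contribute six matching bytes, so "at least 4" succeeds.
var_dump(preg_match('/[١٢٣٤٥٦٧٨٩]{4,}/', $str)); // int(1)

// With u: the class holds nine code points, and three digits are not enough.
var_dump(preg_match('/[١٢٣٤٥٦٧٨٩]{4,}/u', $str)); // int(0)
```

Online regex testers commonly enable Unicode handling by default, which would explain why the demo and preg_match disagree.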