How to tell if a string contains characters in Hebrew using PHP?

Question

I'm trying to figure out how to tell whether a string contains any Hebrew characters, with no luck so far.

How can this be done?

I think this link will help you: http://stackoverflow.com/questions/1694350/how-can-i-detect-hebrew-characters-both-iso8859-8-and-utf8-in-a-string-using-php – Cas van Noort, Dec 18 '11 at 00:53

score 16 · Accepted Answer · answered Dec 18 '11 at 01:17


If the source string is UTF-8 encoded, then a simpler approach is to use \p{Hebrew} in the regex.

The call also needs the /u modifier, so that both the pattern and the subject are treated as UTF-8.

preg_match("/\p{Hebrew}/u", $string)
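A minimal sketch of how that call might be wrapped, assuming the input is meant to be UTF-8. The function name `containsHebrew` is made up for illustration and is not part of the answer. It compares against `1` because `preg_match()` returns `1` on a match, `0` on no match and `false` on error.

```php
<?php
// Hypothetical helper (not from the answer): true if $text contains at
// least one character whose Unicode script is Hebrew. Assumes UTF-8 input.
function containsHebrew(string $text): bool
{
    // With /u, preg_match() returns false when $text is not valid UTF-8,
    // so a strict comparison with 1 treats that case as "no match".
    return preg_match('/\p{Hebrew}/u', $text) === 1;
}

var_dump(containsHebrew('Hello'));       // bool(false)
var_dump(containsHebrew('Hello שלום'));  // bool(true)
var_dump(containsHebrew("\xE0\xE1"));    // bool(false): invalid UTF-8
```

If a failed match and invalid input need to be told apart, checking `preg_last_error()` after the call is one option.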

answered Dec 18 '11 at 01:17

mario


Doesn't this miss a `\` in `\\p`? – fge Dec 18 '11 at 09:39
@fge: If you want to be super correct :) But `"\p"` is no C-string escape, so will correctly reach the PCRE library as `\p` – mario Dec 18 '11 at 09:44
Hmm, so you don't need to escape backslashes in PHP's string literals? I didn't know that. – fge Dec 18 '11 at 09:51
@fge: There are only a few you need to escape. For example `"\r\n\t"`. Or otherwise use single quotes where all lose their special meaning. – mario Dec 18 '11 at 09:53
Well yes, but here you use double quotes to surround `/\p{Hebrew}/u`. I didn't say the regex wasn't correct, I was simply guessing that a `\` was missing, no? – fge Dec 18 '11 at 09:55
Yes, talking about PHP string escapes. In PHP double quotes only `"\n"` and `"\r"` get transliterated into linebreaks in the actual variable value. But `"\p"` has no special meaning, so will remain `\p` in the actual variable value. See http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.double – mario Dec 18 '11 at 09:58
OK, that makes sense! Gee, all languages seem to do that differently :p Thanks for your time! – fge Dec 18 '11 at 10:00
No worries. PHP is especially peculiar there. And just noticed the manual explains zilch. Will fix that tomorrow... – mario Dec 18 '11 at 10:01
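The escaping point from this comment thread can be checked directly. A small sketch, nothing here beyond core PHP string functions:

```php
<?php
// "\p" is not a recognized escape sequence in double quotes,
// so the backslash is kept and the string has two characters.
var_dump("\p" === '\p');   // bool(true)
var_dump(strlen("\p"));    // int(2)

// "\n" is a recognized escape: it becomes one newline character.
var_dump(strlen("\n"));    // int(1)
var_dump(strlen('\n'));    // int(2): single quotes keep it literal

// Both literals therefore hand the same pattern to PCRE.
var_dump("/\p{Hebrew}/u" === '/\p{Hebrew}/u');   // bool(true)
```

Single quotes are arguably the safer habit for regex patterns, since fewer sequences are interpreted by PHP before PCRE sees them.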

score 3 · Answer 2 · answered Dec 18 '11 at 01:54

Going by a map of the ISO 8859-8 character set, the byte range E0 to FA appears to be reserved for Hebrew.

[\xE0-\xFA]

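For UTF-8, Hebrew occupies the Unicode block U+0590 to U+05FF. PCRE does not understand \uXXXX, so the code points are written as \x{...} and the pattern needs the /u modifier.

[\x{0590}-\x{05FF}]

Here's an example of a regex match in PHP:

echo preg_match('/[\x{0590}-\x{05FF}]/u', $string);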
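A hedged sketch of both checks side by side, assuming the caller already knows which encoding the string is in. The function names are hypothetical. Note that PCRE takes code points as `\x{...}` together with `/u`, and that the ISO 8859-8 byte range is only meaningful on ISO 8859-8 input.

```php
<?php
// Hypothetical helpers; the caller is assumed to know the encoding.

// ISO 8859-8: one byte per character, Hebrew letters at 0xE0-0xFA.
function hasHebrewIso88598(string $bytes): bool
{
    return preg_match('/[\xE0-\xFA]/', $bytes) === 1;
}

// UTF-8: match code points U+0590-U+05FF (requires the /u modifier).
function hasHebrewUtf8(string $text): bool
{
    return preg_match('/[\x{0590}-\x{05FF}]/u', $text) === 1;
}

var_dump(hasHebrewIso88598("abc \xE0\xE1"));      // bool(true): alef, bet in ISO 8859-8
var_dump(hasHebrewUtf8("abc \xD7\x90\xD7\x91"));  // bool(true): alef, bet in UTF-8
var_dump(hasHebrewUtf8('abc'));                   // bool(false)

// The byte range is not safe on UTF-8 input: the euro sign is E2 82 AC.
var_dump(hasHebrewIso88598("\xE2\x82\xAC"));      // bool(true), a false positive
```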

score 2 · Answer 3 · answered Sep 24 '14 at 06:24

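The simplest approach would be the following (the /u modifier makes PCRE treat the pattern and the string as UTF-8, so א-ת is a range of characters rather than of bytes):

preg_match('/[א-ת]/u',$string)

For example,

$strings = array( "abbb","1234","aabbאאבב","אבבבב");

foreach($strings as $string)
{
    echo "'$string'  ";

    echo (preg_match('/[א-ת]/u',$string))? "has Hebrew characters in it." : "has no Hebrew characters in it.";

    echo "<br />";
}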
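One thing worth knowing when choosing between a letter range and `\p{Hebrew}`: the range from alef to tav is U+05D0 to U+05EA, which covers the letters but not vowel points or other Hebrew marks. A small sketch of the difference, with the characters written as UTF-8 byte escapes so the file encoding doesn't matter:

```php
<?php
$letters = '/[\x{05D0}-\x{05EA}]/u';  // alef to tav, matched as code points
$script  = '/\p{Hebrew}/u';           // anything in the Hebrew script

$alef  = "\xD7\x90";  // U+05D0 HEBREW LETTER ALEF
$sheva = "\xD6\xB0";  // U+05B0 HEBREW POINT SHEVA (a vowel point)

var_dump(preg_match($letters, $alef));   // int(1)
var_dump(preg_match($script,  $alef));   // int(1)
var_dump(preg_match($letters, $sheva));  // int(0): outside the letter range
var_dump(preg_match($script,  $sheva));  // int(1)
```

For ordinary text that contains at least one Hebrew letter, both patterns give the same answer.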
