PHP: check whether text is in English?

Question

Hello everyone. With the code below I want to test whether a string is in English or in Gujarati, but the program gives the wrong output. How can I solve this? If a character falls in the Unicode range 0A80-0AFF, the string should be considered Gujarati; otherwise it should be considered English.

Code:

if (!preg_match('/[^A-Za-z0-9]/', $Query)){
    echo 'English';
}
else{
    echo 'Gujarati';
}

Input:

A/4

Output:

Gujarati

Expected output:

English

Since you're only checking against alphanumeric characters, any input containing anything else (like slashes/hyphens etc) will be considered to be Gujarati. — M. Eriksson, Feb 26 '18 at 06:00
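To make the comment above concrete, here is a small sketch that runs the question's pattern against the sample input and prints the character that triggers the match. The variable names are only illustrative.

```php
<?php
// Same pattern as in the question: it matches any character that is NOT an
// ASCII letter or digit.
$query = 'A/4';

if (preg_match('/[^A-Za-z0-9]/', $query, $match) === 1) {
    // $match[0] holds the first non-alphanumeric character that was found.
    echo "Non-alphanumeric character found: {$match[0]}\n"; // the slash
}
```

Because the slash matches, the negated test in the question fails and the else branch prints 'Gujarati'.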

Allan · Accepted Answer · 2018-02-26T06:14:58.947


In a case where you have only English and Gujarati, why not do it the other way around?

if (preg_match('/[\x{0A80}-\x{0AFF}]/u', $Query)){
    echo 'Gujarati';
}
else{
    echo 'English';
}

Basically, if the string contains at least one Gujarati character, it is detected as Gujarati; otherwise it is treated as English. Note, however, that 月, ありがとう, élève, etc. will also be considered English.
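If that caveat matters, one possible refinement is a three-way check. This is only a sketch under the assumption that "English" can be approximated as "ASCII only"; the function name classifyText and the 'Other' label are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical three-way classifier, not part of the original answer.
function classifyText(string $text): string
{
    // Any character in the Gujarati block U+0A80..U+0AFF wins.
    if (preg_match('/[\x{0A80}-\x{0AFF}]/u', $text) === 1) {
        return 'Gujarati';
    }
    // Assumption: a string made only of ASCII bytes (0x00..0x7F) is English.
    if (preg_match('/^[\x00-\x7F]*$/', $text) === 1) {
        return 'English';
    }
    // Everything else (月, ありがとう, élève, ...) is reported separately.
    return 'Other';
}

echo classifyText('A/4'), "\n";
echo classifyText('élève'), "\n";
```

With this version 'A/4' is still reported as English, while accented or non-Latin text falls into 'Other' instead of being mislabelled.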

Have a look at this Unicode chart: https://unicode.org/charts/PDF/U0A80.pdf to define exactly the range that must be taken into account.

Explanations:

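[\x{0A80}-\x{0AFF}] is a character class that matches any single character between code points U+0A80 and U+0AFF. The square brackets are required; without them the pattern looks for the literal sequence U+0A80, a hyphen, U+0AFF.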
/u for Unicode support in regex
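As a minimal sketch, the check can be wrapped in a small function, assuming the range is written as a bracketed character class, [\x{0A80}-\x{0AFF}]. The function names containsGujarati and detectLanguage are hypothetical, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original answer): true when the string
// contains at least one character in the Gujarati block U+0A80..U+0AFF.
function containsGujarati(string $text): bool
{
    // The square brackets make this a character class; the /u modifier makes
    // PCRE treat both pattern and subject as UTF-8.
    return preg_match('/[\x{0A80}-\x{0AFF}]/u', $text) === 1;
}

// Hypothetical wrapper mirroring the if/else from the answer.
function detectLanguage(string $text): string
{
    return containsGujarati($text) ? 'Gujarati' : 'English';
}

echo detectLanguage('A/4'), "\n";      // English
echo detectLanguage('ગુજરાતી'), "\n";   // Gujarati
```

Comparing the result with === 1 also treats a preg_match failure (for example on invalid UTF-8 input, where it returns false) as "no Gujarati found".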

edited Feb 26 '18 at 06:14

answered Feb 26 '18 at 06:08


I want the same solution, but can it cover the 0A80-0AFF range so that I can use it? – Shah Rushabh Feb 26 '18 at 06:11
@ShahRushabh: I have edited the regex to match your range. Can you have a look? – Allan Feb 26 '18 at 06:13
Thank you so much, you saved my project. – Shah Rushabh Feb 26 '18 at 06:14
@ShahRushabh: Happy to help you ;-) – Allan Feb 26 '18 at 06:15
Allan, is this correct range syntax? '/\x{0A80}-\x{0AFF}/u'. Not this? '/[\x{0A80}-\x{0AFF}]/u' – A. Denis Feb 26 '18 at 06:21
@A.Denis I am not an expert in the Gujarati language; that's why I gave him the Unicode chart for Gujarati, so he can adapt the regex if necessary. – Allan Feb 26 '18 at 06:23
@Allan I am asking because you don't use square brackets. I have always thought the brackets are required. – A. Denis Feb 26 '18 at 06:28
Please check this: function is_english($str) { if (strlen($str) != strlen(utf8_decode($str))) { return false; } else { return true; } } – shijinmon Pallikal Jun 23 '20 at 01:23
