2

I have a string $str = "颜色代码"; and I would like to check whether it contains "颜色". I tried the code below, but it keeps returning false.

mb_strpos($str, "颜色", 0 ,"GBK");
Giacomo1968
  • [Works fine for me.](http://sandbox.onlinephpfunctions.com/code/90b922307105459e2004cca643c969306e5669a2) – Tomáš Zato Jun 21 '14 at 16:12
  • @TomášZato It does work. But not as desired by the requirement “I would like to check if this **string contain**…” So string position `0` is correct, but not desired for detecting whether the string contains the value. – Giacomo1968 Jun 21 '14 at 16:20
  • What? If you get **false** it doesn't contain it. **0** is just fine. – Tomáš Zato Jun 21 '14 at 16:22
  • @TomášZato The difference between `0` and `false` is not clear to many programmers & that is fair. Using `preg_match` simplifies the logic and is tested to be 3x faster than `mb_strpos` so it’s a win-win scenario. – Giacomo1968 Jun 21 '14 at 16:42

3 Answers

1

Maybe you just forgot to check the type of the return value:

if (mb_strpos($str, "颜色", 0, "GBK") === false)
    echo "The value does not contain \"颜色\"\n";
else
    echo "\"颜色\" is part of the string.\n";

The triple equals sign (===) performs a strict comparison that also checks the type. Under a loose comparison (==), false equals 0, but the two are of different types: bool and int respectively.

In the documentation of strpos, which acts similarly, there's a big red warning:

Warning

This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE. Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.
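
As a minimal sketch of the strict check described above, the snippet below wraps mb_strpos in a helper that returns a plain boolean. The helper name containsText is hypothetical, and the sketch assumes the source file is saved as UTF-8 and therefore passes 'UTF-8' as the encoding (the question passes "GBK").

```php
<?php
// Hypothetical helper (not from the original post): wraps mb_strpos so the
// caller gets a plain boolean and never has to tell int(0) from false.
// Assumption: the strings are UTF-8 encoded, hence the 'UTF-8' default.
function containsText(string $haystack, string $needle, string $encoding = 'UTF-8'): bool
{
    // mb_strpos returns an int position (possibly 0) or false when not found.
    return mb_strpos($haystack, $needle, 0, $encoding) !== false;
}

$str = "颜色代码";

var_dump(mb_strpos($str, "颜色", 0, 'UTF-8'));          // int(0): found at the start
var_dump(mb_strpos($str, "颜色", 0, 'UTF-8') == false); // bool(true): loose comparison misreads 0
var_dump(containsText($str, "颜色"));                   // bool(true)
var_dump(containsText($str, "红色"));                   // bool(false)
```

The second var_dump shows the trap: a loose == comparison treats the valid position 0 as "not found", while the strict check inside the helper does not.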

Tomáš Zato
0

Try using utf8_decode:

mb_strpos($str, utf8_decode("颜色"), 0 ,"GBK");
Giacomo1968
Dima
  • Not the problem. The issue is `strpos` returns `0` because that is the string position. But `0` can be interpreted as `false` which is not desired. So `utf8_decode` has nothing to do with it. – Giacomo1968 Jun 21 '14 at 16:18
  • Still the same: var_dump(mb_strpos($str, utf8_decode("颜色"), 0 ,"UTF-8")); returns boolean false – user3687053 Jun 21 '14 at 16:18
  • @user3687053 It is returning `false` because the `utf8_decode` is not correct. It is indeed false when decoded UTF8 is matched against undecoded UTF8. The issue is the confusion from `strpos` returning `int(0)`, which is correct but can be misinterpreted as `false`. – Giacomo1968 Jun 21 '14 at 16:24
0

The code does work:

$str = "颜色代码";
$test = mb_strpos($str, "颜色", 0 ,"GBK");
echo $test;

But the problem you are facing is that mb_strpos returns 0 for 颜色, which is the correct string position, but your code logic might misinterpret it as false. To see what I mean, take the 颜色 and place it at the end of the string like this:

$str = "代码颜色";
$test = mb_strpos($str, "颜色", 0 ,"GBK");
echo $test;

The returned string position is now 3, which is correct as well. A better approach for simply checking whether 颜色 is in the string is to use preg_match like this:

$str = "颜色代码";
$test = preg_match("/颜色/", $str);
echo $test;

The output for that is the integer 1, which evaluates to true, and I believe that is what you are looking for.
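
A slightly more defensive sketch of the same preg_match check follows. The preg_quote call and the u modifier are additions, not part of the answer above, and the sketch assumes the strings are valid UTF-8.

```php
<?php
$str = "颜色代码";
$needle = "颜色";

// preg_quote escapes any regex metacharacters in the needle; the u modifier
// makes PCRE treat both pattern and subject as UTF-8 (an assumption here).
$pattern = '/' . preg_quote($needle, '/') . '/u';

// preg_match returns int 1 on a match, int 0 on no match, false on error.
$result = preg_match($pattern, $str);

var_dump($result);                       // int(1)
var_dump($result === 1);                 // bool(true)
var_dump(preg_match('/红色/u', $str));   // int(0)
```

Comparing against 1 with === keeps the "no match" case (0) and the error case (false) from being mixed up.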

Beyond the functionality working as expected, there is a clear speed benefit to using preg_match over mb_strpos as shown here.

mb_strpos: 3.7908554077148E-5
preg_match: 1.1920928955078E-5

It’s more than 3x faster to use preg_match when compared to mb_strpos.
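
The benchmark script behind these numbers is not shown, so the following is only a rough sketch of how such a comparison could be run. The timeIt helper and the iteration count are hypothetical, the 'UTF-8' encoding is an assumption, and absolute timings will differ by PHP version and machine.

```php
<?php
// Hypothetical helper: average seconds per call over $iterations runs.
function timeIt(callable $fn, int $iterations = 100000): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9 / $iterations;
}

$str = "颜色代码";

// The three candidates discussed in this thread.
$timings = [
    'mb_strpos'  => timeIt(fn() => mb_strpos($str, "颜色", 0, 'UTF-8')),
    'preg_match' => timeIt(fn() => preg_match('/颜色/u', $str)),
    'strpos'     => timeIt(fn() => strpos($str, "颜色")),
];

foreach ($timings as $name => $seconds) {
    printf("%-10s %.3e s per call\n", $name, $seconds);
}
```

Averaging over many iterations matters here: a single call takes so little time that one-off measurements mostly capture noise.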

Giacomo1968
  • 2
    A regular expression to overcome type casting? I disagree! – Tomáš Zato Jun 21 '14 at 16:23
  • @TomášZato ??? It works. Perhaps your typecasting logic makes sense, but the fact that `0` can be returned causes confusion when doing a `false` comparison. – Giacomo1968 Jun 21 '14 at 16:26
  • 2
    It confuses the unwary programmer, that's all. As soon as you know about it, it does not confuse you any more. – Tomáš Zato Jun 21 '14 at 16:28
  • @TomášZato Does it mean anything to you that `preg_match` is about 3x faster than `strpos` in this case? Because it is. So while you are correct about the returned type, `preg_match` wins because of the relative clarity of its response as well as its speed. – Giacomo1968 Jun 21 '14 at 16:40
  • That's very interesting, if you consider what both functions are expected to do. Maybe there is something wrong with `mb_strpos`. How many test cases did you run? – Tomáš Zato Jun 21 '14 at 17:26
  • I was just wondering why this is happening, that's all. Anyway, it seems that `mb_strlen` is badly optimised and regexp really is the fastest solution until it's fixed. Oh, and as for selecting the answer: I posted my first comment before I even considered an answer of my own. Only when I realised that `===` is not as well known as I thought did I post an answer. – Tomáš Zato Jun 21 '14 at 17:43
  • And don't think that this applies to `strpos` too. `mb_strpos($str, "颜色", 0 ,"GBK"): 15.988190889 (89%)` `preg_match("/颜色/", $str): 1.022506952 (6%)` `strpos($str, "dh"): 0.934401989 (5%)` – Tomáš Zato Jun 21 '14 at 18:06