2

I am asking this question with script performance in mind. Knowing that PHP arrays don't perform very well, I am wondering which approach is best in this sort of situation.

Suppose we need action_a() to execute if $x equals a, b, c, or d, and action_b() to execute otherwise.

We can implement this with the || operator as follows:

if($x == 'a' || $x == 'b' || $x == 'c' || $x == 'd'){
       action_a();
}else{
       action_b();
}

Or we can implement it using in_array() as follows:

if(in_array($x,array('a','b','c','d'))){
       action_a();
}else{
       action_b();
}

What I would like to know is which of these two options performs better:

  1. when the number of possible values for $x is high?

  2. when the number of possible values for $x is low?

hakre
mithilatw
  • 5
    You can write a benchmark script. Loop over the functions a few million times and see what is the fastest... – Green Black Jan 28 '13 at 17:59
  • 2
    Saying "PHP's arrays don't perform very well" is like saying "cars aren't very fast". **Context** is pretty important. What is fast for others may be slow for you, and vice versa, and in practice they're almost certainly fast enough for your needs. – user229044 Jan 29 '13 at 04:08

4 Answers

6

Write a benchmark script.

In general though, which variant to pick should hardly ever depend on performance, especially in trivial cases where your input data is very small (say <10).

The most important criterion is always readability.

Only start optimizing code when there is an undeniable performance problem.

Premature optimization is the root of all evil.

Halcyon
3

For a high number of values, I wouldn't use either method. I would create an associative array whose keys were the possible values, and use isset():

$test_array = array_flip(array('a', 'b', 'c', 'd', ...));
if (isset($test_array[$x])) ...

This has one-time O(n) cost to create $test_array, then checking for a match is O(1).
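
A minimal, self-contained sketch of this lookup-table approach, assuming a fixed list of four allowed values. The names `$allowed` and `is_allowed()` are illustrative and not part of the answer above. Note that `array_flip()` maps the first value to `0`, and `isset()` still reports that key as present, because it only returns false for a missing key or a `NULL` value.

```php
<?php
// Hypothetical list of accepted values; in practice this could be much longer.
// array_flip() turns the values into keys: array('a' => 0, 'b' => 1, ...).
$allowed = array_flip(array('a', 'b', 'c', 'd'));

// Illustrative helper (an assumption, not from the original answer).
function is_allowed($x, array $lookup)
{
    // Hash lookup on the key; true for any non-NULL value, including 0.
    return isset($lookup[$x]);
}

var_dump(is_allowed('a', $allowed)); // true, even though 'a' maps to 0
var_dump(is_allowed('z', $allowed)); // false, key does not exist
```

The lookup table only pays off when it is built once and reused across many checks; rebuilding it for every check brings back the O(n) cost each time.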

Barmar
  • `Warning: array_key_exists() expects parameter 2 to be array` However, function call overhead is too high in PHP. – जलजनक Jan 28 '13 at 18:58
  • Fixed the order of arguments. `in_array()` has the same function call overhead, and I doubt this would be the bottleneck with a large number of values. – Barmar Jan 28 '13 at 19:04
  • instead of calling `array_key_exists()` just access the `$test_array[$x]`. Wasn't it the whole purpose of `array_flip()`? – जलजनक Jan 28 '13 at 19:09
  • @SparKot Accessing the element will warn about nonexistent index whenever it's not found. The cost of that warning probably negates the saved function call overhead. – Barmar Jan 28 '13 at 19:14
  • 1
    `isset` would be faster than `array_key_exists`. And it will not raise "undefined index" warning. – Purple Coder Jan 29 '13 at 09:18
  • @PurpleCoder True, but you need to ensure that none of the values map to 0, "", or false. Unfortunately, the simple `array_flip()` that I used above won't ensure this, because `"a" => 0`. – Barmar Jan 29 '13 at 09:35
  • `isset` returns false if the key does not exist or the corresponding value is `NULL`. For an existing key with a falsy value other than `NULL`, it returns true. – Purple Coder Jan 29 '13 at 10:29
3

It depends on the PHP version you are using. On PHP 5.3 in_array() will be slower. But in PHP 5.4 or higher in_array() will be faster.

Use in_array() only if you expect the condition to grow over time or need it to be dynamic.

I did a benchmark that loops over your conditions 10,000 times.

Result for PHP 5.3.10

+----------------------------+---------------------------+
| Script/Task name           | Execution time in seconds |
+----------------------------+---------------------------+
| best case in_array()       | 1.746                     |
| best case logical or       | 0.004                     |
| worst case in_array()      | 1.749                     |
| worst case logical or      | 0.016                     |
| in_array_vs_logical_or.php | 3.542                     |
+----------------------------+---------------------------+

Result for PHP 5.4

+----------------------------+---------------------------+
| Script/Task name           | Execution time in seconds |
+----------------------------+---------------------------+
| best case in_array()       | 0.002                     |
| best case logical or       | 0.002                     |
| worst case in_array()      | 0.008                     |
| worst case logical or      | 0.010                     |
| in_array_vs_logical_or.php | 0.024                     |
+----------------------------+---------------------------+

Best case: match on first element.
Worst case: match on last element.

This is the code.

$loop=10000;
$cases = array('best case'=> 'a', 'worst case'=> 'z');
foreach($cases as $case => $x){
    $a = utime();
    for($i=0;$i<$loop; $i++){
        $result = ($x == 'a' || $x == 'b' || $x == 'c' || $x == 'd' || $x == 'e' || $x == 'f' || $x == 'g' || $x == 'h' || $x == 'i' || $x == 'j' || $x == 'k' || $x == 'l' || $x == 'm' || $x == 'n' || $x == 'o' || $x == 'p' || $x == 'q' || $x == 'r' || $x == 's' || $x == 't' || $x == 'u' || $x == 'v' || $x == 'w' || $x == 'x' || $x == 'y' || $x == 'z');
    }
    $b = utime();
    $ar = array('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z');
    for($i=0;$i<$loop; $i++){
        $result = in_array($x, $ar);
    }
    $c = utime();

    $Table->addRow(array("$case in_array()", number_format($c-$b, 3)));
    $Table->addRow(array("$case logical or", number_format($b-$a, 3)));
}

Here utime() is a wrapper around microtime() that returns the current time in seconds as a float, and $Table is a Console_Table instance.
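
For readers who want to rerun a comparison like this without `utime()` or Console_Table, here is a rough standalone sketch. It assumes the four values from the question rather than the 26 used above, builds the array once outside every timed region (as the comments below suggest), and uses `microtime(true)` directly. The helper names `time_in_array()` and `time_logical_or()` are made up for illustration, and the numbers it prints will vary with PHP version, build, and machine.

```php
<?php
// Sketch only: not the benchmark that produced the tables above.
$loop = 10000;
$haystack = array('a', 'b', 'c', 'd'); // allocated once, outside the timed loops

// Illustrative helper: seconds spent on $loop calls to in_array().
function time_in_array($x, array $haystack, $loop)
{
    $start = microtime(true);
    for ($i = 0; $i < $loop; $i++) {
        $result = in_array($x, $haystack);
    }
    return microtime(true) - $start;
}

// Illustrative helper: seconds spent on $loop evaluations of the || chain.
function time_logical_or($x, $loop)
{
    $start = microtime(true);
    for ($i = 0; $i < $loop; $i++) {
        $result = ($x == 'a' || $x == 'b' || $x == 'c' || $x == 'd');
    }
    return microtime(true) - $start;
}

// Best case matches the first element, worst case the last.
foreach (array('best case' => 'a', 'worst case' => 'd') as $case => $x) {
    printf("%-22s %.6f s\n", "$case in_array()", time_in_array($x, $haystack, $loop));
    printf("%-22s %.6f s\n", "$case logical or", time_logical_or($x, $loop));
}
```

Running it several times and on the PHP version you actually deploy gives a more trustworthy picture than a single run.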

Shiplu Mokaddim
  • what if the number of comparisons to be made is greater than 50 or say 100? – जलजनक Jan 28 '13 at 19:21
  • @SparKot in_array will become too slow. Read the first paragraph of my answer. – Shiplu Mokaddim Jan 28 '13 at 19:25
  • 2
    **This answer is completely wrong**. *"Obviously in_array() will be slower as it has an extra function call overhead."* No, your `in_array` case is slower because you're declaring a massive array each time. The `array ('a', 'b', ... 'z')` is responsible for your numbers, **not** the call to `in_array`. Your benchmark is *hilariously* broken, and your conclusions are completely incorrect. – user229044 Jan 29 '13 at 04:18
  • @SparKot Please don't follow this answer's advice. If you actually write the benchmark properly, the `in_array` case is **faster**. `in_array` is *compiled C code*. Telling people that "function calls" are slower because of "function call overhead" is totally wrong and awful advice. Calling *built-in functions* is *always* preferable to rewriting similar functionality in PHP. Try the above code, but use `$arr = array('a', 'b', ..., 'z')` outside the loop and `$result = in_array($x, $arr)`. The results will be, at worst, nearly identical to the first case. – user229044 Jan 29 '13 at 04:26
  • @meagar thanks for pointing out some other problem. Let me benchmark it again. – Shiplu Mokaddim Jan 29 '13 at 07:18
  • 2
    @meagar I did the benchmark as you told. But the result does not change much. So it seems **your suggestion in the comment is completely wrong**. – Shiplu Mokaddim Jan 29 '13 at 07:23
  • 1
    @meagar I agree built-in functions are always preferable, but they are preferable compared to functions written in PHP. Here we are comparing a *language construct* with a built-in function. A language construct will always be faster: it is compiled directly to C statements, while a built-in function is compiled to a C function. – Shiplu Mokaddim Jan 29 '13 at 07:26
  • **You're still doing it wrong**. You need to move `array(...)` *fully outside the loop*. The vast, vast majority of the time spent in the second case is spent creating that array, it has *nothing to do with the function call*. Move `$ar = array('a', 'b', 'c', ..., )` **to the top of the loop, outside the second case, out from between $b and $c**. Put it directly under the line `$loop = 10000`. – user229044 Jan 29 '13 at 13:58
  • And you're still wrong. We're not comparing them to a language construct, we're comparing them to 26 instances of a language construct. The fact that they're on one line doesn't magically make them faster. PHP still has to parse, compile and evaluate 26 variables and 25 `||`'s and churn through them, where `in_array` drops down to low-level C code. You're reimplementing an algorithm in PHP with `||`'s and calling it a "language construct", but it's not; it's an algorithm implemented in PHP when a built-in equivalent exists. – user229044 Jan 29 '13 at 13:59
  • I showed 26 language constructs to create a worst case. Nothing else. I already said in the answer `if condition grows over time or its dynamic in_array() is convenient`. To make a benchmark I just went up to 26 constructs. 2-3 edits ago it was only 4 language constructs. – Shiplu Mokaddim Jan 29 '13 at 14:48
  • If the number of conditions is low you can do it with language constructs. No need to use in_array. If the number of conditions is high it's better to use `in_array()`. Writing everything on one single line with lots of language constructs does not make sense. At one end `||` **should be used** and at the other end `in_array()` **should be used**. But in any case the `||` solution will be faster. **Prove it wrong if you can** – Shiplu Mokaddim Jan 29 '13 at 14:52
  • If you actually rewrite your code correctly so that you're benchmarking the right thing, you'll see that **your own code proves you wrong**. `in_array() worst case 0.008399; logical or worst case 0.009269`. You're still benchmarking how fast PHP can allocate an array with `array(....)`. You're *benchmarking the wrong thing*. – user229044 Jan 29 '13 at 15:22
  • I've told you how to do this. Move the array allocation out of your second test case. – user229044 Jan 29 '13 at 15:26
  • You told me to move the array allocation out of the `loop`. I have done that, and there is no difference in the result. – Shiplu Mokaddim Jan 29 '13 at 15:29
  • Found the problem. It's a PHP `5.3` vs `5.4` issue. Will update it soon – Shiplu Mokaddim Jan 29 '13 at 15:41
  • Still wrong. I'm using PHP 5.3, and `in_array` is the same speed, or faster in the middle/worse case. But it's not about PHP versions, your advice is still fundamentally wrong. You should *never* avoid calling a function because of feared "function call overhead". – user229044 Jan 29 '13 at 16:39
  • The real take-away is this: PHP 5.4 is faster than 5.3, especially when compiled on your system with aggressive optimizations. Versions shipped with OS are slower. – Levi Morrison Jan 29 '13 at 16:40
  • @meagar I have tested this same script on 3 other systems. PHP 5.3 performs slower. – Shiplu Mokaddim Jan 29 '13 at 16:41
  • And yet, here I am, with evidence that directly contradicts your assertion. In PHP 5.3.15 and 5.2.17 there is no appreciable difference. *Regardless*, the fact remains that avoiding invocation of built-in functions for performance reasons is terrible advice to be handing out. – user229044 Jan 29 '13 at 16:46
  • @LeviMorrison The real take-away is that benchmarking two fundamentally different pieces of code is a fool's errand, and that saying "`in_array` is slower than logical or-ing" is like saying that apples are faster than bananas. – user229044 Jan 29 '13 at 16:58
  • @meagar . . . and you are spending so much energy trying to prove a fool's errand to be just that. I make my living tuning code for speed, and comparing two fundamentally different pieces of code that accomplish the same goal is *critical* to success. Is this particular user's goal pointless? Probably, but you can't extend that to ALL code everywhere . . . – Levi Morrison Jan 29 '13 at 18:58
  • I tried again, with some changes, with versions 5.3, 5.6 and 7.0.1. The `in_array` function won in versions 5.6 and 7.0.1. The version 7 showed the biggest difference. http://codepad.org/EmWVxweb – Vinicius Monteiro Oct 02 '18 at 17:32
0

Your first solution performs well as long as you don't need to change anything later, but the readability of the code gets worse the more values you have to check.

With an array you can extend the list dynamically if you need to. It also keeps your code clean.

As far as I know, the in_array function has fairly low performance compared to a manual search with a loop.

Also, you can declare a so-called "map":

$actions = [
  "a" => function(){ action_a() ; },
  "b" => function(){ action_b() ; }
] ;

And afterwards, use it like this:

if (isset($actions[$x])) 
  $actions[$x]() ;
else
  do_smth() ;
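
The map idea above can be tried out with a small runnable sketch. The function bodies here are placeholders invented so the example runs on its own, and `dispatch()` is a hypothetical helper name rather than anything from the answer; only the map-plus-`isset()` pattern comes from the text.

```php
<?php
// Placeholder actions with made-up return values, so the sketch is self-contained.
function action_a() { return 'A'; }
function action_b() { return 'B'; }
function do_smth()  { return 'default'; }

// Map from input value to the closure that handles it.
$actions = array(
    'a' => function () { return action_a(); },
    'b' => function () { return action_b(); },
);

// Hypothetical helper: run the mapped action, or fall back to do_smth().
function dispatch($x, array $actions)
{
    if (isset($actions[$x])) {
        return $actions[$x]();
    }
    return do_smth();
}

echo dispatch('a', $actions), "\n"; // prints "A"
echo dispatch('x', $actions), "\n"; // prints "default"
```

Adding a new case then means adding one entry to `$actions` instead of editing the condition.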

A small tip: If you are using PHP >=5.4 you can declare a new array just like this:

$array = [1,2,3,4,5] ;
$array[] = "I am a new value to push" ;
sybear