7

This is something I've been wondering for a while. Is performance always better when using native PHP functions than their PHP loop equivalents? Why or why not?

Here are two examples to illustrate the question:

  • Say I have a large array with 1000 elements. Each value is just an integer userid. I want to see if a particular userid is in the array, and I happen to know it will be towards the end of the array (as it would have been added recently). I have two options: a) do a regular PHP foreach loop that goes through the whole array until it finds the userid, or b) do an array_reverse() on the array and do the same foreach loop until it finds the id (if it exists, it will end this loop sooner). Which is faster?

My gut tells me the first option is faster since it does one loop, and the second option is slower because behind the scenes, array_reverse() is also doing some kind of loop (in C), thereby requiring two loops for the operation.
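As a rough sketch of the options being compared (the function names and sample data here are hypothetical, and the code assumes PHP 7+ for the type declarations), the forward scan, the `array_reverse()` variant, and the backwards `for` loop suggested in the comments could look like this:

```php
<?php
// Hypothetical data: 1000 integer user ids, with the target near the end.
$userIds = range(1, 1000);
$target = 990;

// Option a: one forward pass in PHP; stops at the first match.
function findForward(array $ids, int $target): bool
{
    foreach ($ids as $id) {
        if ($id === $target) {
            return true;
        }
    }
    return false;
}

// Option b: array_reverse() builds a reversed copy (a full pass in C),
// then the PHP loop stops early because the target is now near the front.
function findViaReverse(array $ids, int $target): bool
{
    foreach (array_reverse($ids) as $id) {
        if ($id === $target) {
            return true;
        }
    }
    return false;
}

// Variant from the comments: walk the indexes backwards without copying.
// Assumes a list with consecutive integer keys 0..n-1.
function findBackward(array $ids, int $target): bool
{
    for ($i = count($ids) - 1; $i >= 0; $i--) {
        if ($ids[$i] === $target) {
            return true;
        }
    }
    return false;
}

var_dump(findForward($userIds, $target));
var_dump(findViaReverse($userIds, $target));
var_dump(findBackward($userIds, $target));
```

All three return the same answer; they differ only in how many elements are visited in PHP code versus inside the built-in function.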

  • If I have a large multidimensional array where each value is [messageid, message], will using a foreach loop to find a particular message by id be slower than setting the messageid as the key of each element and doing isset($array[$messageid]) && $array[$messageid]?
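A minimal sketch of the two approaches, assuming PHP 7.1+ and using made-up sample data and hypothetical function names: the first scans every pair until the id matches, the second re-keys the array once with `array_column()` and then does a direct keyed lookup.

```php
<?php
// Hypothetical data in the shape described above: [messageid, message].
$messages = [
    [101, 'first'],
    [205, 'second'],
    [309, 'third'],
];

// Linear scan: the work grows with the number of messages.
function findMessageByScan(array $messages, int $messageId): ?string
{
    foreach ($messages as [$id, $text]) {
        if ($id === $messageId) {
            return $text;
        }
    }
    return null;
}

// Re-key once: column 1 (message) becomes the value, column 0 (messageid) the key.
$byId = array_column($messages, 1, 0);

// Keyed lookup: a hash-table access instead of a loop.
// The ?? operator behaves like isset() here and yields null for a missing id.
function findMessageByKey(array $byId, int $messageId): ?string
{
    return $byId[$messageId] ?? null;
}

var_dump(findMessageByScan($messages, 205));
var_dump(findMessageByKey($byId, 205));
```

The re-keying step is itself a full pass over the array, so it tends to pay off only when the same array is looked up more than once.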

Edit: just to clarify, I'm aware the first example can use array_search(), since this is a very simplified example. The main question isn't which one is faster (as benchmarks can let me know easily), but why, as in, what's going on behind the scenes?

timetofly
  • 4
    "I want to see if a particular userid is in the array" - use `array_search()` or `in_array()`, which will be faster than scanning the loop with `foreach()` in either direction. However, I should think that reversing and finding it quickly might be faster than finding it slowly under certain array sizes. – halfer Sep 04 '14 at 18:31
  • 3
    Don't ask questions like these, instead perform your own benchmarks and profiling, then publish your results. – Dai Sep 04 '14 at 18:32
  • This cannot be *generally* answered, since everything is "native PHP" in some way. In your first example, `array_reverse` is a somewhat large operation, but doing it *may* be faster than a `foreach` loop with some additional comparison. Of course, `for ($i = count($arr) - 1; $i >= 0; $i--)` will likely be even faster. In your second case accessing an element by index will be orders of magnitude faster than looping, they're entirely different classes of operation. – deceze Sep 04 '14 at 18:33
  • What @halfer said. Also, if the array is sorted then implementing a simple [binary search](http://en.wikipedia.org/wiki/Binary_search_algorithm) will also be faster than scanning in either direction. I do not know why you think searching the array backwards would be any faster. – Sammitch Sep 04 '14 at 18:33
  • @halfer so I'm wondering why `array_search()` would be faster than using a foreach loop? Behind the scenes in the PHP source code (in C), aren't they doing the same thing? – timetofly Sep 04 '14 at 18:34
  • language constructs vs api functions – Ryan Sep 04 '14 at 18:35
  • No, behind the scenes is C code compiled down to machine code - once it gets to that level, there is no (slow) interpretation step. – halfer Sep 04 '14 at 18:35
  • 2
    @Dai he is not asking whether it is faster, he is asking why it is. Benchmarking won't tell you that. – nico Sep 04 '14 at 18:36
  • The "behind the scenes" code can be optimised a lot more than your PHP code could. Not only in terms of direct memory access etc. possible in C, but also in terms of "purpose built". A `foreach` loop is a general purpose tool, a specific array search algorithm may be completely different. – deceze Sep 04 '14 at 18:36
  • @halfer I think that's exactly the answer I was looking for. PHP first has to compile the code it runs into machine code and then run it. But doesn't the compilation run only once, when the script is initialized? – timetofly Sep 04 '14 at 18:37
  • I believe PHP does two compilation steps - first into bytecode (a sort of virtual machine code) and then into native machine code. If you have an op-code cache the first step can be skipped in some cases. The detail of what gets compiled and run when is outside my knowledge, but the broad answer is calling PHP functions is generally faster than writing your own in PHP. – halfer Sep 04 '14 at 18:38
  • Even worse, sometimes using a combination of built-in functions is better than the dedicated built-in function: http://stackoverflow.com/questions/8321620/array-unique-vs-array-flip IMO, such concerns should be addressed case by case, and only when you have huge performance issues (or if you already know a faster way to do the job, of course). – Clément Malet Sep 04 '14 at 18:39
  • @ClémentMalet that's not necessarily because one algorithm is better than the other, but because `array_flip()` doesn't care about duplicate keys. – Sammitch Sep 04 '14 at 18:45
  • 3
    The larger your data set, the less important the overhead of the programming language becomes and the more important the efficiency of your algorithm is. So for smaller datasets an algorithm with more operations (i.e. 2 loops in your example) may be faster using native PHP functions, but for larger datasets it will usually come down to the algorithm and not whether it's using native functions. – FuzzyTree Sep 04 '14 at 18:46
  • @Sammitch Right, it was just an example with built-in versus built-in, we're not gonna list all the examples and use cases. ;) – Clément Malet Sep 04 '14 at 18:48
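To check the comments' claim on a particular setup, a small timing sketch along these lines may help. The helper names `manualInArray()` and `averageSeconds()` are hypothetical, the array size and run count are arbitrary, and the numbers will vary by PHP version and machine, so no expected output is shown.

```php
<?php
// Hand-written equivalent of a strict in_array(), executed by the PHP interpreter.
function manualInArray(array $haystack, int $needle): bool
{
    foreach ($haystack as $value) {
        if ($value === $needle) {
            return true;
        }
    }
    return false;
}

// Runs $fn $runs times and returns the average wall-clock seconds per call.
function averageSeconds(callable $fn, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

$ids = range(1, 1000);   // hypothetical user ids
$needle = 990;           // target near the end, as in the question

$builtin = averageSeconds(function () use ($ids, $needle) {
    return in_array($needle, $ids, true);
}, 1000);

$manual = averageSeconds(function () use ($ids, $needle) {
    return manualInArray($ids, $needle);
}, 1000);

printf("in_array: %.9f s per call\n", $builtin);
printf("foreach:  %.9f s per call\n", $manual);
```

Both versions visit the same elements; any gap in the timings reflects per-iteration interpreter overhead rather than a different algorithm.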

2 Answers

1

You can see PHP benchmarks of the time consumed by various functions at the link below; in my opinion the differences are not large. Why? A hand-written loop made of basic comparisons does less work per step than a function call, but then again built-in functions such as array_reverse() are themselves built from basic operations and are not particularly expensive in time or processing.

http://www.phpbench.com/

EDIT: I agree with FuzzyTree in the fact that you should focus on the efficiency of the algorithm and not the functions themselves.

Carlos
-2

The array_xxx() functions are built from loops internally, and they are about as optimized as they can be. I don't think you can do any better on your own with "handmade" foreach/for/while loops, so I prefer using the built-in functions.
(By the way, in the most primitive form of a programming language there is only one kind of loop, but we are able to shape it into three forms :) )

  • There's some detail missing here. `array_search` is faster because it calls a native machine-code function, whereas the equivalent PHP has parsing and interpretation steps that slow it down. That said, the array functions are probably as optimised as they can be in PHP, but if a programmer needed more speed, they'd write the program in C themselves, and the result could well be faster. This is because a native C program probably does not need the complexity of PHP's internal array system. – halfer Sep 04 '14 at 18:50
  • @halfer I thought we were talking within the bounds of PHP, so why bring native C into it? Someone else could say that you should use assembler and make it even faster, man. – Juraj Carnogursky Sep 04 '14 at 21:32