This is not an answer, but its content cannot be formatted properly in a comment. It also cannot stay in my answer, as it is technically not part of it.
I generated a benchmark for the three solutions provided by @deceze plus my own solution and ran it using PHP 7.0. Everything below applies only to PHP 7.x; PHP 5 runs the code much slower and requires more memory.
I started by running the code 1,000,000 times over a small list of 100 items, then iteratively divided the number of iterations by 10 while multiplying the list length by 10.
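The benchmarked implementations are not reproduced in this post. As a reference, here is a minimal sketch of the four shapes being compared, assuming the task is to find the smallest non-zero value of a column; the 'price' column name and the sample data are placeholders, not the ones from the question:

```php
<?php
// Hypothetical input shape: a list of rows with a numeric 'price' column.
$items = [['price' => 5], ['price' => 0], ['price' => 3], ['price' => 8]];

// 1. array_filter(): extract the column, drop falsy values, take the min.
$min1 = min(array_filter(array_column($items, 'price')));

// 2. foreach: a plain loop with a scalar accumulator, no temporary arrays.
$min2 = PHP_INT_MAX;
foreach ($items as $item) {
    if ($item['price'] !== 0 && $item['price'] < $min2) {
        $min2 = $item['price'];
    }
}

// 3. reduce, @deceze-style: the callback uses only comparisons,
//    no function calls inside it.
$min3 = array_reduce($items, function ($carry, $item) {
    return $item['price'] !== 0 && $item['price'] < $carry
        ? $item['price'] : $carry;
}, PHP_INT_MAX);

// 4. reduce, @axiac-style: the callback calls min() once per element.
$min4 = array_reduce($items, function ($carry, $item) {
    return $item['price'] !== 0 ? min($carry, $item['price']) : $carry;
}, PHP_INT_MAX);

var_dump($min1, $min2, $min3, $min4); // int(3) four times
```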
Here are the results:
```
$ php bench.php 100 1000000
Generating 100 elements... Done. Time: 0.000112 seconds.
array_filter(): 3.265538 seconds/1000000 iterations. 0.000003 seconds/iteration.
foreach       : 3.771463 seconds/1000000 iterations. 0.000004 seconds/iteration.
reduce @deceze: 6.869162 seconds/1000000 iterations. 0.000007 seconds/iteration.
reduce @axiac : 8.599051 seconds/1000000 iterations. 0.000009 seconds/iteration.

$ php bench.php 1000 100000
Generating 1000 elements... Done. Time: 0.000750 seconds.
array_filter(): 3.024423 seconds/100000 iterations. 0.000030 seconds/iteration.
foreach       : 3.997505 seconds/100000 iterations. 0.000040 seconds/iteration.
reduce @deceze: 6.669426 seconds/100000 iterations. 0.000067 seconds/iteration.
reduce @axiac : 8.342756 seconds/100000 iterations. 0.000083 seconds/iteration.

$ php bench.php 10000 10000
Generating 10000 elements... Done. Time: 0.002643 seconds.
array_filter(): 2.913948 seconds/10000 iterations. 0.000291 seconds/iteration.
foreach       : 4.190049 seconds/10000 iterations. 0.000419 seconds/iteration.
reduce @deceze: 9.649768 seconds/10000 iterations. 0.000965 seconds/iteration.
reduce @axiac : 11.236113 seconds/10000 iterations. 0.001124 seconds/iteration.

$ php bench.php 100000 1000
Generating 100000 elements... Done. Time: 0.042237 seconds.
array_filter(): 90.369577 seconds/1000 iterations. 0.090370 seconds/iteration.
foreach       : 15.487466 seconds/1000 iterations. 0.015487 seconds/iteration.
reduce @deceze: 19.896064 seconds/1000 iterations. 0.019896 seconds/iteration.
reduce @axiac : 15.056250 seconds/1000 iterations. 0.015056 seconds/iteration.
```
For lists up to about 10,000 elements the results are consistent and match the expectations: array_filter() is the fastest, foreach comes close, and the two array_reduce() solutions follow, ordered by the number of functions they call (@deceze's is faster because its callback doesn't call any function, mine calls min() once). Even the total running time feels consistent.
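The cost of that extra call is easy to see in isolation. Here is a rough probe, not part of the original benchmark (absolute numbers will vary by machine):

```php
<?php
// Compare the two update strategies used by the reduce callbacks:
// a plain comparison expression vs. one min() call per element.
$n = 1000000;

$start = microtime(true);
$acc = PHP_INT_MAX;
for ($i = 0; $i < $n; $i++) {
    $acc = $i < $acc ? $i : $acc;   // no function call (@deceze-style)
}
printf("comparison: %.3f s\n", microtime(true) - $start);

$start = microtime(true);
$acc = PHP_INT_MAX;
for ($i = 0; $i < $n; $i++) {
    $acc = min($acc, $i);           // one internal call per item (@axiac-style)
}
printf("min():      %.3f s\n", microtime(true) - $start);
```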
The value of 90 seconds for the array_filter() solution on the 100,000-item list looks out of place, but it has a simple explanation: both array_filter() and array_column() generate new arrays. They allocate memory and copy the data, and this takes time. Add the time needed by the garbage collector to free all the small memory blocks used by a list of 100,000 small arrays, and the running time goes up fast.
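A hedged illustration of that allocation cost (the 'price' column and the data are placeholders): the two temporary arrays exist in full before min() ever runs, while a foreach keeps only a scalar accumulator.

```php
<?php
// Measure the extra memory held by the intermediate arrays that
// array_column() and array_filter() create.
$items = [];
for ($i = 0; $i < 100000; $i++) {
    $items[] = ['price' => random_int(0, 100)]; // hypothetical column
}

$before  = memory_get_usage();
$column  = array_column($items, 'price'); // temporary array #1
$nonZero = array_filter($column);         // temporary array #2
$after   = memory_get_usage();

printf("Temporary arrays cost ~%.1f MB\n", ($after - $before) / 1048576);
// The foreach solution never allocates these arrays, so it avoids both
// the copying and the later garbage collection work.
```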
Another interesting result for the 100,000-item list is that my solution using array_reduce() is as fast as the foreach solution, and faster than @deceze's solution using array_reduce(). I don't have an explanation for this result.
I tried to find the thresholds where these things start to happen. For this I ran the benchmark with different list sizes, starting from 5,000 and increasing the size by 1,000, while keeping the total number of visited items at 100,000,000. The results can be found here.
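Before the numbers, a sketch of how such a sweep can be driven, reusing bench.php's size/iterations arguments shown above (the upper bound of 30,000 is an arbitrary assumption):

```php
<?php
// Grow the list by 1,000 per run and shrink the iteration count so that
// size * iterations stays at roughly 100,000,000 visited items.
for ($size = 5000; $size <= 30000; $size += 1000) {
    $iterations = intdiv(100000000, $size);
    passthru(sprintf('php bench.php %d %d', $size, $iterations));
}
```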
The results are surprising. For some list sizes (8,000, 11,000, 12,000, 13,000, 17,000 items), the array_filter() solution needs about 10 times more time to complete than any solution that uses array_reduce(). For other list sizes, however, it gets back on track and completes the 100 million node visits in about 3 seconds, while the time needed by the other solutions increases steadily as the list length increases.
I suspect the culprit for the jumps in the time needed by the array_filter() solution is PHP's memory allocation strategy. For some lengths of the initial array, the temporary arrays returned by array_column() and array_filter() probably trigger more memory allocation and garbage collection cycles than for other sizes. Of course, it is possible that the same behaviour happens at other sizes I didn't test.
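One way to probe this theory (a rough sketch, not the original benchmark): watch when appending to an array actually allocates memory. PHP 7 grows the underlying hash table in doubling steps, so reallocations cluster around powers of two, which might make some list sizes systematically more expensive than their neighbours:

```php
<?php
// Print every point where appending to a growing array changes the
// reported memory usage; the jumps land near powers of two.
$a = [];
$prev = memory_get_usage();
for ($i = 1; $i <= 20000; $i++) {
    $a[] = $i;
    $now = memory_get_usage();
    if ($now !== $prev) {
        printf("at %6d elements: +%6d bytes\n", $i, $now - $prev);
        $prev = $now;
    }
}
```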
Somewhere around 16,000...17,000 items in the list, my solution starts running faster than @deceze's solution using array_reduce(), and around 25,000 items it starts performing as fast as the foreach solution (sometimes even faster). Also, for lists longer than 16,000...17,000 items, the array_filter() solution consistently needs more time to complete than the others.
The benchmark code can be found here. Unfortunately, it cannot be executed on 3v4l.org for lists larger than 15,000 elements because it reaches the memory limit imposed by the system. Its results for lists larger than 5,000 items can be found here.
The code was executed using PHP 7.0.20 CLI on Linux Mint 18.1. No APC or any other kind of cache was involved.
Conclusion
For small lists, up to 5,000 items, use the array_filter(array_column()) solution: it performs well at this size and looks neat.

For lists larger than 5,000 items, switch to the foreach solution. It doesn't look as nice, but it runs fast and needs no extra memory. Stick with it as the list size grows.

For hackathons, interviews, and looking smart in front of your colleagues, use either array_reduce() solution. It shows your knowledge of PHP's array functions and your understanding of the "callback" programming concept.