27

What's faster in PHP, making a large switch statement, or setting up an array and looking up the key?

Before you answer: I am well aware that for pure lookups the array is faster. But that assumes the array is created once and then looked up repeatedly.

That's not what I'm doing. Each run through the code is new, and the array will be used just once each time. So all the array hashes need to be calculated fresh on every run, and I'm wondering whether that setup is slower than simply having a switch statement.
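
For concreteness, the two alternatives being compared might look like the minimal sketch below. The function names `lookupWithSwitch` and `lookupWithArray` and the sample keys are hypothetical and not taken from the original code; the nullable return types assume PHP 7.1 or later.

```php
<?php
// Hypothetical example: map a short code to a label in two ways.

// Variant 1: a switch statement, evaluated on each call.
function lookupWithSwitch(string $key): ?string
{
    switch ($key) {
        case 'aaa': return 'first';
        case 'aab': return 'second';
        case 'aac': return 'third';
        default:    return null;
    }
}

// Variant 2: build the array, then use it for a single lookup.
function lookupWithArray(string $key): ?string
{
    $map = array(
        'aaa' => 'first',
        'aab' => 'second',
        'aac' => 'third',
    );
    return $map[$key] ?? null;
}
```

Both functions return the same label for the same key; the question is only which form costs less when the whole script is parsed and run once.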

Peter Mortensen
Ariel
  • 9
    Try it. Write a PHP script to run through each method a few hundred thousand times and print out the duration for each of the two methods. These kinds of optimizations rarely make a significant difference in the long run though. – Rich Adams Jul 27 '11 at 23:59
  • 2
    "Does it even matter"? If there has been no performance analysis then -- no, it doesn't. Use whatever is more clear and maintainable. (It may be neither of the above). –  Jul 28 '11 at 00:00
  • 2
    These performance "optimizations" will not likely yield a lot of results in the long run, maybe a few milliseconds or so. When you actually deploy this in a large-scale environment, the computing power that you'll get will likely render these optimizations negligible. Just my opinion, though. Try it out and see. Run it through a stress test. Experiment is the way to get the answer :) – Jimmie Lin Jul 28 '11 at 00:15
  • 1
    I have not tried any performance analysis, but I would assume that the microoptimization would be to store the hashes so that the hashes do not need to be calculated fresh each time. – emory Jul 28 '11 at 00:20
  • Are you planning on writing your array/switch to a php file (according to how your test appears to work)? – Jared Farrish Jul 28 '11 at 00:25
  • 1
    @jared-farrish I could do my code two ways, so I was going to choose whichever is faster. (At least if there was a large difference.) – Ariel Jul 28 '11 at 00:28
  • What happens if you want to do something more complicated than echoing back the value? i.e. you may have to store a function reference in your array and call it, which I imagine would affect the speed a tad. You should also probably randomize the index accessed each time. The switch might be faster up to a point and will be even slower past that point. Either way, you're still going through 17,000,000 runs in less than 10 seconds, so the readability of the switch wins in my mind. – jswolf19 Jul 28 '11 at 03:18
  • If hashes need to be recalculated every time, that would mean the `switch/case` conditions need to be 'dynamic' too. How are you planning on implementing that? Or did I understand it wrong? – Mchl Jul 28 '11 at 04:00
  • **So I think I will ignore this optimization and go with whatever is easier** Do this for every other "optimization" you think of, and you'll be on the right track. These kinds of micro-optimizations are absolutely useless and they'll sap your productivity for no measurable benefit in real-world code. – user229044 Jul 28 '11 at 04:06
  • 1
    @Mchl they are recalculated when php opens and parses the file. – Ariel Jul 28 '11 at 04:10
  • @meagar But I didn't know in advance that it was a micro optimization. I was wondering if there would be a huge difference. – Ariel Jul 28 '11 at 04:12
  • @Ariel Optimizing algorithms yields noticeable performance increases. Optimizing syntax is a waste of time; use what is readable and maintainable. – user229044 Jul 28 '11 at 04:14
  • @meagar, There's no guarantee that both solutions are optimized and none of them are performing in a [degraded O(n) way](http://en.wikipedia.org/wiki/Joel_Spolsky#Schlemiel_the_Painter.27s_algorithm). This can be an algorithmic optimization, not a syntax optimization. – Pacerier Mar 05 '15 at 19:20

2 Answers

17

I did some tests:

File array_gen.php:

    <?php
    // Emit a PHP script that builds a 10,000-entry array and does one lookup.
    echo '<?php
        $a = 432;
        $hash = array(
    ';

    for ($i = 0; $i < 10000; $i++) {
        echo "$i => $i,\n";
    }

    echo ');
        echo $hash[$a];
    ';

File switch_gen.php:

    <?php
    // Emit a PHP script with a 10,000-case switch that matches one value.
    echo '<?php
        $a = 432;
        switch ($a) {
    ';

    for ($i = 0; $i < 10000; $i++) {
        echo "case $i: echo $i; break;\n";
    }

    echo '}';

Then:

php array_gen.php > array.php
php switch_gen.php > switch.php

time tcsh -c 'repeat 1000 php array.php > /dev/null'
19.297u 4.791s 0:25.16 95.7%
time tcsh -c 'repeat 1000 php switch.php > /dev/null'
25.081u 5.543s 0:31.66 96.7%

Then I modified the loop to:

    // range() covers all 26 letters; a `$i < 'z'` bound would stop at 'y'
    foreach (range('a', 'z') as $i)
      foreach (range('a', 'z') as $j)
        foreach (range('a', 'z') as $k)

This creates 17,576 (26^3) three-letter combinations.

time tcsh -c 'repeat 1000 php array.php > /dev/null'
30.916u 5.831s 0:37.85 97.0%
time tcsh -c 'repeat 1000 php switch.php > /dev/null'
36.257u 6.624s 0:43.96 97.5%

The array method wins every time, even once you include setup time. But not by a lot: about 6 seconds of elapsed time over 1,000 runs, or roughly 6 ms per run. So I think I will ignore this optimization and go with whatever is easier.
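
The timings above rely on tcsh's `repeat`. A rough, portable approximation of the same measurement can be written in PHP itself. This is only a sketch: `timeRuns` is a hypothetical helper name, it assumes the CLI interpreter with `exec()` enabled, and it reports elapsed wall-clock time only, not the user/system CPU split that `time` prints.

```php
<?php
// Hypothetical helper: run a script $runs times, each in a fresh PHP
// process, and return the total elapsed wall-clock time in seconds.
function timeRuns(string $script, int $runs): float
{
    // PHP_BINARY is the interpreter running this script (CLI assumed).
    $cmd = escapeshellarg(PHP_BINARY) . ' ' . escapeshellarg($script);

    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $output = array();          // exec() appends, so reset each time
        exec($cmd, $output, $status);
        if ($status !== 0) {
            throw new RuntimeException("Run failed: $script");
        }
    }
    return microtime(true) - $start;
}
```

For example, `printf("%.2f s\n", timeRuns('array.php', 1000));` followed by the same call for `switch.php` would give two totals to compare. Each run starts a new process, so the figures include interpreter startup and parsing, as in the original test.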

Peter Mortensen
Ariel
  • 6
    6 second difference for 1,000 runs. So that's a gain of about 0.006 seconds per run. Use whatever is easier to program. This is *almost* always the correct thing to do by the way, since better hardware is almost always cheaper than the time you'll spend over-optimizing stuff (which usually leaves the code less readable == less maintainable == more work to change it == even more costs of optimization). – Martin Tournoij Jul 31 '11 at 08:25
  • 2
    @Ariel, -1. Which PHP engine (version)? Which OS? If you want to do tests, do them **elaborately**. The results of half-hearted tests are **harmful**. They cannot be trusted and their conclusion should be taken with a huge grain of salt. The question is still open. – Pacerier Mar 05 '15 at 19:29
  • What are the four columns in the output of `time` and what are units? They do not correspond to [the man page](https://linux.die.net/man/1/time) (*(i) the elapsed real time, (ii) the user CPU time, and (iii) the system CPU time*). Which of the four columns are reasonable to use comparison and why? – Peter Mortensen Feb 02 '20 at 21:41
  • @PeterMortensen They are User CPU Time, System CPU Time, Elapsed Time, Percent of CPU used. Use either the first column or the second column to compare. – Ariel Feb 03 '20 at 00:15
6

It depends somewhat on the number of entries, but for most practical purposes you can consider the array faster. The reason is simple: a switch statement compares the value against each case in turn until one matches, whereas the array approach hashes the key and goes directly to that entry. When the switch has so few cases that the sequential comparisons cost less than the hashing, the switch is faster, but the array approach quickly becomes more efficient as the number of entries grows. In computer science terms, it's a question of O(n) vs. O(1) per lookup.
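
The sequential behavior described above can be made visible with a small sketch. The names `probe`, `switchLookup`, and `$evaluated` are hypothetical. The case labels are deliberately written as function calls so that each evaluation can be counted; for plain constant labels an engine is free to optimize the dispatch, so this illustrates the language semantics rather than measuring any particular PHP version.

```php
<?php
// Counts how many case expressions were evaluated.
$evaluated = 0;

// Hypothetical helper: returns its argument and records the call.
function probe(int $value): int
{
    global $evaluated;
    $evaluated++;
    return $value;
}

// Case labels are evaluated top to bottom until one matches.
function switchLookup(int $key): string
{
    switch ($key) {
        case probe(1): return 'one';
        case probe(2): return 'two';
        case probe(3): return 'three';
        default:       return 'none';
    }
}

$viaSwitch = switchLookup(3);   // evaluates all three labels in order
$switchSteps = $evaluated;      // 3

// The array version hashes the key and goes straight to the entry.
$map = array(1 => 'one', 2 => 'two', 3 => 'three');
$viaArray = $map[3];
```

Matching the last of n cases takes n label evaluations here, while the array lookup does not depend on the position of the entry.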

Paul Sonier
  • 4
    My question was also about the speed of creating the array in the first place, since the array is used just once. – Ariel Jul 28 '11 at 00:11
  • @Ariel: yes, I understand that; the array creation involves hashing the key to insert to the array; the hash to insert the key and the hash to look up a key are both O(1). – Paul Sonier Jul 28 '11 at 04:18
  • The hash to insert the key is O(n), not O(1) since you have to insert all of them before you can lookup any. But the switch (on average) is O(n/2). – Ariel Jul 28 '11 at 06:00
  • 2
    @PaulSonier, Why does a switch statement need to compare sequentially? That *may* be what Zend engine's implementation is doing right now, but there is no such theoretical limitation and in the future it may well be that they do **not** compare sequentially. – Pacerier Mar 05 '15 at 19:17
  • @Pacerier, well, this wouldn't be easy without breaking BC. There is also this weird switch(TRUE)-Syntax which assumes that it will be called sequentially (`switch(TRUE) { case ($a == $b): return 1; case ($a == 0): return 2; }`) – giraff Sep 01 '16 at 17:03
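
To illustrate the `switch(TRUE)` point from the last comment, here is a minimal sketch built on the snippet quoted there; the function name `classify` and the `default` branch are assumptions added for the example. When more than one case is true, the first one listed wins, so the result depends on the cases being tested in order.

```php
<?php
// Hypothetical wrapper around the switch(true) snippet from the comment.
function classify(int $a, int $b): int
{
    switch (true) {
        case ($a == $b): return 1;
        case ($a == 0):  return 2;
        default:         return 0;   // added for the example
    }
}

$both   = classify(0, 0);  // both conditions hold; the first case wins: 1
$second = classify(0, 5);  // only the second condition holds: 2
$none   = classify(3, 5);  // neither condition holds: 0
```

Swapping the two cases would change the result of `classify(0, 0)` from 1 to 2, which is why code like this relies on sequential evaluation.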