30

PMD tells me

A switch with less than 3 branches is inefficient, use a if statement instead.

Why is that? Why 3? How do they define efficiency?

James Raitsev
  • 3
    PMD scans Java source code and looks for potential problems like possible bugs, dead code, suboptimal code, overcomplicated expressions and duplicate code. (Hover over the tags) – Chords May 05 '12 at 04:00
  • 10
    It should also scan itself for grammar. "Less" should be "fewer." :) – yshavit May 05 '12 at 04:01
  • @jmort253 Updated question to include link – James Raitsev May 05 '12 at 04:04
  • 2
    @yshavit less is suitable and so is fewer, the connotation is enough in this instance. http://www.cracked.com/blog/7-commonly-corrected-grammar-errors-that-arent-mistakes/ – Mike McMahon May 05 '12 at 04:09
  • 5
    Don't push your descriptivist grammar on me, @MikeMcMahon! I like my grammar like I like my glaciers: old, frozen and moving as slowly as possible. – yshavit May 05 '12 at 04:17
  • 4
    obligatory "premature optimization is the root of all evil" etc – zzzzBov May 05 '12 at 06:44
  • 1
    I would use the Java structure that's the easiest to read, regardless of the efficiency. Chances are, if you have to optimize, it won't be in this section. Also, if you must have the performance, Java is the wrong language. – Tony Ennis May 05 '12 at 13:41

4 Answers

39

Because a switch statement is compiled to one of two special JVM instructions, lookupswitch or tableswitch. They are useful when there are many cases, but they add overhead when there are only a few branches.

An if/else statement is instead compiled into a chain of compare-and-branch instructions (IF_ICMPNE in the bytecode below, much like the je/jne chains of native code). Each one is cheap, but a long chain of branches requires many more comparisons.

You can see the difference by looking at the bytecode. In any case, I wouldn't worry about these issues: if any of this became a problem, the JIT would take care of it.
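
As a rough illustration of the two instructions, here is a minimal sketch. The class and method names are hypothetical, and which instruction javac picks is its own decision: with consecutive case values it typically emits tableswitch, and with widely spaced values it typically emits lookupswitch.

```java
// Hypothetical class and method names, used only for illustration.
final class SwitchShapes {
    // Dense, consecutive case values: javac typically emits tableswitch.
    static String dense(int i) {
        switch (i) {
            case 1: return "Foo";
            case 2: return "Baz";
            case 3: return "Bar";
            default: return null;
        }
    }

    // Sparse case values: javac typically emits lookupswitch.
    static String sparse(int i) {
        switch (i) {
            case 1: return "Foo";
            case 100: return "Baz";
            case 10000: return "Bar";
            default: return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(dense(2));
        System.out.println(sparse(100));
    }
}
```

Compiling this with `javac SwitchShapes.java` and running `javap -c SwitchShapes` should show which instruction was chosen for each method.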

Practical example:

switch (i)
{
  case 1: return "Foo";
  case 2: return "Baz";
  case 3: return "Bar";
  default: return null;
}

is compiled into:

L0
 LINENUMBER 21 L0
 ILOAD 1
 TABLESWITCH
   1: L1
   2: L2
   3: L3
   default: L4
L1
 LINENUMBER 23 L1
FRAME SAME
 LDC "Foo"
 ARETURN
L2
 LINENUMBER 24 L2
FRAME SAME
 LDC "Baz"
 ARETURN
L3
 LINENUMBER 25 L3
FRAME SAME
 LDC "Bar"
 ARETURN
L4
 LINENUMBER 26 L4
FRAME SAME
 ACONST_NULL
 ARETURN

While

if (i == 1)
  return "Foo";
else if (i == 2)
  return "Baz";
else if (i == 3)
  return "Bar";
else
  return null;

is compiled into

L0
 LINENUMBER 21 L0
 ILOAD 1
 ICONST_1
 IF_ICMPNE L1
L2
 LINENUMBER 22 L2
 LDC "Foo"
 ARETURN
L1
 LINENUMBER 23 L1
FRAME SAME
 ILOAD 1
 ICONST_2
 IF_ICMPNE L3
L4
 LINENUMBER 24 L4
 LDC "Baz"
 ARETURN
L3
 LINENUMBER 25 L3
FRAME SAME
 ILOAD 1
 ICONST_3
 IF_ICMPNE L5
L6
 LINENUMBER 26 L6
 LDC "Bar"
 ARETURN
L5
 LINENUMBER 28 L5
FRAME SAME
 ACONST_NULL
 ARETURN
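
To reproduce listings like the two above, the snippets need to live inside methods. The following is a minimal sketch with hypothetical names (`SwitchVsIf`, `viaSwitch`, `viaIf`); the labels and line numbers in your own output will differ from the ones shown in this answer.

```java
// Hypothetical wrapper class so both snippets compile and can be disassembled.
final class SwitchVsIf {
    static String viaSwitch(int i) {
        switch (i) {
            case 1: return "Foo";
            case 2: return "Baz";
            case 3: return "Bar";
            default: return null;
        }
    }

    static String viaIf(int i) {
        if (i == 1)
            return "Foo";
        else if (i == 2)
            return "Baz";
        else if (i == 3)
            return "Bar";
        else
            return null;
    }

    public static void main(String[] args) {
        // Both variants must agree on every input, matched or not.
        for (int i = 0; i <= 4; i++) {
            System.out.println(i + ": " + viaSwitch(i) + " / " + viaIf(i));
        }
    }
}
```

After `javac SwitchVsIf.java`, running `javap -c SwitchVsIf` prints the bytecode of both methods in javap's own format, which is laid out differently from the listings above but shows the same instructions.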
Jack
  • Jack thank you. This is a wonderful answer. On a side note, what did you use to view `.class` files? – James Raitsev May 05 '12 at 04:17
  • It's a plugin for Eclipse, if I remember correctly it should be this one: http://andrei.gmxhome.de/bytecode/index.html – Jack May 05 '12 at 04:20
  • 1
    @JAM: I believe you can also use [javap](http://docs.oracle.com/javase/1.5.0/docs/tooldocs/windows/javap.html). – RanRag May 05 '12 at 04:21
  • 4
    And as we all know, bytecode is the only thing that determines performance - the jit would never change the code to get better performance! Another useless »performance« optimization – Voo May 05 '12 at 08:47
  • 3
    I would say it's a very poor optimizer that doesn't optimize 2-branch `switch` into `if` given that `if`'s performance is better. – Vlad May 05 '12 at 13:36
7

Although there are minor efficiency gains when using a switch compared to using an if-statement, those gains would be negligible under most circumstances. And any source code scanner worth its salt would recognize that micro-optimizations are secondary to code clarity.

They are saying that an if statement is both simpler to read and takes up fewer lines of code than a switch statement if the switch is sufficiently short.

From the PMD website:

TooFewBranchesForASwitchStatement: Switch statements are indended to be used to support complex branching behaviour. Using a switch for only a few cases is ill-advised, since switches are not as easy to understand as if-then statements. In these cases use the if-then statement to increase code readability.

Tim Pote
  • 1
    Importance of clarity far exceeds any micro-optimizations. No doubt about that. – James Raitsev May 05 '12 at 04:08
  • 5
    The nice thing is that they state "inefficient" in the warning but they state that it is just related to "readability" in the documentation. Good coherency.. – Jack May 05 '12 at 04:17
6

Why is that?

Different sequences of instructions are used when the code is (finally) compiled to native code by the JIT compiler. A switch is implemented by a sequence of native instructions that perform an indirect branch. (The sequence typically loads an address from a table and then branches to that address.) An if / else is implemented as instructions that evaluate the condition (probably a compare instruction) followed by a conditional branch instruction.
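
The two strategies can be modelled in plain Java. This is only an analogy with hypothetical names: real JIT output is machine code, and an array lookup here merely stands in for "load an address from a table and branch to it".

```java
// Conceptual model only; the names are hypothetical and nothing here is JIT output.
final class DispatchModel {
    // Stand-in for a jump table: one slot per case value 1..3.
    private static final String[] TABLE = { "Foo", "Baz", "Bar" };

    // Models the indirect branch: one range check, then a single table load.
    static String viaTable(int i) {
        int index = i - 1;
        if (index < 0 || index >= TABLE.length) {
            return null; // the "default" target
        }
        return TABLE[index];
    }

    // Models compare-and-branch: one comparison per case until one matches.
    static String viaCompares(int i) {
        if (i == 1) return "Foo";
        if (i == 2) return "Baz";
        if (i == 3) return "Bar";
        return null;
    }
}
```

The table version does a fixed amount of work regardless of the number of cases, while the comparison version does work proportional to how far down the chain the match is.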

Why 3?

It is an empirical observation, I assume based on analysing the generated native code instructions and/or benchmarking. (Or possibly not. To be absolutely sure, you would need to ask the author(s) of that PMD rule how they derived that number.)

How do they define efficiency?

Time taken to execute the instructions.
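
If you want to measure that time yourself, a naive sketch could look like the following. All names are hypothetical, and hand-rolled timing loops like this are easily distorted by JIT warm-up and dead-code elimination, so treat the numbers as a rough hint at best; a harness such as JMH is the more trustworthy route.

```java
// Naive timing sketch with hypothetical names; not a rigorous benchmark.
final class NaiveTiming {
    // A switch with fewer than 3 branches, the shape the PMD rule flags.
    static int viaSwitch(int i) {
        switch (i) {
            case 0: return 10;
            case 1: return 20;
            default: return 30;
        }
    }

    static int viaIf(int i) {
        if (i == 0) return 10;
        else if (i == 1) return 20;
        else return 30;
    }

    public static void main(String[] args) {
        final int iterations = 50_000_000;
        long checksum = 0;

        // Warm-up pass so the JIT has a chance to compile both methods.
        for (int n = 0; n < 1_000_000; n++) {
            checksum += viaSwitch(n % 3) + viaIf(n % 3);
        }

        long t0 = System.nanoTime();
        for (int n = 0; n < iterations; n++) checksum += viaSwitch(n % 3);
        long t1 = System.nanoTime();
        for (int n = 0; n < iterations; n++) checksum += viaIf(n % 3);
        long t2 = System.nanoTime();

        System.out.println("switch: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("if:     " + (t2 - t1) / 1_000_000 + " ms");
        // Using the checksum makes it harder for the JIT to discard the loops.
        System.out.println("checksum: " + checksum);
    }
}
```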


I'd personally take issue with this rule ... or more precisely with the message. I think it should say that an if / else statement is simpler and more readable than a switch with 2 cases. The efficiency issue is secondary, and probably irrelevant.

Stephen C
1

I believe it has to do with the way that a switch and an if/else compile down.

Say it takes 5 computations to process a switch statement, and say each if statement takes 2 computations. With fewer than 3 options (that is, 2), the ifs cost 2 * 2 = 4 computations vs 5 for the switch. However, the overhead of a switch remains constant, so with 3 choices the ifs cost 3 * 2 = 6 computations vs still 5 for the switch.
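
The arithmetic can be written out as a tiny sketch. The costs of 5 and 2 are the made-up figures from this answer, not measured values, the names are hypothetical, and the model assumes the worst case in which every comparison in the if chain is evaluated.

```java
// Toy cost model using the illustrative numbers above; names are hypothetical.
final class CostModel {
    static final int SWITCH_COST = 5; // assumed constant, regardless of branch count
    static final int COST_PER_IF = 2; // assumed cost of each comparison

    // Worst-case cost of an if chain: every comparison is evaluated.
    static int ifChainCost(int branches) {
        return COST_PER_IF * branches;
    }

    // Smallest branch count at which the if chain costs more than the switch.
    static int breakEven() {
        int n = 1;
        while (ifChainCost(n) <= SWITCH_COST) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " branches: if=" + ifChainCost(n) + ", switch=" + SWITCH_COST);
        }
        System.out.println("if chain becomes more expensive at " + breakEven() + " branches");
    }
}
```

Under these assumed costs the crossover lands at 3 branches, which matches the threshold in the PMD message.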

The gains are negligible even over millions of computations. It's more a matter of "this is the better way to do it" than anything that might affect you. It would only matter in something that calls that function millions of times in quick succession.