
Will (or can?) a switch statement based on a template argument be removed by the compiler?

Consider the following function, in which the template argument funcStep1 selects the function to call. Will the switch statement be removed, given that its argument is known at compile time? I tried to find out from the assembly code (given below), but I have no experience reading assembly yet, so I could not tell.

This question has now become relevant to me because the newly introduced if constexpr provides an alternative that is guaranteed to be evaluated at compile time.

template<int funcStep1>
double Mdp::expectedCost(int sdx0, int x0, int x1, int adx0, int adx1) const
{
  switch (funcStep1)
  {
    case 1:
    case 3:
      return expectedCost_exact(sdx0, x0, x1, adx0, adx1);

    case 2:
    case 4:
      return expectedCost_approx(sdx0, x0, x1, adx0, adx1);

    default:
      throw string("Unknown value (Mdp::expectedCost.cc)");
  }
}

// Explicitly instantiate the template for every value we need
template double Mdp::expectedCost<1>(int, int, int, int, int) const;
template double Mdp::expectedCost<2>(int, int, int, int, int) const;
template double Mdp::expectedCost<3>(int, int, int, int, int) const;
template double Mdp::expectedCost<4>(int, int, int, int, int) const;

Here is the output of 'objdump -D' for the above function, compiled with gcc -O2 -ffunction-sections:

1expectedCost.o:     file format elf64-x86-64


Disassembly of section .group:

0000000000000000 <.group>:
   0:   01 00                   add    %eax,(%rax)
   2:   00 00                   add    %al,(%rax)
   4:   08 00                   or     %al,(%rax)
   6:   00 00                   add    %al,(%rax)
   8:   09 00                   or     %eax,(%rax)
    ...

Disassembly of section .group:

0000000000000000 <.group>:
   0:   01 00                   add    %eax,(%rax)
   2:   00 00                   add    %al,(%rax)
   4:   0a 00                   or     (%rax),%al
   6:   00 00                   add    %al,(%rax)
   8:   0b 00                   or     (%rax),%eax
    ...

Disassembly of section .group:

0000000000000000 <.group>:
   0:   01 00                   add    %eax,(%rax)
   2:   00 00                   add    %al,(%rax)
   4:   0c 00                   or     $0x0,%al
   6:   00 00                   add    %al,(%rax)
   8:   0d                      .byte 0xd
   9:   00 00                   add    %al,(%rax)
    ...

Disassembly of section .group:

0000000000000000 <.group>:
   0:   01 00                   add    %eax,(%rax)
   2:   00 00                   add    %al,(%rax)
   4:   0e                      (bad)  
   5:   00 00                   add    %al,(%rax)
   7:   00 0f                   add    %cl,(%rdi)
   9:   00 00                   add    %al,(%rax)
    ...

Disassembly of section .bss:

0000000000000000 <_ZStL8__ioinit>:
    ...

Disassembly of section .text._ZNK3Mdp12expectedCostILi1EEEdiiiii:

0000000000000000 <_ZNK3Mdp12expectedCostILi1EEEdiiiii>:
   0:   e9 00 00 00 00          jmpq   5 <_ZNK3Mdp12expectedCostILi1EEEdiiiii+0x5>

Disassembly of section .text._ZNK3Mdp12expectedCostILi2EEEdiiiii:

0000000000000000 <_ZNK3Mdp12expectedCostILi2EEEdiiiii>:
   0:   e9 00 00 00 00          jmpq   5 <_ZNK3Mdp12expectedCostILi2EEEdiiiii+0x5>

Disassembly of section .text._ZNK3Mdp12expectedCostILi3EEEdiiiii:

0000000000000000 <_ZNK3Mdp12expectedCostILi3EEEdiiiii>:
   0:   e9 00 00 00 00          jmpq   5 <_ZNK3Mdp12expectedCostILi3EEEdiiiii+0x5>

Disassembly of section .text._ZNK3Mdp12expectedCostILi4EEEdiiiii:

0000000000000000 <_ZNK3Mdp12expectedCostILi4EEEdiiiii>:
   0:   e9 00 00 00 00          jmpq   5 <_ZNK3Mdp12expectedCostILi4EEEdiiiii+0x5>

Disassembly of section .text.startup._GLOBAL__sub_I_expectedCost.cc:

0000000000000000 <_GLOBAL__sub_I_expectedCost.cc>:
   0:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 7 <_GLOBAL__sub_I_expectedCost.cc+0x7>
   7:   48 83 ec 08             sub    $0x8,%rsp
   b:   e8 00 00 00 00          callq  10 <_GLOBAL__sub_I_expectedCost.cc+0x10>
  10:   48 8b 3d 00 00 00 00    mov    0x0(%rip),%rdi        # 17 <_GLOBAL__sub_I_expectedCost.cc+0x17>
  17:   48 8d 15 00 00 00 00    lea    0x0(%rip),%rdx        # 1e <_GLOBAL__sub_I_expectedCost.cc+0x1e>
  1e:   48 8d 35 00 00 00 00    lea    0x0(%rip),%rsi        # 25 <_GLOBAL__sub_I_expectedCost.cc+0x25>
  25:   48 83 c4 08             add    $0x8,%rsp
  29:   e9 00 00 00 00          jmpq   2e <_GLOBAL__sub_I_expectedCost.cc+0x2e>

Disassembly of section .init_array:

0000000000000000 <.init_array>:
    ...

Disassembly of section .comment:

0000000000000000 <.comment>:
   0:   00 47 43                add    %al,0x43(%rdi)
   3:   43 3a 20                rex.XB cmp (%r8),%spl
   6:   28 55 62                sub    %dl,0x62(%rbp)
   9:   75 6e                   jne    79 <_ZStL8__ioinit+0x79>
   b:   74 75                   je     82 <_ZStL8__ioinit+0x82>
   d:   20 37                   and    %dh,(%rdi)
   f:   2e 33 2e                xor    %cs:(%rsi),%ebp
  12:   30 2d 32 37 75 62       xor    %ch,0x62753732(%rip)        # 6275374a <_ZStL8__ioinit+0x6275374a>
  18:   75 6e                   jne    88 <_ZStL8__ioinit+0x88>
  1a:   74 75                   je     91 <_ZStL8__ioinit+0x91>
  1c:   31 7e 31                xor    %edi,0x31(%rsi)
  1f:   38 2e                   cmp    %ch,(%rsi)
  21:   30 34 29                xor    %dh,(%rcx,%rbp,1)
  24:   20 37                   and    %dh,(%rdi)
  26:   2e 33 2e                xor    %cs:(%rsi),%ebp
  29:   30 00                   xor    %al,(%rax)

Disassembly of section .eh_frame:

0000000000000000 <.eh_frame>:
   0:   14 00                   adc    $0x0,%al
   2:   00 00                   add    %al,(%rax)
   4:   00 00                   add    %al,(%rax)
   6:   00 00                   add    %al,(%rax)
   8:   01 7a 52                add    %edi,0x52(%rdx)
   b:   00 01                   add    %al,(%rcx)
   d:   78 10                   js     1f <.eh_frame+0x1f>
   f:   01 1b                   add    %ebx,(%rbx)
  11:   0c 07                   or     $0x7,%al
  13:   08 90 01 00 00 10       or     %dl,0x10000001(%rax)
  19:   00 00                   add    %al,(%rax)
  1b:   00 1c 00                add    %bl,(%rax,%rax,1)
  1e:   00 00                   add    %al,(%rax)
  20:   00 00                   add    %al,(%rax)
  22:   00 00                   add    %al,(%rax)
  24:   05 00 00 00 00          add    $0x0,%eax
  29:   00 00                   add    %al,(%rax)
  2b:   00 10                   add    %dl,(%rax)
  2d:   00 00                   add    %al,(%rax)
  2f:   00 30                   add    %dh,(%rax)
  31:   00 00                   add    %al,(%rax)
  33:   00 00                   add    %al,(%rax)
  35:   00 00                   add    %al,(%rax)
  37:   00 05 00 00 00 00       add    %al,0x0(%rip)        # 3d <.eh_frame+0x3d>
  3d:   00 00                   add    %al,(%rax)
  3f:   00 10                   add    %dl,(%rax)
  41:   00 00                   add    %al,(%rax)
  43:   00 44 00 00             add    %al,0x0(%rax,%rax,1)
  47:   00 00                   add    %al,(%rax)
  49:   00 00                   add    %al,(%rax)
  4b:   00 05 00 00 00 00       add    %al,0x0(%rip)        # 51 <.eh_frame+0x51>
  51:   00 00                   add    %al,(%rax)
  53:   00 10                   add    %dl,(%rax)
  55:   00 00                   add    %al,(%rax)
  57:   00 58 00                add    %bl,0x0(%rax)
  5a:   00 00                   add    %al,(%rax)
  5c:   00 00                   add    %al,(%rax)
  5e:   00 00                   add    %al,(%rax)
  60:   05 00 00 00 00          add    $0x0,%eax
  65:   00 00                   add    %al,(%rax)
  67:   00 14 00                add    %dl,(%rax,%rax,1)
  6a:   00 00                   add    %al,(%rax)
  6c:   6c                      insb   (%dx),%es:(%rdi)
  6d:   00 00                   add    %al,(%rax)
  6f:   00 00                   add    %al,(%rax)
  71:   00 00                   add    %al,(%rax)
  73:   00 2e                   add    %ch,(%rsi)
  75:   00 00                   add    %al,(%rax)
  77:   00 00                   add    %al,(%rax)
  79:   4b 0e                   rex.WXB (bad) 
  7b:   10 5e 0e                adc    %bl,0xe(%rsi)
  7e:   08 00                   or     %al,(%rax)

3 Answers


Yes, this is optimized. A few things make the assembly easier to read: demangling names (for example, _ZNK3Mdp12expectedCostILi1EEEdiiiii is the mangled form of double Mdp::expectedCost<1>(int, int, int, int, int) const), stripping comments and text, and using Intel syntax:

double expectedCost<1>(int, int, int, int, int):           # @double expectedCost<1>(int, int, int, int, int)
        jmp     expectedCost_exact(int, int, int, int, int) # TAILCALL
double expectedCost<2>(int, int, int, int, int):           # @double expectedCost<2>(int, int, int, int, int)
        jmp     expectedCost_approx(int, int, int, int, int) # TAILCALL
double expectedCost<3>(int, int, int, int, int):           # @double expectedCost<3>(int, int, int, int, int)
        jmp     expectedCost_exact(int, int, int, int, int) # TAILCALL
double expectedCost<4>(int, int, int, int, int):           # @double expectedCost<4>(int, int, int, int, int)
        jmp     expectedCost_approx(int, int, int, int, int) # TAILCALL

https://godbolt.org/z/ZtoKFH

The site linked above (Compiler Explorer) does all of this for you.

In this case I didn't provide definitions for the functions being called (expectedCost_exact and expectedCost_approx), so the compiler just leaves a jump to them. In any case, compilers are definitely smart enough to realize that each instantiation of the function template switches on a constant value.

Max Langhof
  • That website is excellent, I already almost gave up hope of ever understanding assembly! – Michiel uit het Broek Apr 17 '19 at 11:44
  • @Michiel it is indeed a very good website. And don't be intimidated by assembly. While writing good assembly is a whole different story, reading assembly a compiler wrote for you is by far not as hard as it may sound. In fact, it's actually quite simple, especially with the tooltips that tell you what each instruction does, just takes some time to get used to. It's a very important skill if you care about performance at all, which you probably do since you're using C++ ;)… – Michael Kenzel Apr 17 '19 at 11:48

The answer to your question is: Yes, any moderately useful compiler will perform dead code elimination.

if constexpr is not so much about forcing compile-time evaluation for reasons of performance. In terms of performance, there is not really going to be any difference between if constexpr and a normal if when the condition is a compile-time constant, because compilers will optimize the unused branch away either way.

What if constexpr enables is code in the inactive branch that must not be instantiated with the given template arguments (e.g., because it would be invalid in that particular case). For your switch above, the whole code will be instantiated for all cases. Only afterwards will the unused code be removed by the optimizer. if constexpr, on the other hand, guarantees that the code in the unused branch will never be instantiated to begin with. See, e.g., here for more on that…
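
To illustrate the instantiation point, here is a minimal C++17 sketch. The helper valueOf and the constant answer are hypothetical names made up for this example; they do not come from the question.

```cpp
#include <type_traits>

// Hypothetical helper (not from the question): dereferences pointers and
// passes every other type through unchanged.
template <typename T>
constexpr auto valueOf(T t)
{
    if constexpr (std::is_pointer_v<T>)
        return *t;  // only instantiated when T is a pointer type
    else
        return t;   // only instantiated when T is not a pointer type
}

constexpr int answer = 42;  // illustrative value

// Both instantiations compile because the branch that does not apply is
// discarded before it is instantiated.
static_assert(valueOf(answer) == 42);
static_assert(valueOf(&answer) == 42);
```

With a plain if in place of if constexpr, valueOf(answer) would not compile: *t is ill-formed for an int, even though an optimizer would later have removed that branch.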

Michael Kenzel

There is no switch constexpr, and there is no guarantee of branch elimination for a plain switch, even on a constexpr value (the same holds for a regular if). Still, I would expect the compiler to remove the dead branches with the proper optimization flags.

Note also that the unused branches still instantiate any templates (functions or objects) they use, whereas with if constexpr they would not.

So if you want a guarantee that only the relevant code is there, or want to avoid unneeded instantiations, use if constexpr. Otherwise, use whichever you find clearer.
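
As a rough sketch of what the if constexpr variant of the question's function could look like (C++17): the free functions below are stand-ins with made-up bodies and a single parameter, not the real Mdp members, and alwaysFalse is a hypothetical helper.

```cpp
// Stand-ins for the question's member functions; the bodies are invented
// purely so that the example is self-contained.
constexpr double expectedCost_exact(int x)  { return x * 2.0; }
constexpr double expectedCost_approx(int x) { return x * 2.5; }

// Hypothetical helper: a value-dependent "false", so the static_assert
// below only fires when that branch is actually instantiated.
template <int>
inline constexpr bool alwaysFalse = false;

template <int funcStep1>
constexpr double expectedCost(int x)
{
    if constexpr (funcStep1 == 1 || funcStep1 == 3)
        return expectedCost_exact(x);
    else if constexpr (funcStep1 == 2 || funcStep1 == 4)
        return expectedCost_approx(x);
    else
        static_assert(alwaysFalse<funcStep1>, "Unknown value of funcStep1");
}

static_assert(expectedCost<1>(3) == 6.0);
static_assert(expectedCost<2>(4) == 10.0);
```

Note the design change this assumes: an unsupported value such as expectedCost<5> becomes a compile-time error instead of the run-time throw in the original switch.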

Jarod42