Suppose I have a std::tuple:

std::tuple<int,int,int,int> t = {1,2,3,4};

and I want to use std::tie purely for readability, like this:

int a, b, c, d; // in real context these names would be meaningful
std::tie(a, b, c, d) = t;

vs. just using std::get<0>(t), etc.

Would GCC optimize the memory use of this tuple, or would it allocate additional space for the a, b, c, d variables?

Shafik Yaghmour
syntagma
  • why not try it? (btw, it probably would.) – The Paramagnetic Croissant May 13 '15 at 10:51
  • Related [Do temp variables slow down my program?](http://stackoverflow.com/q/26949569/1708801) – Shafik Yaghmour May 13 '15 at 11:40
  • Check out [godbolt](https://gcc.godbolt.org/). Short answer: for a simple example, yes. Summing the elements in a tuple via `tie()` or `get()` produces identical assembly. – Barry May 13 '15 at 11:47
  • Is there some reason why a question with an empirical answer which can be independently determined needs to be submitted to an Internet Q&A site? – user207421 May 13 '15 at 12:09
  • @EJP I feel like this question has more dimensions than that: understanding which optimizations are valid around relatively new C++ constructs and, even when an optimization is valid, whether one can expect the compiler to perform it. Tuples introduce some interesting trade-offs. It is somewhat outside the scope of this question, but there are some more complicated cases where the compiler does not do the obvious optimization. – Shafik Yaghmour May 13 '15 at 12:14
  • I'm voting to close this question as off-topic because it is not about solving an actual problem but about speculation. – The Paramagnetic Croissant May 13 '15 at 15:40

1 Answer

In this case I don't see any reason why not: under the as-if rule the compiler only has to emulate the observable behavior of the program. A quick experiment using godbolt:

#include <tuple>
#include <cstdio>

void func(int x1, int x2, int x3, int x4)
{
  std::tuple<int,int,int,int> t{x1,x2,x3,x4};

  int a, b, c, d; // in real context these names would be meaningful
  std::tie(a, b, c, d) = t;

  printf( "%d %d %d %d\n", a, b, c, d ) ;
}

shows that gcc does indeed optimize it away:

func(int, int, int, int):
    movl    %ecx, %r8d
    xorl    %eax, %eax
    movl    %edx, %ecx
    movl    %esi, %edx
    movl    %edi, %esi
    movl    $.LC0, %edi
    jmp printf
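
To compare the two access styles from the question directly, a minimal sketch along the following lines can be pasted into godbolt with optimization enabled. The names sum_via_tie and sum_via_get are hypothetical helpers, not part of the original experiment, and whether the two compile to the same code depends on the compiler version and flags, so check the output rather than assume it.

```cpp
#include <tuple>

// Hypothetical helper: unpack the tuple into named locals with std::tie,
// then sum the locals.
int sum_via_tie(const std::tuple<int, int, int, int>& t)
{
    int a, b, c, d;
    std::tie(a, b, c, d) = t;
    return a + b + c + d;
}

// Hypothetical helper: read each element in place with std::get<N>.
int sum_via_get(const std::tuple<int, int, int, int>& t)
{
    return std::get<0>(t) + std::get<1>(t) + std::get<2>(t) + std::get<3>(t);
}
```

Both functions return the same value for the same tuple, so any difference that shows up is purely in the generated code.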

On the other hand, if you take the address of t and print it out, we now have observable behavior which relies on t existing:

printf( "%p\n", static_cast<void*>(&t) );

and we can see that gcc no longer optimizes t away:

movl    %esi, 12(%rsp)
leaq    16(%rsp), %rsi
movd    12(%rsp), %xmm1
movl    %edi, 12(%rsp)
movl    $.LC0, %edi
movd    12(%rsp), %xmm2
movl    %ecx, 12(%rsp)
movd    12(%rsp), %xmm0
movl    %edx, 12(%rsp)
movd    12(%rsp), %xmm3
punpckldq   %xmm2, %xmm1
punpckldq   %xmm3, %xmm0
punpcklqdq  %xmm1, %xmm0
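
For reference, a complete version of the modified function might look like the sketch below. It assumes the extra printf is simply appended to the body of func from the first example; the name func_with_address is made up here to keep the two variants apart.

```cpp
#include <cstdio>
#include <tuple>

// Hypothetical variant of func: identical to the first example except for
// the final printf, which lets the address of t escape.
void func_with_address(int x1, int x2, int x3, int x4)
{
    std::tuple<int, int, int, int> t{x1, x2, x3, x4};

    int a, b, c, d;
    std::tie(a, b, c, d) = t;

    std::printf("%d %d %d %d\n", a, b, c, d);

    // Printing &t makes the object's address observable, so the compiler
    // has to give t real storage.
    std::printf("%p\n", static_cast<void*>(&t));
}
```

The printed address is implementation-specific and can vary between runs; the point is only that t must now occupy memory.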

At the end of the day you need to look at what the compiler generates and profile your code; in more complicated cases it may surprise you. Just because the compiler is allowed to do certain optimizations does not mean it will. I have looked at more complicated cases where the compiler does not do what I would expect with std::tuple. godbolt is a very helpful tool here: I cannot count how many optimization assumptions I used to have that were upended by plugging simple examples into godbolt.

Note: I typically use printf in these examples because iostreams generate a lot of code that gets in the way of the example.

Shafik Yaghmour
  • Didn't realize you could share godbolt examples. Here's [mine](https://goo.gl/ilaeZg). – Barry May 13 '15 at 11:48
  • @Barry happy you learned something new. I feel like it is an essential feature; not being able to share examples like that would make it much less useful. This is what is awesome about SO, always learning something new. – Shafik Yaghmour May 13 '15 at 12:27
  • I feel the most important information in your answer is the last paragraph. – bolov May 13 '15 at 12:28
  • I find the file produced with `-fdump-tree-optimized` more readable than asm for high-level optimizations. – Marc Glisse May 13 '15 at 12:46
  • @MarcGlisse that is a fair point. It is not as convenient to share, although we [can do it via Coliru](http://coliru.stacked-crooked.com/a/20adf4245035da2f). – Shafik Yaghmour May 13 '15 at 12:55
  • @ShafikYaghmour Fascinating! Check out the difference in output if you change the signature to be `func(std::make_tuple(10,20,50,100));` – Barry May 13 '15 at 13:26
  • @Barry indeed, that was along the lines of what I was alluding to. I had a bunch of conversations on this at cppcon last year but nothing conclusive. – Shafik Yaghmour May 13 '15 at 14:12