struct X {
  void * a;
  void * b;
};

X foo( void * u, void * v);
  • foo() is implemented in assembler (i386)
  • address of return value of type X is passed as hidden parameter to foo()

  • if test code is compiled with -O0 the code works as expected

  • if compiled with -O3 segmentation fault happens (return value was optimized out)
  • if compiled with -O3 -fno-elide-constructors the code works as expected again
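The hidden-parameter convention mentioned in the list above can be pictured in plain C++. This is only a sketch, assuming the usual i386 behaviour in which the caller reserves the storage for the returned struct and passes its address to the callee. `foo_lowered` and `call_foo` are hypothetical names, and the body of `foo_lowered` is a stand-in for the real assembler routine, which is not shown in the question.

```cpp
#include <cassert>

struct X {
    void* a;
    void* b;
};

// Hypothetical lowered form of `X foo(void* u, void* v)`: the address of
// the result object arrives as an explicit first argument.
// Stand-in body only; the real foo() is written in assembler.
void foo_lowered(X* result, void* u, void* v) {
    result->a = u;
    result->b = v;
}

// Roughly what a caller does for `X x = foo(u, v);` under this convention.
X call_foo(void* u, void* v) {
    X x;                    // caller reserves storage for the result
    foo_lowered(&x, u, v);  // and hands its address to the callee
    return x;
}
```

Calling `call_foo(p, nullptr)` yields an `X` whose `a` member equals `p`, which is the behaviour the test code in the question relies on.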

How can the compiler be forced not to apply RVO for foo() only (i.e. without forcing -fno-elide-constructors globally)?

Update1: the code must work for arbitrary compilers (at least gcc, clang, msvc). Example code:

void * vp = bar();
X x = foo( vp, 0);
x = foo( x.a, 0);
x = foo( x.a, 0);

Update2: the problem is that the compiler optimizes out the instances of x. Whether the code is written as

X x = foo( vp, 0);
x = foo( x.a, 0);
x = foo( x.a, 0);

or

X x1 = foo( vp, 0);
X x2 = foo( x1.a, 0);
X x3 = foo( x2.a, 0);

doesn't matter. For instance, the segfault happens because, in

X x2 = foo( x1.a, 0);

x1 was optimized out and the implementation tries to access the first argument, which is a null pointer.

xlrg
  • There is no general way to do it; you should state which compiler you use. Here is how to do it with gcc: https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html – fghj Nov 02 '15 at 10:23
  • If `foo` is implemented in assembly, then `-fno-elide-constructors` shouldn't have any effect for it. `-fno-elide-constructors` affects how C++ is translated to assembly. If you don't have C++ code for it in the first place, there is no assembly generation to modify. Can you edit your question to provide a [mcve] -- that is, your assembly `foo` function and a minimal program that calls that function, that segfaults at `-O3` the way you describe? –  Nov 02 '15 at 11:07
  • @hvd: foo() is too complicated to present here – xlrg Nov 02 '15 at 11:14
  • Can you at least tell us why RVO creates a segfault? Seems weird. – David Haim Nov 02 '15 at 11:25
  • The problem must be somewhere else. If the compiler isn't the one that also compiles `foo`, and moreover if it doesn't have access to the source of `foo` or the IR of `foo` (LTO) at the place of the call, then it just cannot apply RVO. It is impossible, as it would have to modify the memory location where `X` is created. Inspect the code further; the problem is not RVO, at least not with the code provided. – bolov Nov 02 '15 at 12:11
  • Moreover, when the compiler doesn't have access to a function (at the point of calling it) it is forced to follow the function's calling convention. – bolov Nov 02 '15 at 12:18
  • OK, bar() creates a stack, foo switches between two stacks. X holds the stack pointer of the previously suspended stack (+ data). For x86_64 it works; i386 causes a segfault because the returned X instances are optimized out. – xlrg Nov 02 '15 at 13:28
  • "bar() creates a stack, foo switches between two stacks ..." might it be that your program is over-engineered? – Kijewski Nov 02 '15 at 14:32
  • @kay: it is a replacement for ucontext (deprecated POSIX standard), useful for async ops – xlrg Nov 02 '15 at 14:51
  • @bolov, David: I don't know why -fno-elide-constructors fixes the problem (note: the segfault happens with optimization enabled). Inside foo() I access the address of X passed as the hidden first argument to foo(), which is how i386 returns a struct. With optimization enabled, that address contains a null pointer. – xlrg Nov 02 '15 at 16:10

1 Answer


You can set the optimization level for a single function in GCC, too:

X foo(void *u, void *v) __attribute__((optimize("no-elide-constructors")));

The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line. Arguments can either be numbers or strings. Numbers are assumed to be an optimization level. Strings that begin with O are assumed to be an optimization option, while other options are assumed to be used with a -f prefix. You can also use the ‘#pragma GCC optimize’ pragma to set the optimization options that affect more than one function. See Function Specific Option Pragmas, for details about the ‘#pragma GCC optimize’ pragma.

This can be used for instance to have frequently-executed functions compiled with more aggressive optimization options that produce faster and larger code, while other functions can be compiled with less aggressive options.

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html

You can also try the #pragma variant:

#pragma GCC push_options
#pragma GCC optimize ("no-elide-constructors")

X foo(void *u, void *v);

#pragma GCC pop_options
Kijewski
  • The code must be compiler independent; sorry that I failed to mention that. – xlrg Nov 02 '15 at 10:55
  • Is `X` [*Plain Old Data*](http://stackoverflow.com/q/146452/416224)? Can you tamper with the code of `foo` and make `X` a "proper" argument? – Kijewski Nov 02 '15 at 11:04
  • @xlrg, besides my earlier questions, did the answer work in GCC or was it altogether wrong? – Kijewski Nov 02 '15 at 14:29
  • It works for GCC, but Clang seems not to support the optimize attribute. – xlrg Nov 02 '15 at 14:41
  • Please try variant with `#pragma GCC optimize ("no-elide-constructors")` (just added to my answer). – Kijewski Nov 02 '15 at 14:46
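A compiler-independent direction, following the suggestion in the comments above to make `X` a "proper" argument, might look like the sketch below. It assumes the assembler routine can be changed to take the result address explicitly. `foo_raw` is a hypothetical name, and its C++ body here is only a stand-in so the example compiles; in the real program it would be the assembler implementation. Whether this removes the segfault described in the question is not established, since the comments dispute that RVO is the cause.

```cpp
#include <cassert>

struct X {
    void* a;
    void* b;
};

// Hypothetical explicit-output entry point. With C linkage and an ordinary
// pointer argument, every compiler follows the plain calling convention,
// so no hidden struct-return parameter is involved.
// Stand-in body; the real routine would be written in assembler.
extern "C" void foo_raw(X* out, void* u, void* v) {
    out->a = u;
    out->b = v;
}

// Thin C++ wrapper that keeps the original call syntax `X x = foo(u, v);`.
inline X foo(void* u, void* v) {
    X result;
    foo_raw(&result, u, v);
    return result;
}
```

The calling code from the question (`X x = foo(vp, 0); x = foo(x.a, 0);`) stays unchanged; only the boundary to the assembler routine becomes explicit.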