
I converted the following C++ code to assembly with a high optimization level

#include <iostream>
using namespace std;

int main()
{
    float sum=0;
    for(int i = 0; i < 10; i++)
        sum += 1.0f/float(i+1);
    cout<<sum<<endl;
    return 0;
}

via

g++ -O3 -S main.cpp
g++ -O3 main.cpp && ./a.out

The result is

2.92897

But in the assembly output, I cannot see where this number comes from. There should be either a loop or (if unrolled) a final result of 2.92897, but I cannot find it in the following code:

    .file   "main.cpp"
    .section    .text.startup,"ax",@progbits
    .p2align 4,,15
    .globl  main
    .type   main, @function
main:
.LFB1561:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    movl    $_ZSt4cout, %edi
    movsd   .LC0(%rip), %xmm0
    call    _ZNSo9_M_insertIdEERSoT_
    movq    %rax, %rdi
    call    _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    xorl    %eax, %eax
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc
.LFE1561:
    .size   main, .-main
    .p2align 4,,15
    .type   _GLOBAL__sub_I_main, @function
_GLOBAL__sub_I_main:
.LFB2048:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    movl    $_ZStL8__ioinit, %edi
    call    _ZNSt8ios_base4InitC1Ev
    movl    $__dso_handle, %edx
    movl    $_ZStL8__ioinit, %esi
    movl    $_ZNSt8ios_base4InitD1Ev, %edi
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    jmp __cxa_atexit
    .cfi_endproc
.LFE2048:
    .size   _GLOBAL__sub_I_main, .-_GLOBAL__sub_I_main
    .section    .init_array,"aw"
    .align 8
    .quad   _GLOBAL__sub_I_main
    .local  _ZStL8__ioinit
    .comm   _ZStL8__ioinit,1,1
    .section    .rodata.cst8,"aM",@progbits,8
    .align 8
.LC0:
    .long   0
    .long   1074228871
    .hidden __dso_handle
    .ident  "GCC: (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0"
    .section    .note.GNU-stack,"",@progbits

I suspected .LC0 and 1074228871, but converting that value with another piece of code gives me 2.11612, which is a different number.

So, where is the calculation or the result in the assembly code?


1 Answer


The loop wasn't just unrolled, it was optimized away completely by constant-propagation. That's why main has no branching other than call.

movsd .LC0(%rip), %xmm0 (MOV Scalar Double) loads the 8-byte FP arg to cout<<sum from a static constant in .rodata, which is how most compilers deal with FP constants.

At .LC0, we find:

.LC0:
    .long   0
    .long   1074228871

These pseudo-instructions assemble to 8 bytes of data: the integer representation of the bit pattern that means 2.92897... in IEEE 754 double-precision (binary64). x86 is little-endian for FP as well as integer, so the 0 in the first (low) 4 bytes is the bottom of the significand (aka mantissa).

There's an interactive single-precision converter at https://www.h-schmidt.net/FloatConverter/IEEE754.html, but IDK of one for double where you could plug in the integer value of the bit-pattern and see it decoded as a double.

But such a conversion via another code gives me 2.11612 which is a different number.

You linked to code which type-puns the upper half of the bit-pattern to float (violating C++ pointer-aliasing rules, BTW. Use memcpy for type-punning). You'd get the right answer if you took 1074228871ULL << 32 and type-punned that to double.


clang puts asm comments on FP constants to show their value in decimal, but gcc doesn't. e.g. from the Godbolt compiler explorer: clang5.0 -O3 optimizes the loop away to the same constant, but represents it slightly differently in asm:

.LCPI0_0:
    .quad   4613777869364002816     # double 2.9289684295654297
    # exactly equivalent to what gcc emits,
    # just different syntax for the same 8 bytes

It's just bytes, and decimal integer is what gcc always does for all constants in compiler-generated asm, even though this is near useless for humans (much worse even than hex).

I'm not sure if GAS syntax even handles FP constants; NASM does. But as I said, it's all just bytes.

Peter Cordes
  • @ar2015: pointer-casting is (at least in theory) unsafe even with gcc. Use `memcpy(&d, &i, 8);` to copy a bit-pattern from a `uint64_t` to a `double`. That still optimizes away to nothing. Or (with ISO C99 or GNU C++) use a union. That kind of pointer-casting is definitely considered "wrong"; if it works it's just a "happens to work" situation. – Peter Cordes Feb 18 '18 at 07:54
  • Seems even the `union` is not safe according to section 9.5 of the [standard](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf#page=252). Also mentioned [here](https://stackoverflow.com/questions/11373203/acc) that: "In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time." – ar2015 Feb 18 '18 at 08:00
  • @ar2015: that's correct, `union` type punning in C++ is only safe as a GNU extension (gcc/clang/icc), which is why I said "GNU C++". Other implementations like MSVC also support it, I think, but ISO C++ doesn't. In C, it's supported in ISO C99. (And as a GNU extension, in C89). https://stackoverflow.com/questions/25664848/unions-and-type-punning – Peter Cordes Feb 18 '18 at 08:13
  • `union`s are so beautiful and amazing. Any reason `ISO C++` dropped its support? – ar2015 Feb 18 '18 at 08:16
  • 1
    @ar2015: They're weird with respect to constructor / destructor semantics. But for "simple" types with default constructors/destructors, IDK why writing one and reading another is undefined. But it's not like they "dropped" support; C++ never supported union type-punning. C++ "forked" from C well before C99 standardized union type-punning. (And other uses of `union` in C++ are of course still legal; you can overwrite a different type, the only thing that's UB is read not matching the previous write). – Peter Cordes Feb 18 '18 at 08:35
  • 1
    @ar2015: the standard often leaves things undefined with the intention that if implementations *want* to define it, they can. e.g. signed-integer overflow. IDK if the ISO standard maintainers agree with compiler devs that most things should be left UB and treated as optimization opportunities even on target architectures that could easily define the behaviour. Related: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html – Peter Cordes Feb 18 '18 at 08:37
  • 2
    @ar2015: IMO, C++ should add support for `reinterpret_cast(intger_variable)`, i.e. support `reinterpret_cast` for non-pointer types. That would be a clean way to express type-punning, with exactly the right semantic meaning for human readers, plus compact and clean syntax. Using a union means you have to declare extra variables. And compilers could warn if the type-sizes don't match, because you'd only ever use this for type-punning. In fact, the ISO standard could declare that it's *only* valid if the `` and the `(arg)` have the same `sizeof`. – Peter Cordes Feb 18 '18 at 08:44