
I have this test program, using a #define constant:

#include <stdio.h>

#define FOO 1

int main()
{
    printf("%d\n", FOO);

    return 0;
}

When compiled with “Apple LLVM version 10.0.0 (clang-1000.11.45.5)”, I get an executable of 8432 bytes. Here is the assembly listing:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq    $16, %rsp
    leaq    L_.str(%rip), %rdi
    movl    $1, %esi
    movl    $0, -4(%rbp)
    movb    $0, %al
    callq   _printf
    xorl    %esi, %esi
    movl    %eax, -8(%rbp)          ## 4-byte Spill
    movl    %esi, %eax
    addq    $16, %rsp
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "%d\n"


.subsections_via_symbols

Now I replace #define FOO 1 with const int FOO = 1;. The executable is now 8464 bytes and the assembly listing looks like this:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq    $16, %rsp
    leaq    L_.str(%rip), %rdi
    movl    $1, %esi
    movl    $0, -4(%rbp)
    movb    $0, %al
    callq   _printf
    xorl    %esi, %esi
    movl    %eax, -8(%rbp)          ## 4-byte Spill
    movl    %esi, %eax
    addq    $16, %rsp
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __TEXT,__const
    .globl  _FOO                    ## @FOO
    .p2align    2
_FOO:
    .long   1                       ## 0x1

    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "%d\n"


.subsections_via_symbols

So it actually defined a FOO variable, making the executable 32 bytes bigger. I get the same result at the -O3 optimization level.

Why is that? I would expect the compiler to be intelligent enough to fold the constant into the code instead of taking up storage for it.

JVApen
GilDev

2 Answers


This is another case where the difference between C and C++ matters.

In C, a file-scope const int FOO has external linkage by default, so its definition must be emitted into the object file.
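To illustrate why the definition cannot simply be dropped, here is a minimal single-file sketch. In a real project the two halves would live in separate source files; the function name read_foo_elsewhere is hypothetical and only serves the illustration.

```c
/* Sketch only: in practice this definition would sit in test.c ... */
const int FOO = 1;   /* file scope, no `static`: external linkage in C */

/* ... and this function in some other source file that gets linked in. */
int read_foo_elsewhere(void)
{
    extern const int FOO;   /* refers by name to the definition above */
    return FOO;
}
```

When the compiler translates test.c on its own, it cannot know whether such a reference exists in another file, so it keeps the _FOO symbol.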

Compiling with g++ or clang++ instead gives you the desired optimization as FOO has internal linkage in C++.

You can achieve the optimization in C mode by explicitly requesting internal linkage for FOO via

static const int FOO = 1;
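As a sketch of how this looks in context (print_foo is a hypothetical helper standing in for the printf call in the question's main):

```c
#include <stdio.h>

/* `static` gives FOO internal linkage: no other translation unit can
   refer to it, so the compiler is allowed to fold the value into the
   code and omit the object entirely. */
static const int FOO = 1;

/* Hypothetical helper: prints FOO and returns printf's result,
   i.e. the number of characters written. */
int print_foo(void)
{
    return printf("%d\n", FOO);
}
```

Whether the object really disappears depends on the compiler and flags, so comparing the assembly output (for example with -S) is the way to confirm it.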

Both clang and gcc with link-time optimization enabled (-flto) also manage to strip away the unused symbol, even when linkage is external. (Live with and without LTO.)

Baum mit Augen

The fact that you use the variable FOO in your second program means that it has to live somewhere, so the compiler needs to allocate it somewhere.

In the #define case, there is no variable: the preprocessor substituted the text "FOO" with the text "1", and so the call to printf() was passed a constant value, not a variable.
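A small sketch of that difference, assuming a second name BAR and two helper functions that are not part of the question's program:

```c
#define FOO 1        /* macro: replaced by the token 1 before compilation */
const int BAR = 1;   /* object: has storage, a symbol, and an address */

/* After preprocessing this is simply `return 1;`. */
int foo_value(void)
{
    return FOO;
}

/* Taking the address works because BAR is a real object.
   `&FOO` would expand to `&1`, which does not compile. */
const int *bar_address(void)
{
    return &BAR;
}
```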

  • @Deduplicator - not quite sure what you are getting at. "foo" defined in OP's test.c needs to be defined, b/c the compiler has no idea if that test.o will be linked against some source.c which will access "foo". While the compiler was smart enough to replace the variable with a constant in the call to printf (at least that's what I think it's doing with "movl $1, %esi"), the linker is not analyzing all your code again to determine that a particular variable is unused. – dan.m was user2321368 Jan 18 '19 at 16:25
  • *"the linker is not analyzing all your code again to determine that a particular variable is unused."* It is if you tell it to. – Baum mit Augen Jan 18 '19 at 16:28
  • "It is if you tell it to" - I just saw that. As that was added in 2009, way after I moved from the C world, I was unaware. That said, the object file (and the assembly) generated is always going to have that symbol defined b/c that is the input to the linker. – dan.m was user2321368 Jan 18 '19 at 16:34