3

Here is a simple C program:

#include <stdio.h>

int a = 5;

static int b = 20;

int main(){

 int c = 30;

 return 0;
}

Compiled to assembly with no optimization:

    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 13
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    xorl    %eax, %eax
    movl    $0, -4(%rbp)
    movl    $30, -8(%rbp)
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __DATA,__data
    .globl  _a                      ## @a
    .p2align    2
_a:
    .long   5                       ## 0x5



My question is: where is static int b = 20; in the above assembly? I know it is supposed to be in the global data section of memory, but I cannot find it in the compiled output.

Joe
  • 7
    Possibly nowhere? The compiler might have deleted it because you weren't using it. – user253751 Feb 12 '19 at 03:38
  • @immibis I was just thinking that too. Even with optimization turned off I thought the compiler wouldn't do that, but I don't see it, so... Joe, make them volatile and compare the assembly – Bwebb Feb 12 '19 at 03:39
  • I have compiled it with no optimization enabled – Joe Feb 12 '19 at 03:39
  • 4
    "No optimization enabled" doesn't mean "make no transformations to my code." Just `return b` from `main` and you'll see it show up. – Carl Norum Feb 12 '19 at 03:40
  • @CarlNorum in fact it did, thank you. I thought no optimization meant don't touch the code and compile as is – Joe Feb 12 '19 at 03:42
  • @Joe even at `-O0` there are some transformations done like [constant divisions by power-of-two value](https://stackoverflow.com/q/2580680/995714) – phuclv Feb 12 '19 at 03:46
  • 2
    If the compiler didn't touch the code it would output C code instead of assembly and it wouldn't be any more useful than `cat`. Anyway, apparently even `-O0` does *some* optimizations that are really simple and obvious for the compiler to do. – user253751 Feb 12 '19 at 04:00
  • 1
    @Joe What would be the assembly code that corresponds to compiling as is the number 20, not used in any way? – David Schwartz Feb 12 '19 at 04:16
  • 1
    @DavidSchwartz: It has `_a: .long 5` for the unused global variable. The OP is looking at the full assembly-language output of the compiler, not *just* the machine code mnemonics + operands. Data has to be assembled into (usually other sections of) the output file as well, if present in the asm source. – Peter Cordes Feb 12 '19 at 04:43

4 Answers

7

Your code doesn't use b, and it's file-scoped so nothing in other files can use it. GCC doesn't bother to emit a definition for it.

To answer the title question:
A non-const static or global variable (i.e. one with static storage duration) with a non-zero initializer will go in .section .data, as opposed to .bss (zero-initialized mutable data), or .rdata (Windows) / .rodata (Linux) for read-only data.
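
As a rough illustration of that mapping, here is a minimal sketch. The variable and function names are made up for this example, and the section names in the comments are what GCC and clang typically choose on Linux/ELF. Other targets use different names (the Mach-O output in the question uses __DATA,__data), and nothing in the C standard requires this placement.

```c
/* Hypothetical names; comments describe typical GCC/clang output on Linux/ELF. */
int init_global = 5;              /* non-zero initializer, mutable: .data */
static int init_static = 20;      /* .data as well; kept because sum_all() reads it */
static int zero_static;           /* zero-initialized, mutable: .bss */
static const int read_only = 42;  /* read-only: .rodata when it is emitted as data */

/* Reads every variable so that none of them is dropped as unused. */
int sum_all(void)
{
    return init_global + init_static + zero_static + read_only;
}
```

Compiling this with `gcc -S -O0` and searching the output for each name should show the section directive that precedes it.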


gcc doesn't have a fully braindead mode that transliterates to asm naively. See "Disable all optimization options in GCC": GCC always has to transform through its internal representations.

GCC always does a pass that leaves out unused stuff even at -O0. There might be a way to disable that, unlike some of the other transformations gcc does even at -O0.
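
This does not disable that pass, but as a per-variable sketch: GCC and clang support `__attribute__((used))`, a compiler extension (not standard C) that asks the compiler to emit an object even when nothing appears to reference it. Simply reading the variable from a function that is itself emitted has the same effect. The names below are invented for illustration.

```c
/* Compiler extension: keeps the definition even though nothing references it. */
static int b_kept __attribute__((used)) = 20;

/* A plain file-scope static; it survives only because read_b() uses it. */
static int b_read = 20;

int read_b(void)
{
    return b_read;
}
```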

gcc and clang -O0 compile each statement to a separate block of asm that stores/reloads everything (for consistent debugging), but within that block gcc still applies its standard transformations, like (x+y) < x becoming y<0 for signed x and y with gcc8 and newer, or x / 10 into a multiply + shift of the high half. (Why does GCC use multiplication by a strange number in implementing integer division?).
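
To look at these transformations yourself, small functions like the following can be compiled with -O0 and the asm inspected. This is only a sketch with made-up function names; the exact instructions depend on the compiler version and target.

```c
/* Division by a constant: per the text above, gcc uses a multiply and shift
   of the high half here instead of a division instruction. */
int div10(int x)
{
    return x / 10;
}

/* Signed overflow is undefined behavior in C, which is what allows a
   compiler to rewrite this comparison as y < 0. */
int sum_is_less(int x, int y)
{
    return (x + y) < x;
}
```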

And code inside if(false) is removed by gcc even at -O0, so you can't jump to it in GDB.
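
A minimal sketch of that case, using a made-up function name and a literal 0 for the condition so that no header is needed for `false`:

```c
int always_one(void)
{
    int result = 1;
    if (0) {           /* constant-false condition */
        result = 2;    /* dead code: per the text above, gcc emits nothing for it */
    }
    return result;
}
```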

Some people care about runtime performance of debug builds, especially developers of real-time software like games or operating systems that's not properly testable if it runs too slowly. (Human interaction in games, or maybe device drivers in OSes.)


Some other compilers are more braindead at -O0, so you often do see asm that looks even more like the source expressions. I think I've seen MSVC without optimization emit instructions that did mov-immediate into a register, then cmp reg,imm, i.e. a branch at runtime that depends only on immediates, and thus could trivially have been computed at compile time within that expression.

And of course there are truly non-optimizing compilers whose entire goal is just to transliterate with fixed patterns. For example, the Tiny C Compiler is, I think, pretty much one-pass, and emits asm (or machine code) as it goes along. The question "Tiny C Compiler's generated code emits extra (unnecessary?) NOPs and JMPs" shows just how simplistic it is: it always emits a sub esp, imm32 in function prologues, and only comes back to fill in the immediate at the end of the function once it knows how much stack the function needs. Even if the answer is zero, it can't remove the instruction and tighten up the code.


It's usually more interesting to look at optimized asm anyway. Write functions that take args and return a value, so you can see the interesting part of the asm without a lot of boilerplate and store/reload noise. See "How to remove "noise" from GCC/clang assembly output?"
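
For instance, instead of assigning constants to locals in main, a sketch like the following gives the compiler real inputs and a result it cannot discard. The function, parameter, and variable names are made up for illustration.

```c
/* External linkage: other files could modify it, so the compiler has to
   emit the definition and load the value at run time. */
int scale = 20;

/* Inputs arrive as arguments and the result is returned, so the
   computation cannot be removed as dead code. */
int scale_and_add(int x, int y)
{
    return x * scale + y;
}
```

Compiled with optimization enabled, a function like this usually comes out as only a handful of instructions, which makes the load from the variable easy to spot.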

Peter Cordes
  • 1
    Technically, MSVC's `/Od` is the same as `-O0` in GCC and Clang. It's designed to optimize code for debugging, so you can do things like set breakpoints on individual lines of C code. It doesn't elide branches, but it does do basic compile-time arithmetic folding. See, for example, [this code](https://gcc.godbolt.org/z/B0poJA). It knows that the conditional always evaluates to true, so it doesn't actually do the comparison of the constants, but it does emit code that does a test-and-branch so you can set a breakpoint on the `if`. And it doesn't ever elide "dead" code. – Cody Gray - on strike Feb 13 '19 at 07:03
3

If a static variable hasn't been optimized out by the compiler, it will go in the process' default data section.

In hand-written assembly, the programmer normally controls this directly, by placing the variable in the part of the file that declares the data section.

The C Standard says in § 6.2.4 paragraph 3:

An object whose identifier is declared ... with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.

With the following code:

static int a = 100;

int foo()
{
    return (a / 2);
}

Look at how the symbol _a appears in the _DATA segment for MSVC, lines 27-30 for GCC, and lines 28-30 for Clang.

Govind Parmar
0

The whole question is a bit inaccurate... (rereading it, you are actually very specific about "in the assembly above"... oh well, then the answer is "nowhere", and the rest of my answer is for a question that was not posted, but it hopefully explains why "nowhere" is the answer to your question).

You have C source, and then you show some assembly as compiler output (but you don't specify compiler) and then you ask about Assembly...

C is defined in terms of the "C abstract machine", while you are looking at a particular x86-64 implementation of that abstract machine.

While that implementation does have some conventions for where static variables usually end up, how they are implemented depends entirely on the compiler.

In pure Assembly (hand-written, or from the CPU's point of view) there's no such thing as a "static value". You have only registers, memory and peripherals.

So in Assembly (machine code) you can use a certain register or a certain part of memory as a static variable, whichever suits your needs better. There is no hard rule that forces you to do it in any particular way, except that you must express your idea in valid machine code for the target CPU; that usually leaves billions of possibilities, and even when constraining yourself to only the "reasonable" ones, it's still closer to tens of possible ways than to a single one.

You can (in x86-64) even create a somewhat convoluted scheme that keeps the value as code state (the "part of memory" is then the memory occupied by the machine code), i.e. the value would not be written directly in memory as data; instead the code would follow certain code paths (out of many possible) to obtain the correct final result, encoding the value in the code itself. For example, there is a way to compile C source into x86-64 machine code using only the mov instruction (which is Turing-complete), and it may not use memory for static variables at all (I'm not sure whether it adds a .data section or avoids it by compiling the data into mov code too, but its sheer existence should make it quite obvious how .data can theoretically be avoided).

So you are either asking how a particular C compiler with particular compile-time options implements static values (and that may have some variants depending on the source and options used)...

... or, if you are really asking "where are static values stored in assembly", then the answer is "anywhere you wish, as long as your machine code is valid and correct". The whole "static value" concept is of a higher level than the CPU operates at, so calling something "the static value" is an interpretation of the purpose of a particular piece of machine code; there is no specific instruction or support in the CPU to handle it.

Ped7g
-3

Static variables are not stored in the memory. They will appear only when used. For example:

static int b = 20; c = c + b;

will compile to

add c, '20'

  • 1
    You're thinking of `static const`. Without `const`, yes this optimization is possible if nothing in the file writes the variable, but that's not true in the general case, and gcc doesn't do it without optimization enabled: https://godbolt.org/z/LR5Uw7. Note the `mov edx, DWORD PTR b[rip]`. Also, `'20'` is a 2-byte character literal, with the ASCII codes for `'2'` and `'0'`. i.e. `0x3032` because x86 is little-endian. – Peter Cordes Feb 12 '19 at 04:08