2

When initializing variables in C, I was wondering: does the compiler make it so that the value is already set when the code is loaded at run time, or does it have to emit an explicit assembly instruction to set the initial value?

The first one would be slightly more efficient because you're not executing an extra CPU instruction just to set the value.

e.g.

void foo() {
  int c = 1234;
}
Archimedes Trajano
  • 1
    Yes, for static (global) variables, not for auto (stack) variables. But still, it's so fast that it doesn't matter. – Constantine Georgiou Jun 12 '19 at 18:15
  • 1
    It still needs to load the address of the data into the right memory location or copy the data to said memory location, so `mov` instructions are needed anyway, even if the actual data is stored in the executable. – ForceBru Jun 12 '19 at 18:16
  • Are you talking about automatic or static variables? – melpomene Jun 12 '19 at 18:17
  • https://godbolt.org/z/Zf3Hjr; As you can see, there's a `mov` instruction, but the value `1234` is stored right in the executable as an immediate value. – ForceBru Jun 12 '19 at 18:25
  • 1
    Your example [compiles to nothing](https://godbolt.org/z/HgJyxp), i.e. it's just `foo: ret` (if you enable optimizations). – melpomene Jun 12 '19 at 18:26
  • 1
    [Near duplicate.](https://stackoverflow.com/questions/56530702/where-does-initialized-auto-variables-local-variables-placed-in-object-file) – Eric Postpischil Jun 12 '19 at 20:08

4 Answers

4

A compiler is not required to do either of them. As long as the behavior of the program stays the same it can pretty much do whatever it wants.

Especially when using optimization, crazy stuff can happen. Looking at the assembly code after heavy optimization can be confusing to say the least.

In your example, both the constant 1234 and the variable c would be optimized away since they are not used.
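As a rough illustration of the "as long as the behavior stays the same" rule, here is a minimal sketch. The names `scaled`, `forced_store` and `check_as_if` are made up for this example and are not from the question. In the first function the initializer has no observable effect of its own, so a compiler is free to fold 1234 into the multiplication. In the second, `volatile` makes every access to `c` observable, so compilers generally emit a real store followed by a load.

```c
#include <assert.h>

/* Hypothetical example: c is only an ingredient of the result, so the
   compiler may fold 1234 into the arithmetic and never store it. */
int scaled(int d)
{
    int c = 1234;
    return c * d;
}

/* Hypothetical example: accesses to a volatile object count as
   observable behavior, so the initializing store is not supposed
   to be elided. */
int forced_store(void)
{
    volatile int c = 1234;
    return c;
}

/* Whatever code is generated, the results must be the same. */
void check_as_if(void)
{
    assert(scaled(2) == 2468);
    assert(forced_store() == 1234);
}
```

Comparing the two functions in Compiler Explorer with optimization enabled is one way to see the difference on a particular compiler; the exact instructions are not guaranteed.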

klutt
3

If it's a variable with static lifetime (static storage duration), it'll typically become part of the executable's static image, which gets copied or mapped, along with other statically known data, into the process's memory when the process is loaded.

void take_ptr(int*);

void static_lifetime_var(void)
{
    static int c = 1234;
    take_ptr(&c);
}

x86-64 assembly from gcc -Os:

static_lifetime_var:
        mov     edi, OFFSET FLAT:c.1910
        jmp     take_ptr
c.1910:
        .long   1234

If it's unused, it'll typically vanish:

void unused(void)
{
    int c = 1234;
}

x86-64 assembly from gcc -Os:

unused:
        ret

If it is used, it may not be necessary to put it into the function's frame (its local variables on the stack)—it might be possible to directly embed it into an assembly instruction, or "use it as an immediate":

void take_int(int);

void used_as_an_immediate(int d) 
{
  int c = 1234;
  take_int(c*d);
}

x86-64 assembly from gcc -Os:

used_as_an_immediate:
        imul    edi, edi, 1234
        jmp     take_int

If it is used as a true local, its value will need to be stored into stack-allocated space:

void take_ptr(int*);

void used(int d)
{
  int  c = 1234;
  take_ptr(&c);
}

x86-64 assembly from gcc -Os:

used:
        sub     rsp, 24
        lea     rdi, [rsp+12]
        mov     DWORD PTR [rsp+12], 1234
        call    take_ptr
        add     rsp, 24
        ret

When pondering these things, Compiler Explorer and some basic knowledge of assembly are your friends.

Petr Skocik
0

TL;DR: Your example declares and initializes an automatic variable. It has to be initialized each time the function is called, so there will be some instruction to do this.


Adapted from my answer to "How compile time initialization of variables works internally in c?":

The standard does not define exactly how initialization is carried out. It depends on the environment in which your code is developed and run.

How variables are initialized also depends on their storage duration. You didn't mention it in the text, but your example uses an automatic variable. (It is most probably optimized away, as commenters point out.)

Initialized automatic variables will be written each time their declaration is reached. The compiled program executes some machine code for this to happen.

Static variables are always initialized, and only once, before program startup.
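The difference between the two storage durations can be sketched as follows. The function names `bump_auto`, `bump_static` and `check_bump` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>

/* Automatic: n is set to 10 again every time the declaration is reached. */
int bump_auto(void)
{
    int n = 10;
    n++;
    return n;
}

/* Static: n gets the value 10 once, before program startup, and keeps
   whatever value it has between calls. */
int bump_static(void)
{
    static int n = 10;
    n++;
    return n;
}

/* Usage sketch: the static counter advances, the automatic one does not. */
void check_bump(void)
{
    int first = bump_static();
    assert(bump_static() == first + 1);  /* value persisted across calls */
    assert(bump_auto() == 11);
    assert(bump_auto() == 11);           /* re-initialized on each call */
}
```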

Examples from the real world:

Most (if not all) PC systems store the initial values of static variables that are explicitly initialized to nonzero values in a special section called data, which the system's loader loads into RAM. That way those variables get their values before program startup. Static variables that are not explicitly initialized, or that are initialized with zero-like values, are placed in a section called bss and are filled with zeroes by the startup code before program startup.
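A small sketch of which variables usually end up where. The variable and function names are invented for this example, and the section placement in the comments is what common ELF toolchains typically do, not something the C standard promises.

```c
#include <assert.h>

int counter_start = 1234;   /* nonzero initializer: usually placed in data */
int scratch;                /* no initializer: usually placed in bss, reads as 0 */
static int hits = 0;        /* explicit zero: usually placed in bss as well */

/* Accessor so the file-local variable can be inspected from outside. */
int read_hits(void)
{
    return hits;
}

/* All three variables already hold their values when the program starts;
   no statement in the program itself has assigned them. */
void check_static_init(void)
{
    assert(counter_start == 1234);
    assert(scratch == 0);
    assert(read_hits() == 0);
}
```

On a toolchain with binutils, inspecting the object file's symbol table or section sizes is one way to check the actual placement for a given build.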

Many embedded systems have their program in non-volatile memory that can't be changed. On such systems the startup code copies the initial values of the section data into its allocated space in RAM, producing a similar result. The same startup code also zeroes the section bss.
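What such startup code does can be sketched in portable C. This is a host-side simulation under stated assumptions: on a real target the region addresses and sizes would come from linker-script symbols, whose names vary by toolchain, so here they are ordinary parameters. `startup_init` and `check_startup` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulated startup step: copy the stored initial values into the RAM
   area reserved for data, then zero-fill the RAM area reserved for bss. */
void startup_init(unsigned char *ram_data, const unsigned char *rom_image,
                  size_t data_size, unsigned char *ram_bss, size_t bss_size)
{
    memcpy(ram_data, rom_image, data_size);
    memset(ram_bss, 0, bss_size);
}

/* Usage sketch with small arrays standing in for flash and RAM. */
void check_startup(void)
{
    /* The bytes of the 32-bit value 1234 (0x000004D2), little-endian. */
    const unsigned char rom[4] = { 0xD2, 0x04, 0x00, 0x00 };
    unsigned char data[4] = { 0xFF, 0xFF, 0xFF, 0xFF };  /* stale RAM contents */
    unsigned char bss[4]  = { 0xAA, 0xAA, 0xAA, 0xAA };  /* stale RAM contents */

    startup_init(data, rom, sizeof data, bss, sizeof bss);

    assert(memcmp(data, rom, sizeof rom) == 0);
    assert(bss[0] == 0 && bss[3] == 0);
}
```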

Note 1: The sections don't have to be named like this, but it is common.

This startup code might be part of the compiled program, or might not. It depends, see above. But in terms of efficiency it doesn't matter which program initializes the variables. It just has to be done.

Note 2: There are more kinds of storage duration, please see chapter 6.2.4 of the standard.

As long as the standard is met, a system is free to implement any other kind of initialization, including writing the initial values into their variables step by step.

the busybee
-8

Firstly, it's important to have a common understanding of the word 'compiler'; otherwise we can end up arguing endlessly.

In simple words,

a compiler is a computer program that translates computer code written in one programming language (the source language) into another programming language (the target language). The name compiler is primarily used for programs that translate source code from a high-level programming language to a lower level language (e.g., assembly language, object code, or machine code) to create an executable program.

(Ref: Wikipedia)

With this common understanding, let's now answer your question. The answer is 'yes, the final code contains an explicit assembly instruction to set it to an initial value' for any kind of variable. This is because the variables are ultimately either stored in some memory location or live in a CPU register, when there are few enough variables to fit into registers, as with your code snippet running on, say, most modern servers. (Side note: different systems have different numbers of registers.)

For the variables that are stored in registers, there has to be a mov (or equivalent) instruction to load the initial value into the register. Without such an instruction the assigned register cannot receive the intended value.

For the variables that are stored in memory, depending on the architecture and the compiler efficiency, the init value has to be somehow pushed to the given/assigned address, which takes at least a couple of asm instructions.

Does this answer your question?

Community
ARD
  • "Yes" is not an answer to an either/or question. – melpomene Jun 12 '19 at 18:37
  • Who's asking what a compiler is? In any case, a compiler may put a value into a known location and reference that location in the code--no moving or loading beyond loading the program required. – Dave Newton Jun 12 '19 at 18:40
  • @melpomene: thanks for your good remark. I've edited my response based on that. Could you please check? – ARD Jun 12 '19 at 19:01
  • @DaveNewton, yes, a common agreement on the word 'compiler' is essential because too often new programmers confuse compilers with some magical software that runs/executes computer programs. If you don't believe it, then take up teaching programming to those who are learning their first programming language. As an expert, could you please show examples of all the systems where C is used to show that the processor directly dereferences some memory location, without loading the memory address into some register? loading address + dereferencing = 2 instructions (now re-read my answer) – ARD Jun 12 '19 at 19:05
  • @melpomene, could you please elaborate why it is wrong? – ARD Jun 12 '19 at 19:06
  • It is imprecise, wrong on many points, partly unrelated, and it is very hard to understand what you mean. Deserves lots of dv – 0___________ Jun 12 '19 at 20:13
  • @P__J__40, simply making some statement that "it's difficult to understand" and then listing your conclusion saying that it deserves 'lots of dv i.e. down votes" doesn't help anyone other than probably you. If you didn't understand something then please ask a clarification question through comments, – ARD Jun 12 '19 at 20:58
  • 1
    @ARD I already teach programming. The OP is asking a question about compilers; IMO it's clear they already know what one is. You specifically said: "the init value has to be somehow pushed to the given/assigned address [...]". This is not always the case; values can be compiled into an executable directly with zero initialization required. – Dave Newton Jun 12 '19 at 21:10
  • @ARD I.e., there does not *need* to be an explicit assembly call to set the initial value of a variable, even if there often is. *Using* that variable may require additional instructions (although that may be intra-CPU). – Dave Newton Jun 12 '19 at 21:12
  • @newton, please do start providing real-world examples or references in support of the generic points that you make about C; otherwise it's like having a debate on some speculative scientific question such as "can Earth's orbit be shifted a bit further when the sun goes into red giant mode? Yes/no, please explain", in which you DV me for no substantiated reason, and I upvote you for a very right reason – ARD Jun 12 '19 at 21:36