42

In C++, I know that static and global objects are constructed before the main function runs. But as you know, in C there is no such kind of initialization procedure before main.

For example, in my code:

int global_int1 = 5;
int global_int2;
static int static_int1 = 4;
static int static_int2;

  • When are these four variables initialized?
  • Where are values for initialization like 5 and 4 stored during compilation? How to manage them when initialization?

EDIT:
Clarification of 2nd question.

  • In my code I use 5 to initialize global_int1, so how can the compiler assign 5 to global_int1? For example, maybe the compiler first stores the value 5 somewhere (e.g. in a table) and fetches it when initialization begins.
  • As to "How to manage them when initialization?", it is really vague, and I myself do not know how to interpret it yet. Sometimes it is not easy to explain a question. Overlook it, since I have not mastered the question fully yet.
Zachary
  • All four of your variables have static storage class. – Kerrek SB Jul 22 '13 at 08:48
  • @KerrekSB How does `static storage class` relate to my question? – Zachary Jul 22 '13 at 08:49
  • The storage class determines the initialization behaviour. – Kerrek SB Jul 22 '13 at 08:54
  • Similar interesting discussion here http://stackoverflow.com/questions/898432/how-is-static-variable-initialization-implemented-by-the-compiler?rq=1 – Anand Rathi Jul 22 '13 at 08:56
  • I know that there *is* an initialisation procedure before `main` in C, since the C specification says there is (see C99 5.1.2/1). – Mike Seymour Jul 22 '13 at 09:04
  • Could you clarify what you mean by "How to manage them when initialization"? As a programmer, there's nothing you have to do beyond defining them, and providing an initialiser if needed. – Mike Seymour Jul 22 '13 at 09:20
  • @MikeSeymour The C99 5.1.2/1, indeed says *All objects with static storage duration shall be initialized (set to their initial values) before program startup. The manner and timing of such initialization are otherwise unspecified.* But unlike C++ which really defines a function named `_start` under `gcc C` – Zachary Jul 22 '13 at 09:33
  • @KerrekSB How does the static storage class determine the initialization process? – Zachary Jul 22 '13 at 12:43
  • @MikeSeymour C only allows what C++ calls static initialization. C++ does allow dynamic initialization to occur after entering `main`, officially, but all existing compilers do initialize static variables before `main`, and there is a fairly large body of existing code which depends on this. – James Kanze Jul 22 '13 at 13:08

4 Answers

28

By static and global objects, I presume you mean objects with static lifetime defined at namespace scope. When such objects are defined with local scope, the rules are slightly different.

Formally, C++ initializes such variables in three phases:

1. Zero initialization
2. Static initialization
3. Dynamic initialization

The language also distinguishes between variables which require dynamic initialization, and those which require static initialization: all static objects (objects with static lifetime) are first zero initialized, then objects with static initialization are initialized, and then dynamic initialization occurs.

As a simple first approximation, dynamic initialization means that some code must be executed; typically, static initialization doesn't. Thus:

extern int f();

int g1 = 42;    //  static initialization
int g2 = f();   //  dynamic initialization

Another approximation would be that static initialization is what C supports (for variables with static lifetime), and dynamic initialization is everything else.

How the compiler does this depends, of course, on the initialization, but on disk based systems, where the executable is loaded into memory from disk, the values for static initialization are part of the image on disk, and loaded directly by the system from the disk. On a classical Unix system, global variables would be divided into three "segments":

text:
The code, loaded into a write protected area. Static variables with `const` types would also be placed here.
data:
Static variables with static initializers.
bss:
Static variables with no initializer (C and C++) or with dynamic initialization (C++). The executable contains no image for this segment, and the system simply sets it all to `0` before starting your code.

I suspect that a lot of modern systems still use something similar.
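As a rough illustration of the three segments above, the declarations below are annotated with where a classical Unix toolchain would typically place each one. The variable names and the `next_id` helper are made up for this sketch, and the placement comments describe common behaviour, not anything the language guarantees.

```cpp
// Typical placement on a classical Unix toolchain. This is an assumption
// for illustration; neither the C nor the C++ standard specifies it.

const int table_size = 64;   // const, static initializer: usually read-only, next to the code
int       counter    = 5;    // non-zero static initializer: data (value is in the executable image)
int       scratch;           // no initializer: bss (zero-filled by the system at load time)

// Hypothetical helper, present only to give first_id a run-time initializer.
int next_id() { return counter + 1; }

// Dynamic initialization (C++ only): usually bss, then assigned by startup
// code, unless the compiler manages to evaluate the call at compile time.
int       first_id   = next_id();
```

On systems that follow this layout, the `size` utility reports the text, data, and bss sizes of the resulting executable, and `nm` marks each symbol with the kind of section it landed in.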

EDIT:

One additional remark: the above refers to C++03. For existing programs, C++11 probably doesn't change anything, but it does add `constexpr` (which means that initializers calling some user-defined functions can still be static initialization) and thread-local variables, which open up a whole new can of worms.
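A minimal C++11 sketch of the `constexpr` point, with hypothetical names (`square`, `area`, `Probe`): because `square(6)` is a constant expression, `area` is initialized statically, so an object whose constructor runs during dynamic initialization already sees the final value, even though it is defined earlier in the file.

```cpp
// C++11 sketch; every name here is made up for illustration.

constexpr int square(int x) { return x * x; }

extern int area;             // defined further down

// Probe's constructor runs during dynamic initialization and records
// the value `area` holds at that moment.
struct Probe {
    int seen;
    Probe() : seen(area) {}
};

Probe probe;                 // dynamic initialization (constructor call)
int area = square(6);        // constant expression, so static initialization
```

Here `probe.seen` ends up as 36 rather than 0, since static initialization of `area` is complete before any dynamic initialization begins.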

James Kanze
  • Thx James. Your reply is really wonderful. In the last paragraph, you mentioned dynamic initialization is located in .bss segment. Why? Intuitively, I think it should be in .data segment as it is initialized. – Zachary Jul 22 '13 at 11:44
  • Sorry to bother you again. Your answer gave me a better understanding, but I don't know the principle behind it. How were these language rules derived? Could you give me some references? Here is a link: [Storage class specifiers](http://en.cppreference.com/w/cpp/language/storage_duration). After reading this page, I was more confused. Too many small items have to be considered. – Zachary Jul 22 '13 at 11:48
  • @Zack Dynamic initialization isn't done until the program starts to execute. As far as loading the executable is concerned, it's zero initialized, and that's all. – James Kanze Jul 22 '13 at 11:56
  • @Zack The rules are somewhat historically conditioned. C didn't have dynamic initialization, and divided the two initializations in order to save disk space: it was the OS which separated the program into three segments, and filled those without a disk image with 0's. C declared that all static variables without initializers would be 0 initialized, because that's what the system did. C++ removed the constraint that initializers had to be static, but when they were, continued to behave like C. – James Kanze Jul 22 '13 at 12:03
  • James, you mentioned "all static objects (objects with static lifetime) are first zero initialized, then objects with static initialization are initialized". Do you mean a static lifetime object will be initialized two times? If it were an object, then its constructor would be called two times. But in reality, this is not the case. – Zachary Jul 22 '13 at 12:59
  • @Zack In theory, yes. But since no user written code is ever involved in static initialization, there's no way you can detect it, and in practice, all implementations skip it. It is possible to detect the double initialization if the initialization isn't static, though. If you have something like `int i = f();`, you may see `i` as `0` in some constructors of static objects, and as the return value of `f()` in others. – James Kanze Jul 22 '13 at 13:02
19

Preface: The word "static" has a vast number of different meanings in C++. Don't get confused.

All your objects have static storage duration. That is because they are neither automatic nor dynamic. (Nor thread-local, though thread-local is a bit like static.)

In C++, static objects are initialized in two phases: static initialization and dynamic initialization.

  • Dynamic initialization requires actual code to execute, so this happens for objects that start with a constructor call, or where the initializer is an expression that can only be evaluated at runtime.

  • Static initialization is when the initializer is known statically and no constructor needs to run. (Static initialization is either zero-initialization or constant-initialization.) This is the case for your int variables with constant initializer, and you are guaranteed that those are indeed initialized in the static phase.

  • (Static-storage variables with dynamic initialization are also zero-initialized statically before anything else happens.)

The crucial point is that the static initialization phase doesn't "run" at all. The data is there right from the start. That means that there is no "ordering" or any other such dynamic property that concerns static initialization. The initial values are hard-coded into your program binary, if you will.
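The zero-initialization that precedes dynamic initialization can be made visible with a small C++ sketch. All names here (`source`, `read_source`, `late`, `Probe`) are hypothetical. The `volatile` qualifier is only there so that the compiler cannot evaluate the initializer of `late` at compile time and quietly turn it into a static initialization.

```cpp
// Sketch with made-up names: observing the zero that a dynamically
// initialized variable holds before its initializer has run.

volatile int source = 7;     // static initialization; the volatile read below cannot be folded

int read_source() { return source; }

extern int late;             // defined further down

struct Probe {
    int seen;
    Probe() : seen(late) {}  // records `late` as it is when this constructor runs
};

Probe probe;                 // dynamic initialization, first in definition order
int late = read_source();    // dynamic initialization, second in definition order
```

Within one translation unit, dynamic initialization follows definition order, so the `Probe` constructor runs while `late` still holds its zero-initialized value: `probe.seen` is 0, while `late` reads as 7 once `main` starts.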

Kerrek SB
  • A static object has a constructor; when is that constructor called (dynamic initialization)? In my test it ran before the main function. Are there relevant standards and information? – daohu527 Oct 26 '20 at 01:13
5

When are these four variables initialized?

As you say, this happens before program startup, i.e. before main begins. C does not specify it further; in C++, these are initialised during the static initialisation phase, before any objects with more complicated constructors or initialisers.

Where are values for initialization like 5 and 4 stored during compilation?

Typically, the non-zero values are stored in a data segment in the program file, while the zero values are in a bss segment which just reserves enough memory for the variables. When the program starts, the data segment is loaded into memory and the bss segment is set to zero. (Of course, the language standard doesn't specify this, so a compiler could do something else, like generate code to initialise each variable before running main.)
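Applied to the four variables from the question, this looks roughly as follows. The segment comments describe the typical case only, and `sum_at_startup` is a hypothetical helper added so the values can be checked.

```cpp
// The four variables from the question; this compiles as C or as C++.
int global_int1 = 5;          // constant initializer: value typically stored in the data segment
int global_int2;              // no initializer: zero, typically placed in bss
static int static_int1 = 4;   // like global_int1, but with internal linkage
static int static_int2;       // like global_int2, but with internal linkage

// Hypothetical helper: sums the four values as they are when called.
int sum_at_startup(void) {
    return global_int1 + global_int2 + static_int1 + static_int2;
}
```

Called as the very first statement of main, `sum_at_startup()` returns 9, because all four variables already hold their initial values (5, 0, 4, and 0) at that point.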

Mike Seymour
  • Many thx. Especially for the 2nd question. A variable is actually an addressed memory unit. So 5 is just stored at the memory unit allocated to it in the .data section/segment. – Zachary Jul 22 '13 at 09:21
  • I have explained it in my post for the 2nd question. Mike, do you think the initialization is done during linkage? After the linkage process, the segments of the program are basically fixed! – Zachary Jul 22 '13 at 09:28
  • `(Of course, the language standard doesn't specify this, so a compiler could do something else, like generate code to initialise each variables before running main)` I just want to understand this problem, is it already executed when the program is loaded? – daohu527 Oct 26 '20 at 01:14
3

Paraphrased from the standard:

All variables which do not have dynamic storage duration, do not have thread local storage duration, and are not local, have static storage duration. In other words, all globals have static storage duration.

Static objects with dynamic initialization are not necessarily created before the first statement in the main function. It is implementation defined as to whether these objects are created before the first statement in main, or before the first use of any function or variable defined in the same translation unit as the static variable to be initialized.

So, in your code, all four variables are definitely initialized before the first statement in main, because they are all statically initialized: global_int1 and static_int1 have constant initializers, while global_int2 and static_int2 have no initializer and are therefore zero-initialized. The implementation-defined rule I mentioned above applies only to objects with dynamic initialization, and your code has none.

As for your second point, I'm not sure I understand what you mean. Could you clarify?

scl
  • *"It is implementation defined as to whether these objects are created before the first statement in main, or before the first use of any function or variable defined in the same translation unit as the static variable to be initialized."* - Anyhow, it is always safe to declare `std::string s = "abc";` in some translation unit, and later access it from a different TU (or the same one). The object is guaranteed to have been initialized by the time it is accessed. Correct? – Aviv Cohn Jul 29 '20 at 12:35