
I'm curious about the underlying implementation of static variables within a function.

If I declare a static variable of a fundamental type (char, int, double, etc.), and give it an initial value, I imagine that the compiler simply sets the value of that variable at the very beginning of the program before main() is called:

void SomeFunction();

int main(int argCount, char ** argList)
{
    // at this point, the memory reserved for 'answer'
    // already contains the value of 42
    SomeFunction();
}

void SomeFunction()
{
    static int answer = 42;
}

However, if the static variable is an instance of a class:

class MyClass
{
    //...
};

void SomeFunction();

int main(int argCount, char ** argList)
{
    SomeFunction();
}

void SomeFunction()
{
    static MyClass myVar;
}

I know that it will not be initialized until the first time that the function is called. Since the compiler has no way of knowing when the function will be called for the first time, how does it produce this behavior? Does it essentially introduce an if-block into the function body?

static bool initialized = false;
if (!initialized)
{
    // construct myVar
    initialized = true;
}
e.James

6 Answers


This question covered similar ground, but thread safety wasn't mentioned. For what it's worth, C++0x (since published as C++11) makes initialisation of function statics thread safe.

(see the C++0x FCD, 6.7/4 on function statics: "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.")
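To make the quoted guarantee concrete, here is a minimal sketch, assuming a C++11 or later compiler with thread-safe statics left enabled (the default in GCC and Clang). The names Expensive, instance and race_to_first_call are made up for illustration and do not come from the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts how many times the constructor below actually runs.
std::atomic<int> g_constructions{0};

struct Expensive {                 // hypothetical type, for illustration only
    Expensive() { ++g_constructions; }
};

Expensive& instance() {
    // Since C++11, callers that arrive while another thread is still
    // constructing `obj` wait here until that construction has finished.
    static Expensive obj;
    return obj;
}

// Starts `n` threads that all call instance(), waits for them, and
// returns the total number of constructor runs observed.
int race_to_first_call(int n) {
    assert(n > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([] { instance(); });
    for (auto& t : threads)
        t.join();
    return g_constructions.load();
}
```

Under that assumption, race_to_first_call(8) should report a single construction no matter how the threads interleave.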

One other thing that hasn't been mentioned is that function statics are destructed in reverse order of their construction, so the compiler maintains a list of destructors to call on shutdown (this may or may not be the same list that atexit uses).
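A small sketch of the destruction order, in case it helps. Tracer, first, second and construction_log are hypothetical names chosen for this example; the only behaviour relied on is the reverse-order rule described above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Records construction order. It is itself a function static whose
// construction completes before any Tracer's, so it is destroyed last.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

struct Tracer {                        // hypothetical helper type
    const char* name;
    explicit Tracer(const char* n) : name(n) {
        assert(n != nullptr);
        construction_log().push_back(n);
    }
    ~Tracer() { std::printf("destroying %s\n", name); }
};

// Each function owns one static Tracer, built on that function's first call.
void first()  { static Tracer t("first");  }
void second() { static Tracer t("second"); }
```

If a program calls second() and then first(), the log holds "second" followed by "first", and at exit the destructors run the other way round: "destroying first", then "destroying second".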

James Hopkin
  • Can you give a reference/citation for its being thread safe in C++0x? I haven't found one. – ChrisW Jun 05 '10 at 08:06

In the compiler output I have seen, function local static variables are initialized exactly as you imagine.

(Caveat: This paragraph applies to C++ versions older than C++11. See the comments for changes since C++11.) Note that in general this is not done in a thread-safe manner. So if you have functions with static locals like that, and they might be called from multiple threads, you should take this into account. Calling the function once in the main thread before any other thread can call it will usually do the trick.

I should add that if the initialization of the local static is by a simple constant like in your example, the compiler doesn't need to go through these gyrations - it can just initialize the variable in the image or before main() like a regular static initialization (because your program wouldn't be able to tell the difference). But if you initialize it with a function's return value, then the compiler pretty much has to test a flag indicating if the initialization has been done or something equivalent.
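For illustration, here is roughly what the flag test looks like when written out by hand. This is only a sketch under my own assumptions, not actual compiler output: Holder, compute_answer and both function names are hypothetical, and the manual version is deliberately not thread-safe.

```cpp
#include <cassert>
#include <new>

int g_compute_calls = 0;

// Stand-in for an initializer the compiler cannot evaluate at build time.
int compute_answer() { ++g_compute_calls; return 42; }

struct Holder {                        // hypothetical type
    int value;
    explicit Holder(int v) : value(v) {}
};

// What the language gives you directly.
Holder& builtin_version() {
    static Holder h(compute_answer());
    return h;
}

// Roughly the same thing spelled out by hand (no thread safety).
Holder& manual_version() {
    alignas(Holder) static unsigned char storage[sizeof(Holder)];
    static Holder* instance = nullptr;   // doubles as the "initialized" flag
    if (instance == nullptr) {
        // The flag is set only after construction succeeds, so a
        // throwing constructor would be retried on the next call.
        instance = new (storage) Holder(compute_answer());
    }
    assert(instance != nullptr);
    return *instance;
}
```

Both versions call compute_answer() once, on the first call only. One difference: the manual version never runs a destructor, whereas a real static local with a non-trivial destructor is also registered for destruction at exit.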

Michael Burr
  • Did something change? I've heard that after C++11 each initialization of static is thread-safe. – VP. Jan 16 '16 at 08:17
  • @VictorPolevoy: Yes - when this answer was written, C++11 didn't exist. In C++11, the standard included thread support, and this was added to the description of initialization of block-scope static variables (6.7/4): "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization". – Michael Burr Jan 16 '16 at 08:32
  • I suggest you edit your answer, since it is the accepted answer and has 10 votes. This question is also among the most viewed on the topic of static variable initialization. – VP. Jan 16 '16 at 10:07
  • Thanks for the part about initialization by a simple constant versus a function's return value! Good point! – Daniel Goldfarb May 07 '17 at 17:10
  • @VictorPolevoy: gcc/clang use an acquire load of a guard variable to check that static locals have been initialized, if they don't have compile-time-constant initializers. e.g. https://godbolt.org/z/do89eqdMP If the guard variable is false, then they pick one thread to do the initializing, and have other threads wait for it if they also see a false guard variable. They've been doing this for a long time, since before C++11 required it. (e.g. as old as GCC4.1 on Godbolt, from May 2006.) – Peter Cordes Mar 04 '22 at 02:57

You're right about everything, including the initialized flag as a common implementation. This is basically why initialization of static locals is not thread-safe (before C++11), and why pthread_once exists.

One slight caveat: the compiler must emit code which "behaves as if" the static local variable is constructed the first time control passes through its declaration. Since integer initialization has no side effects (and calls no user code), it's up to the compiler when it initializes the int. User code cannot "legitimately" find out what it does.

Obviously you can look at the assembly code, or provoke undefined behaviour and make deductions from what actually happens. But the C++ standard doesn't count that as valid grounds to claim that the behaviour is not "as if" it did what the spec says.
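As an aside on the pthread_once mentioned earlier in this answer, here is a sketch of the same run-once idea. It swaps in std::call_once from C++11's <mutex>, the standard-library counterpart, rather than pthread_once itself; init_table, table_size and the two globals are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

int g_init_runs = 0;
int g_table_size = 0;      // hypothetical state that needs one-time setup

// The one-time setup work.
void init_table() {
    ++g_init_runs;
    g_table_size = 256;
}

int table_size() {
    // std::once_flag has a constexpr constructor, so this static is
    // constant-initialized and needs no run-time guard of its own.
    static std::once_flag once;
    // Runs init_table exactly once, even if several threads get here together.
    std::call_once(once, init_table);
    assert(g_table_size > 0);
    return g_table_size;
}
```

Depending on the toolchain, this may need to be built with thread support enabled (for example -pthread with GCC).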

Steve Jessop

I know that it will not be initialized until the first time that the function is called. Since the compiler has no way of knowing when the function will be called for the first time, how does it produce this behavior? Does it essentially introduce an if-block into the function body?

Yes, that's right: and, FWIW, it's not necessarily thread-safe (if the function is called "for the first time" by two threads simultaneously).

For that reason you might prefer to define the variable at global scope (although maybe in a class or namespace, or static without external linkage) instead of inside a function, so that it's initialized before the program starts without any run-time "if".
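A minimal sketch of that alternative, with made-up names (load_default, g_default, read_default). In practice mainstream implementations run this kind of initializer before main() starts.

```cpp
#include <cassert>

int g_init_runs = 0;

// Hypothetical initializer that cannot be folded to a constant.
int load_default() { ++g_init_runs; return 7; }

namespace {
    // Namespace-scope variable with internal linkage: its initializer
    // runs during program start-up, not on first use.
    const int g_default = load_default();
}

int read_default() {
    assert(g_init_runs == 1);   // already initialized by the time we get here
    return g_default;           // plain load, no "initialized yet?" test
}
```

The trade-off is that the order of such initializations across different translation units is unspecified, which is a problem function-local statics do not have.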

ChrisW

Another twist is in embedded code, where the run-before-main() code (cinit/whatever) may copy pre-initialized data (both statics and non-statics) into RAM from a const data segment, perhaps residing in ROM. This is useful where the code is not running from some sort of backing store (disk) from which it could be reloaded. Again, this doesn't violate the requirements of the language, since it is done before main().

Slight tangent: While I've not seen it done much (outside of Emacs), a program or compiler could basically run your code in a process and instantiate/initialize objects, then freeze and dump the process. Emacs does something similar to this to load up large amounts of elisp (i.e. chew on it), then dump the running state as the working executable, to avoid the cost of parsing on each invocation.

jesup

What matters isn't whether the variable has class type, but whether the initializer can be evaluated at compile time (at the current optimization level). For a class type with a non-trivial constructor, that also requires the constructor to have no side effects.

If it's not possible to simply put a constant value in .data, gcc/clang use an acquire load of a guard variable to check that static locals have been initialized. If the guard variable is false, then they pick one thread to do the initializing, and have other threads wait for it if they also see a false guard variable. They've been doing this for a long time, since before C++11 required it. (e.g. as old as GCC4.1 on Godbolt, from May 2006.)

The simplest artificial example, snapshotting the arg from the first call and ignoring later args:

int foo(int a){
    static int x = a;
    return x;
}

This compiles for x86-64 with GCC 11.3 -O3 (Godbolt) to the asm below; -std=gnu++03 mode generates exactly the same asm. GCC 4.1 also makes about the same asm, but doesn't keep the push/pop off the fast path (i.e. it is missing the shrink-wrap optimization). GCC 4.1 only supported AT&T syntax output, so its output looks visually different unless you flip modern GCC to AT&T mode as well; the listing here is Intel syntax (destination on the left).

# demangled asm from g++ -O3
foo(int):
        movzx   eax, BYTE PTR guard variable for foo(int)::x[rip]  # guard.load(acquire)
        test    al, al
        je      .L13
        mov     eax, DWORD PTR foo(int)::x[rip]    # normal load of the static local
        ret              # fast path through the function is the already-initialized case


.L13:            # jumps here on guard == 0, on the first call (and any that race with it)
                 # It would be sensible for GCC to put this code in .text.cold
        push    rbx
        mov     ebx, edi             # save function arg in a call-preserved reg
        mov     edi, OFFSET FLAT:guard variable for foo(int)::x  # address
        call    __cxa_guard_acquire          # guard_acquire(&guard_x) presumably a normal mutex or spinlock
        test    eax, eax 
        jne     .L14                         # if (we won the race to do the init work) goto .L14
        mov     eax, DWORD PTR foo(int)::x[rip]  # else it's done now by another thread
        pop     rbx
        ret
.L14:
        mov     edi, OFFSET FLAT:guard variable for foo(int)::x
        mov     DWORD PTR foo(int)::x[rip], ebx       # init static x (from a saved in RBX)
        call    __cxa_guard_release
        mov     eax, DWORD PTR foo(int)::x[rip]       # missed optimization:  mov eax, ebx  
                # This thread is the one that just initialized it, our function arg is the value. 
                # It's not atomic (or volatile), so another thread can't have set it, too.
        pop     rbx
        ret

If compiling for AArch64, the load of the guard variable is ldarb w8, [x8], a load with acquire semantics. Other ISAs might need a plain load and then a barrier to give at least LoadLoad ordering, to make sure they load the payload x no earlier than when they saw the guard variable being non-zero.
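The guard protocol in the asm above can be approximated in portable C++ as follows. This is a sketch under stated assumptions, not what libstdc++ actually does: an ordinary std::mutex stands in for __cxa_guard_acquire / __cxa_guard_release, and everything in the sketch namespace is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

namespace sketch {

std::atomic<bool> guard{false};  // plays the role of "guard variable for foo(int)::x"
std::mutex init_lock;            // assumption: stand-in for the __cxa_guard_* calls
int x;                           // storage of the static local

int foo(int a) {
    // Fast path: one acquire load of the guard, then a normal load of x.
    if (guard.load(std::memory_order_acquire))
        return x;

    // Slow path: the first call, or a call racing with the first one.
    std::lock_guard<std::mutex> hold(init_lock);
    if (!guard.load(std::memory_order_relaxed)) {       // did we win the race?
        x = a;                                           // run the initializer
        guard.store(true, std::memory_order_release);    // publish x to fast-path readers
    }
    assert(guard.load(std::memory_order_relaxed));
    return x;
}

}  // namespace sketch
```

The release store pairs with the acquire load on the fast path, which is the ordering that the acquire load of the guard variable provides in the compiler-generated code.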


If the static variable has a constant initializer, no guard is needed:

int bar(int a){
    static int x = 1;
    return ++x + a;
}

bar(int):
        mov     eax, DWORD PTR bar(int)::x[rip]
        add     eax, 1
        mov     DWORD PTR bar(int)::x[rip], eax   # store the updated value
        add     eax, edi                          # and add it to the function arg
        ret

.section .data

bar(int)::x:
        .long   1
Peter Cordes