
The BSS section of the static memory layout is [supposed to be] for "Uninitialized global variables" or "Global variables set to 0".

I was running some tests and suddenly noticed that local static variables are also increasing the size of the BSS segment.

Example:

Before any static variables

int main (int argc, char *argv[])
{
    return 0;
}
data/repos/e-c 
❯ size a.out 
   text   data     bss     dec     hex  filename
   1418    544       8    1970     7b2  a.out

After static variables

int main (int argc, char *argv[])
{
    static int a, b, c;
    return 0;
}
data/repos/e-c 
❯ !s
size a.out 
   text   data     bss     dec     hex  filename
   1418    544      16    1978     7ba  a.out

Those variables are certainly not global variables, then why's the BSS segment increasing? Or is the idea of "Segment for uninitialized global variables" not entirely correct?

Currently I'm on Linux, and using the GCC compiler (version 9.3.0).

chux - Reinstate Monica
debdutdeb
  • https://stackoverflow.com/questions/93039/where-are-static-variables-stored-in-c-and-c may help – Fryz Feb 26 '21 at 17:58
  • Um, where does it say that BSS is only for globals? Wikipedia says "The BSS segment contains all global variables and static variables that are initialized to zero or do not have explicit initialization in source code." – SuperStormer Feb 26 '21 at 17:59
  • `static` variables have the same storage class as globals. – Eugene Sh. Feb 26 '21 at 17:59
  • Can you find some other difference between global variables and static local variables, other than their scope, which matters in any way insofar as linking goes? – Sam Varshavchik Feb 26 '21 at 17:59
  • Correct me if I'm wrong, but isn't static just a keyword to declare a global variable in a function? – mediocrevegetable1 Feb 26 '21 at 18:01
  • @mediocrevegetable1 Nope. It is to declare a static variable in a function :) – Eugene Sh. Feb 26 '21 at 18:01
  • I just checked, and the main difference is that static is local to the file, so that's silly of me. – mediocrevegetable1 Feb 26 '21 at 18:05
  • @mediocrevegetable1, yes and no. `static` has different meaning for file-scope declarations than it does for block-scope declarations. That the declared identifier has internal linkage is the effect at file scope. That the declared object has static storage duration (which *all* objects declared at file scope have, whether static or extern) is the effect at block scope. – John Bollinger Feb 26 '21 at 18:06
  • global is a misleading term here. You're not concerned so much about the level of access as you are the [storage duration](https://en.cppreference.com/w/cpp/language/storage_duration). – user4581301 Feb 26 '21 at 18:07
  • @JohnBollinger ah, I see. – mediocrevegetable1 Feb 26 '21 at 18:09
  • There's also the subtlety that globals are initialized when the first function in that translation unit is called, and function static locals are initialized the first time that specific line of code is executed. – Mooing Duck Feb 26 '21 at 18:42
  • Does this answer your question? [Where are static variables stored in C and C++?](https://stackoverflow.com/questions/93039/where-are-static-variables-stored-in-c-and-c) – Antonin GAVREL Feb 26 '21 at 18:43
  • Under the hood, `static` and global variables are usually treated exactly the same for purposes of execution. The only difference is in which parts of the program the compiler and linker will allow them to be accessed by name. Once the program has successfully compiled and linked, the variables no longer have names and the distinction effectively ceases to exist. – Nate Eldredge Feb 26 '21 at 20:57
  • I've just noticed this is tagged C and C++, but the answers are different for the two languages, so you have to pick one – Mooing Duck Feb 26 '21 at 21:29

1 Answer


The BSS section of the static memory layout is [supposed to be] for "Uninitialized global variables" or "Global variables set to 0".

It's unclear where you got that impression, but it is at best misleading. Most people who use the term "global variable" in a C context mean an object identifier with external linkage, which necessarily designates an object with static storage duration. With a few provisos, such an identifier can be used anywhere in a program to refer to the same object, hence "global". The existence and nature of some of those provisos make the term "global" a bit fraught, but I'll leave that for a different answer.

The key point here with respect to BSS is not the linkage but the storage duration. Static storage duration means that, at least in principle, the object comes into existence* at or before the beginning of the program and lives (at least) until the program terminates. Contrast this with variables declared at block scope without static: these have automatic storage duration, meaning that they exist only while their innermost containing block is executing.

Objects with static storage duration need to be represented in the program image, regardless of their linkage, because they have the same lifetime as the program itself. C specifies that if such objects are not explicitly initialized, their initial values are as if they were initialized to 0 (for arithmetic types) or to a null pointer (for pointer types), or memberwise to these for aggregate types. BSS is a space- and time-saving shortcut for representing storage for such objects and for those explicitly initialized to 0.

So-called "global" variables that satisfy the initialization conditions can be and typically are attributed to the BSS, but so are

  • file-scope variables with internal linkage (the effect of static on declarations at that scope; these automatically have static storage duration but are accessible only from one source file), and
  • block-scope variables with static storage duration, as specified by use of the static keyword at that scope, even though these have no linkage.

*In C++, some of these are subject to dynamic initialization at a later time, but memory for such objects is still reserved for the entire run of the program, and they are subject to zero initialization at program startup. That they have memory reserved and a well-defined value constitutes existence for the purposes of this answer.

John Bollinger
  • "Static storage duration means that, at least in principle, the object comes into existence at or before the beginning of the program", Not quite. static "globals" are initialized before the first function in that TU is called, and static "locals" are initialized the first time that line of code is hit :( – Mooing Duck Feb 26 '21 at 18:44
  • Not so, @MooingDuck, at least not in C. I refer you to paragraph 6.2.4/3 of the C language specification: "An object whose identifier is declared without the storage-class specifier _Thread_local, and either with external or internal linkage or with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup." – John Bollinger Feb 26 '21 at 20:32
  • @MooingDuck: That could in theory be done under the "as if" rule, but typical implementations don't do it that way. See for instance https://godbolt.org/z/P3YT1z. There is no code executed at runtime to assign the initial values 17 and 99 to the `static` variables `x` and `s`; rather, they're initialized at load time just like the global variable `g`, with their initial values loaded or mapped from the binary. The initialization will happen whether or not `foo()` is ever called (though if it is never called, it'll be irrelevant). – Nate Eldredge Feb 26 '21 at 20:51
  • As for C++, paragraph 6.7.5.2/1 of its specification says, "All variables which do not have dynamic storage duration, do not have thread storage duration, and are not local have static storage duration. **The storage for these entities lasts for the duration of the program**". And paragraph 6.9.3.2/1 says, "[...] **Variables with static storage duration are initialized as a consequence of program initiation**." (Emphasis added.) – John Bollinger Feb 26 '21 at 20:51
  • @MooingDuck: It's also pretty hard to see how an implementation would do it your way without imposing an awful lot of overhead - with a naive approach, every function in the TU would have to contain code to check (atomically!) if the initialization has already happened and to do it if not, and that code would be executed on every call to each of those functions. – Nate Eldredge Feb 26 '21 at 20:54
  • @MooingDuck is basically correct for C++, and this is a clear illustration of why many double-tagged questions should not be answered or be answered with a great deal of care. Although it's slightly more complicated than that, and there is a sense in which both JohnBollinger and MooingDuck are correct even though they seem to be saying different things. Static storage duration variables are often initialised twice: once statically, as per 6.9.3.2/1 "as a consequence of program initiation", and once dynamically, as per 8.8/4 "the first time control passes through its declaration".... – rici Feb 26 '21 at 21:36
  • C++ and C differ here, I didn't know that https://godbolt.org/z/Wvv5aE! https://eel.is/c++draft/stmt.dcl#4 Dynamic initialization of a block variable with static storage duration or thread storage duration is performed the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization. https://eel.is/c++draft/basic.start#dynamic-6 It is implementation-defined whether the dynamic initialization of a non-block inline variable with static storage duration is sequenced before the first statement of main or is deferred... – Mooing Duck Feb 26 '21 at 21:37
  • Dynamic initialisation involves actually evaluating the initializer, which in C++ does not have to be constant and may have side effects. And for local static variables, it is important that these side effects not be performed if the block containing the static declaration is never executed. – rici Feb 26 '21 at 21:38
  • @mooingDuck: Very different. C doesn't allow non-constant declaration of static variables, so all initialization can be treated as constant initialization. This is a constant source of confusion when initialization questions are answered here using "the wrong tag". I try to be very liberal about double-tagged questions but there are cases in which the answers for the two different languages would be contradictory, not just different. – rici Feb 26 '21 at 21:41
  • As far as I can see, none of this conflicts with this answer. Even in C++, objects with static storage duration have storage reserved for them for the entire duration of the program, and they are subject to static initialization as a consequence of program startup. I am satisfied to characterize that as they "come[] into existence". That some are subject to dynamic initialization at a later time, before they are actually used, is an interesting wrinkle, but it does not conflict with this answer as far as I am concerned. YMMV. – John Bollinger Feb 26 '21 at 22:55