When I declare variables outside main, the compiler stores them in a peculiar way.

int i=1,j=1;
void main(void)
{
     printf("%d\n%d",&i,&j);
}

If both i and j are uninitialized, or both equal 0, or both hold positive values, they are stored at contiguous addresses in memory, whereas if i = 0 and j is some positive integer, their addresses are separated by a fairly large distance.

The problem is that when they are stored at contiguous addresses, it can cause real performance issues such as false sharing (have a look here). I've learned that to prevent this, there should be some space between the variables' addresses, which is provided automatically when i = 0 and j is any positive value.

Now, what I want to understand is:

  • Why does the compiler store the variables at noncontiguous addresses only when one is initialized to 0 and the other to a positive value, and
  • How can I intentionally do what the compiler is doing automatically, i.e. place the variables at well-separated addresses?

(Using devcpp gcc 4.9.2)

  • 6
    Don't print addresses using `%d`. – Mohit Jain Mar 08 '17 at 07:09
  • 3
    Use [int main(void)](http://stackoverflow.com/questions/9356510/int-main-vs-void-main-in-c). – Mohit Jain Mar 08 '17 at 07:17
  • 1
    If you want the variables to be in a known relationship, I think you'll need to ensure that they're part of a structure or an array. I'm not sure how far apart they need to be for your purposes, but if it was 4 KiB (a page size), then you could play games with `struct Spacer { int i; char space[4096 - sizeof(int)]; int j; };` You'd have to revise the references to the variables as well, of course. But this gives you control over the layout of the two variables. – Jonathan Leffler Mar 08 '17 at 07:23
  • [This](http://electronics.stackexchange.com/questions/237740/what-resides-in-the-different-memory-types-of-a-microcontroller/237759#237759) applies universally to pretty much all computers. – Lundin Mar 08 '17 at 07:51

2 Answers

3

Assuming you meant printf("%p, %p\n",(void *)&i,(void *)&j);, note the following:

  1. The C standard does not require variables to be allocated in contiguous memory.
  2. Globals initialized to 0 are often kept in the BSS section (often treated as part of the data segment) to reduce binary size. Other globals are kept in the rest of the data section. (This is an implementation detail, not mandated by the C standard.)

How can I intentionally do what compiler is doing automatically?

This is a compiler-specific question, and your compiler's documentation is the place to look for an answer.

Jonathan Leffler
Mohit Jain
  • 1
    Sir, _Often globals initialized with 0 are kept in BSS section_ is that correct? I thought uninitialized variables go to BSS so that initialization can be done easily by zeroing the whole segment.... – Sourav Ghosh Mar 08 '17 at 07:20
  • Hi @SouravGhosh, *uninitialized globals* and *globals initialized with 0* may be treated the same way by the compiler. Nevertheless, I will confirm this. – Mohit Jain Mar 08 '17 at 07:22
  • globals are always initialized, to 0, if no other value is specified – Antti Haapala -- Слава Україні Mar 08 '17 at 07:24
  • @SouravGhosh Not the best reference but [this wiki article](https://en.wikipedia.org/wiki/.bss) says, *"[...]An implementation may also assign statically-allocated variables and constants initialized with a value consisting solely of zero-valued bits to the BSS section."* – Mohit Jain Mar 08 '17 at 07:26
  • @AnttiHaapala Right, that follows from the variables having static storage duration, agreed, but I was wondering whether explicit versus implicit initialization makes any difference in this case. – Sourav Ghosh Mar 08 '17 at 07:27
  • 1
    @MohitJain OK, it makes sense. Only when the init value != 0, then it moves to the `.data`. Fair enough. :) – Sourav Ghosh Mar 08 '17 at 07:30
  • yea, it shouldn't, in a linked program. The initialized/uninitialized would only matter when the object has external linkage, then a declaration without initialization is program-wide tentative definition and some other translation unit might initialize it with non-zero value. – Antti Haapala -- Слава Україні Mar 08 '17 at 07:34
  • 1
    @MohitJain I have added that in my answer now, maybe I was just overthinking. :) – Sourav Ghosh Mar 08 '17 at 07:40
  • In terms of completeness, your answer looks better. – Mohit Jain Mar 08 '17 at 08:34
  • 1
    gcc changed the behavior of where variables initialized to 0 are stored. In the past they used to be stored in the data segment, many years ago they were changed to be stored in BSS by default. There is a flag called `-fno-zero-initialized-in-bss` to change that. This might seem like a very esoteric piece of trivia and the flag seems completely unnecessary but this actually broke some things that patched binaries after compilation (like for example certain BSD installation kernels that used a massive empty array as a built in ramdisk that was later populated with installation scripts). – Art Mar 08 '17 at 09:41
  • @Art Thanks for adding this info :) – Mohit Jain Mar 08 '17 at 10:06
2

One problem there,

  printf("%d\n%d",&i,&j);

invokes undefined behavior, so the output cannot be reasoned about reliably. To print a pointer, you need to use the %p format specifier and cast the corresponding argument to (void *).

That said, the C standard neither imposes constraints nor provides guidelines on where and how variables are stored in memory. It is up to the compiler implementation to decide how to place different variables in memory. You need to check the documentation of the compiler in use to find out which rules it follows.

To elaborate in a generic way, an object file consists of many segments, like

  • Header (descriptive and control information)
  • Code segment ("text segment", executable code)
  • Data segment (initialized static variables)
  • Read-only data segment (rodata, initialized static constants)
  • BSS segment (uninitialized static data, both variables and constants)
  • External definitions and references for linking
  • Relocation information
  • Dynamic linking information
  • Debugging information

and it's up to the implementation (compiler and linker) to decide the address range used for each segment.

As a general rule in typical implementations,

  • Variables with static storage duration (such as globals) that are left uninitialized or initialized to 0 are placed in the .bss segment.
  • Variables initialized with a non-zero value are placed in the .data segment.

so, it's fair enough to say that the addresses of two variables pertaining to two different segments will not be contiguous.

Now, your observation checks out.

If both i and j are not initialized or equals 0 or equals some positive values then they are stored at continuous address spaces in memory

Yes, in that case both variables go to the same segment, either .bss or .data, and the compiler usually chooses to place them one after another.

whereas if i=0 and j = some +ve integer then their addresses are separated by fairly large distance.

This also holds true: the two variables are now placed in different segments.

Sourav Ghosh