An extern C pointer puzzler

Question

You are given the following two C files:

#include <stdint.h>
#include <stdio.h>

extern uint32_t *foo;

int main() {
    printf("%p\n", foo);
    printf("%x\n", *foo);
}

and

#include <stdint.h>
uint32_t foo[2] = {0xDEADBEEF, 0xCAFEFEED};

Assuming you're running on an x86_64 processor, what happens when you compile and link these two files together? More importantly, why?

score 3 · Answer 1 · edited May 23 '17 at 12:09


There is a separate Stack Exchange site for programming puzzles: https://codegolf.stackexchange.com/

In your case you are lying to your compiler. You define foo as the name of an array in the second file but declare it as a pointer in your main file. Arrays and pointers are different concepts.

If you change the extern declaration in main to match the definition in the second file, you will be OK: extern uint32_t foo[];

Added: If you "inline" foo, that is, replace extern uint32_t *foo; with uint32_t foo[2] = {0xDEADBEEF, 0xCAFEFEED};, then the compiler sees that your variable is not a pointer but the name of an array, exactly as it does with extern uint32_t foo[]. See, for example: Is an array name a pointer? (http://stackoverflow.com/questions/1641957/is-array-name-a-pointer-in-c)

answered Jan 26 '15 at 23:30 by fukanchik · edited May 23 '17 at 12:09 by Community


The follow up is: why does the program run fine if you inline the extern declaration? – Edward Z. Yang Jan 26 '15 at 23:32
The puzzle site is http://puzzling.stackexchange.com, but I agree that the question is off-topic because it doesn't clearly describe the expected behavior versus what actually happens. – user3386109 Jan 26 '15 at 23:37
What do you mean by "inline extern declaration"? – fukanchik Jan 26 '15 at 23:38
Replace extern uint32_t *foo; with uint32_t foo[2] = {0xDEADBEEF, 0xCAFEFEED}; – Edward Z. Yang Jan 26 '15 at 23:40
Then the compiler will see that your variable is not a pointer but rather the name of an array, exactly as when you write extern uint32_t foo[]. See, for example: http://stackoverflow.com/questions/1641957/is-array-name-a-pointer-in-c – fukanchik Jan 26 '15 at 23:44
ring0: Oh, don't link in the auxiliary file. – Edward Z. Yang Jan 26 '15 at 23:46
fukanchik: You know what is going on! But can you see why this explanation does not seem very clear? (e.g. Why doesn't the compiler know to decay the pointer when linking the two files together? Why isn't decaying the pointer a no-op?) – Edward Z. Yang Jan 26 '15 at 23:49

score 2 · Answer 2 · answered Jan 26 '15 at 23:36


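On a typical x86_64 Linux system you will get something like:

0xcafefeeddeadbeef
Segmentation fault

The first line is the array's eight bytes read as one little-endian pointer value. Dereferencing that value is what crashes.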

Why? Because C arrays are stored directly; there is no intermediate reference or pointer to them. I think you expected this to happen:

1. The C compiler puts the two numbers (0xDEADBEEF and 0xCAFEFEED) somewhere (consuming 8 bytes).
2. The C compiler creates a pointer to that memory area and calls it foo (consuming an additional 8 bytes).
3. The linker uses that pointer foo later in the main file.

Step 2 is easy to expect because in many cases arrays act as if they were pointers, e.g.:

int a[2] = {1, 3};
...
*a

But they are not pointers; the C compiler simply knows what you mean when you write *a. You can check this by taking the array's address:

int a[2] = {1, 2};
printf("%p %p\n", (void *)a, (void *)&a);  /* prints the same address twice */

So here is what really happens:

1. The C compiler puts the two numbers somewhere and calls that memory foo.
2. The linker resolves foo in the main file to the address of that memory. The linker does not know that foo is an array, and the main file was compiled believing foo is an ordinary 8-byte pointer, so the program loads the first 8 bytes of the array and treats them as a pointer value.

answered Jan 26 '15 at 23:36 by myaut


This answer has got the right idea but it is not written very well. – Edward Z. Yang Jan 26 '15 at 23:43
Edward, what is the purpose of this exercise? – fukanchik Jan 27 '15 at 00:03
Well, originally, I just wanted to share a distillation of a problem that has given me problems in the past (and has confused multiple colleagues of mine who /really/ should know better)--as other answers have pointed out, you never really have this problem when just C is involved; in the real world this situation invariably involves assembly/linker scripts at one end or another. But what I realized reading these answers is that it's not obviously clear what the best way to explain the *underlying principle* of the problem is. – Edward Z. Yang Jan 27 '15 at 00:17
I think there was a whole section in K&R book devoted to interplay between pointers and arrays. – fukanchik Jan 27 '15 at 00:36
The underlying principle: with "int v=2; int *p=&v", memory is allocated for a pointer to integer(s). That memory has the name 'p', but the integer it points to has a different name and is stored independently. There are 4 distinct things here: 1) the integer (uses memory), 2) its name (no memory consumed, used only for compilation and linking), 3) the pointer (uses memory), 4) the pointer's name (no memory). – fukanchik Jan 27 '15 at 00:36
In contrast, "int arr[2]" is memory allocated for two integers, and that memory has the name arr. Only two distinct things are involved: 1) the 2 integers (with memory allocated for them), 2) the name arr (which consumes no memory). If you add "static" before the declaration, the name won't be visible to other files at all and need not take up space in your compiled binary. – fukanchik Jan 27 '15 at 00:37
Note: in the first case the integer has a name, 'v', while in the second case the two integers are nameless; they can only be reached through the array, e.g. arr[1]. – fukanchik Jan 27 '15 at 00:39

score 1 · Answer 3 · edited May 23 '17 at 10:30

As the other answer pointed out, you lied to the compiler about the type of foo. Hence your program has undefined behavior, which in this case results in a segmentation fault.

When declaring a variable as extern, you should never put the extern declaration directly into the .c file. You should always put the extern declaration into a header file and then #include that header in any .c file that needs it. Most importantly, you should always include that header in the .c file that defines the variable, so that the compiler can verify the extern declaration against the variable definition.

So the code should have consisted of the three files shown below

foo.h

#include <stdint.h>
extern uint32_t *foo;

foo.c

#include <stdint.h>
#include "foo.h"
uint32_t foo[2] = {0xDEADBEEF, 0xCAFEFEED};

main.c

#include <stdio.h>
#include <stdint.h>
#include "foo.h"

int main( void )
{
    printf("%p\n", (void *)foo);
    printf("%x\n", *foo);
}

In that case, the error messages that you get from gcc are

foo.c:3: error: conflicting types for ‘foo’

foo.h:2: error: previous declaration of ‘foo’ was here

And, of course, you need to fix the error by fixing foo.h

foo.h

#include <stdint.h>
extern uint32_t foo[];

This strategy is great! But it works less well when a linker script / assembler is serving the purpose of foo.c. – Edward Z. Yang Jan 27 '15 at 00:24
Yes, but that fundamentally changes the nature of the question. In essence, what you're asking is, "What C type corresponds to a given assembly type?". The programmer who develops the assembly and creates/maintains the header file needs to be able to answer that question correctly. – user3386109 Jan 27 '15 at 01:01
