Analysis — What is wrong with the code
In your code:
file1.c
int main(void)
{
char i; // Warning: unused variable
foo();
}
file2.c
void foo(void)
{
extern int i;
i = 130;
printf("%d", i);
}
The variable i
in main()
is wholly unrelated to the i
referred to in foo()
. The variable i
in main()
is strictly local to main()
and is not passed to foo()
; foo()
is unable to see that i
.
On Mac OS X 10.7.5 with GCC 4.7.1, the files compile separately, but they can't be linked because there is no definition for variable i
.
If we move the variable i
outside of main()
, then the program links:
file2.h
extern void foo(void);
file1.c
#include <stdio.h>
#include "file2.h"
char i = 'A';
int main(void)
{
printf("i = %d (%c)\n", i, i);
foo();
printf("i = %d (%c)\n", i, i);
return(0);
}
file2.c
#include "file2.h"
#include <stdio.h>
void foo(void)
{
extern int i;
i = 130;
printf("i = %d (%c)\n", i, i);
}
This now links and runs, but it invokes undefined behaviour. The linker does not notice that the size of i
should be sizeof(int)
but is sizeof(char)
. The assignment in foo()
, though, goes trampling out of bounds of the i
that is allocated. It doesn't cause visible damage in this program, but that's pure luck.
When I run it on a little-endian machine (Intel x86/64), the output is:
i = 65 (A)
i = 130 (?)
i = -126 (?)
When I ran it on a big-endian machine (SPARC), I got a different result (which isn't very surprising; the behaviour is undefined, so any result is valid):
i = 65 (A)
i = 130 (?)
i = 0 ()
The most significant byte of the 4-byte integer was written over the only byte of the 1-byte character, hence the zero in the third line out output. Note too that it was lucky that the character was allocated an address that was a multiple of 4 bytes; that was not guaranteed and a misaligned address for an int
would have caused a bus error (though experimentation suggests that it doesn't happen on SPARC even when i
is forced onto an odd address; I'm not sure what is happening).
Synthesis — Correct Handling of extern
The correct way to handle this is to avoid writing external variable declarations such as extern int i;
in any source file; such declarations should only appear in a header that is used everywhere the variable needs to be used.
file2.h
extern char i;
extern void foo(void);
With that change in place (and the corresponding extern int i;
removed), then the output is self-consistent:
i = 65 (A)
i = -126 (?)
i = -126 (?)
Note the role of the header in keeping both file1.c
and file2.c
in line with each other. It ensures that the definition in file1.c
matches the declaration in file2.h
, and the use of file2.h
in file2.c
ensures that the declaration used there is correct, too. See also What are extern
variables in C.