You have stumbled over something interesting: variables, strictly speaking, are not values but refer to values. 8 is an integer value. After int i=8, i refers to an integer value. The difference is that i could later refer to a different value.
In order to obtain the value, i must be dereferenced, i.e. the value stored in the memory location which i stands for must be obtained. This dereferencing is performed implicitly in C whenever a value of the type which the variable references is requested: i=8; printf("%d", i) results in the same output as printf("%d", 8).
That is funny because variables are essentially aliases for addresses, while numeric literals are aliases for immediate values. In C these very different things are syntactically treated identically. A variable can stand in for a literal in an expression and will be automatically dereferenced. The resulting machine code makes that very clear. Consider the two functions below. Both have the same return type, int. But f has a variable in the return statement which must be dereferenced so that its value can be returned (in this case, it is returned in a register):
int i = 1;
int g(){ return 1; } // literal
int f(){ return i; } // variable
If we ignore the housekeeping code, each function translates into a single machine instruction. The corresponding assembler (from icc) for g is:
movl $1, %eax #5.17
That's pretty straightforward: put 1 in the register eax.
By contrast, f translates to
movl i(%rip), %eax #4.17
This loads the value stored at the address formed from register rip plus the offset i into register eax. It's refreshing to see that, to the compiler, a variable name is just an alias for an address (offset).
The necessary dereferencing should now be obvious. It would be more logical to write return *i in order to return 1, and to write return i only for functions which return references or pointers.
In your example it is indeed somewhat illogical that
int j=8;
int* p = &j;
printf("%d\n", *p);
prints 8 (i.e. p is actually dereferenced twice), but that &(*p) yields the address of the object pointed to by p (which is the address value stored in p) and is not interpreted as &(8). The reason is that in the context of the address operator a variable (or, in this case, the lvalue obtained by dereferencing p) is not implicitly dereferenced the way it is in other contexts. The C standard calls this implicit read "lvalue conversion" and exempts the operand of & from it.
When the attempt was made to create a logical, orthogonal language, Algol68, int i=8 indeed declared an alias for 8. In order to declare a variable, the long form would have been ref int m = loc int := 3. Consequently, what we call a pointer or reference would have had the type ref ref int, because two dereferences are actually needed to obtain an integer value.