Except when it is the operand of the sizeof or unary & operators, or is a string literal being used to initialize a character array in a declaration, an expression of type "N-element array of T" is converted ("decays") to an expression of type "pointer to T", and the value of the expression is the address of the first element of the array.
Assume the following code:
char buffer[] = "Hello";
...
printf( "%s\n", buffer );
In the call to printf, the expression buffer has type "6-element array of char"; since it is not the operand of the sizeof or unary & operators, nor is it a string literal being used to initialize an array in a declaration, the expression is converted ("decays") to an expression of type "pointer to char" (char *), and the value of the expression is the address of the first element in the array.
Now, change the printf call to
printf( "%p", (void *) &buffer );
This time, buffer is the operand of the unary & operator; the automatic conversion to type "pointer to char" doesn't occur. Instead, the type of the expression &buffer is "pointer to 6-element array of char", or char (*)[6] [1] (the parentheses matter).
Both expressions yield the same value -- the address of the array is the same as the address of the first element of the array -- but the types of the two expressions are different. This matters; char * and char (*)[6] are not interchangeable.
So, why does this funky conversion magic exist in the first place?
When Dennis Ritchie was initially designing C, he was basing his design on an earlier language named B (go figure). When you allocated an array in B, like so:
auto arr[N];
the compiler would set aside N elements for the array contents, along with an additional cell that stored an offset to the first element of the array (basically a pointer value, but without any sort of type semantics; B was a "typeless" language). This additional cell would be bound to the variable arr, giving you something like the following:
          +---+
     arr: |   | --+
          +---+   |
           ...    |
          +---+   |
  arr[0]: |   | <-+
          +---+
  arr[1]: |   |
          +---+
  arr[2]: |   |
          +---+
     ...   ...
          +---+
arr[N-1]: |   |
          +---+
Ritchie initially kept these semantics, but ran into issues when he started adding struct types to C. He wanted struct types to encode their bytes directly; in other words, given a type like
struct {
int inode;
char name[14];
};
he wanted a 2-byte integer immediately followed by a 14-byte array; there wasn't a good place to stash the pointer to the first element of the array.
So he got rid of it; instead of setting aside storage for a pointer to the first element of the array, he designed the language so that the location of the array would be computed from the array expression itself. Hence the rule at the beginning of this post.
1. The %p conversion specifier expects a void * expression as its corresponding argument, hence the cast.