EDIT This is not a question on array decay from char [1] and char *. I know what array decay is for 1D arrays. It however seems different for char [1][1] and char ** since they are not even compatible types.

I know that I can go from one of the types char [1] and char * to the other. However, it seems not as easy with char [1][1] and char **.

In my main function I have:

int main(void) {
    char a[1][1];
    a[0][0] = 'q';
    printf("a: %p\n", a);
    printf("*a: %p\n", *a);
    printf("**a: %p\n", **a);
}

I'm of course compiling with warnings and I know that gcc complains about the 6th line as **a is actually of type char and not a pointer type. However running the code shows that a and *a are actually the same pointer but **a is as expected something else (0x71 which I assume is related to 'q' in some way).

I'm trying to make sense of this and it seems that because *a and a are equal **a must also be equal to a because **a = *(*a) = *(a) = *a = a. It seems that the only error in this reasoning can be the types of a, *a, **a.

How is a actually stored in memory? If a is a pointer to another memory location (in my case 0x7fff9841f250) then surely *a should be the value at that memory address, which in my case is also 0x7fff9841f250. So then **a would be the value at 0x7fff9841f250, which is the same value as a... It seems that I cannot view the 2D array char a[1][1] as pointers in a way that makes sense. But then how can I think of this type? What is the type, and what do a, *a, **a, a[0], a[0][0] actually mean?

I have already seen Incompatible pointer type but it does not explain what the operations *a, **a, a[0], a[0][0] are actually doing.

chqrlie
  • Side notes : Are you new at C++? If so try to learn to work with std::vector/std::array first. In current C++ Double pointers and array pointer decay are rarely used since they cause too much opportunity for out-of-bound and dereferencing pointer bugs. If you are learning "C" then at least remove the "C++" tag. – Pepijn Kramer Aug 12 '23 at 09:25
  • @PepijnKramer Actually I'm working exclusively with C. However the question still applies to c++ since g++ gives exactly the same warnings and outputs as gcc. Should I still remove the c++ tag? – spinosarus123 Aug 12 '23 at 09:26
  • @spinorsaurus yes please, C++ answers can be completely different. C++ is a separate language (with backward "C" compatibility). So while this "C" code would compile in C++ it will not be the recommended way of working. – Pepijn Kramer Aug 12 '23 at 09:30

3 Answers

a, defined as char a[1][1], is an array of 1 array of 1 char: it consists of a single byte of memory with automatic storage, set aside by the compiler upon entering the scope of the function main, i.e. a single byte of the stack area whose address happened to be 0x7fff9841f250 when you ran the program, but could be different on another run, with another compiler, or on a different architecture or host.

This piece of memory takes the same space as a single char defined as char a; or an array of 1 char defined as char a[1]; or a 3D array defined as char a[1][1][1]; but the type is very different, and the type determines how and when the memory is actually read, as opposed to its address being taken.

The address of a is also the address of the first row a[0] and the address of the first element of the first row a[0][0].

a is the name of the 2D array. When you pass a to a function, it decays to a pointer to its first element, &a[0], which holds the same address as the array but has type char (*)[1], pointer to array of 1 char. Similarly, a[0] is an array, so passing a[0] to a function actually passes the address of its first element, &a[0][0], which is the same address but with a different type, char *.

The expression *a is evaluated the same way as a[0] or *(a + 0) (and also, more surprisingly, *(0 + a) and 0[a]). If a were a pointer, its value would be read from memory when you pass a to printf, but a is an array, so a decays to &a[0] and the address is passed without reading the memory. Similarly, *a is the array a[0], which decays to &a[0][0], i.e. *a passed as an argument to printf is just the address of the single element a[0][0]. The compiler generates code that computes the address, for example the value of the stack pointer register plus 8.

Note these remarks:

  • printf("a: %p\n", a); should be written printf("a: %p\n", (void *)a); because a decays to a pointer of type char (*)[1], which might not be passed the same way nor have the same representation as void *, the type expected by printf for %p.

  • printf("*a: %p\n", *a); should be written printf("*a: %p\n", (void *)*a); because *a decays to a pointer of type char *, which has the same representation and alignment requirements as void * but is still a different type from the one expected by printf for %p.

  • printf("**a: %p\n", **a); has undefined behavior because **a is a char value, which is passed as an int after promotion and is definitely not the type expected by printf for %p. The output is unpredictable and anything else can happen. 0x71 happens to be the ASCII code for 'q' expressed in hexadecimal, but this output is not guaranteed: something else could be produced on a different system, with different compiler settings, or at some other time. Writing printf("**a: %p\n", (void *)**a); would prevent the undefined behavior, but conversion from int to void * is implementation-defined, so the output is still not guaranteed.

To summarize: An array has an address and some contents, whereas a pointer has a value that is an address.

Trivial comparison:

  • a building is an array of apartments
  • both the building and each apartment have an address
  • a business card is a pointer: it can be a pointer to a building or a pointer to a single apartment, but it can also be blank (null pointer) or have an obsolete address (stale or dangling pointer) or show a phoney address (invalid pointer).
  • 2D version: a street is an array of buildings...
chqrlie
  • Thank you. I think what confused me the most is that a and a[0] are both the same. In my head this would imply that a[0][0] is also the same (by recursion). But if I understand correctly the type then influences how the memory is read which makes a[0][0] different from a and a[0]. If I however cast like "char **b = (char **)a;" Then it would work as I expect and b and b[0] would be distinct. However accessing b[0][0] now becomes undefined behavior and is not actually what I want. Do I really have to explicitly cast to (void *)? I thought the standard said that these are implicit. – spinosarus123 Aug 12 '23 at 12:21
  • `a`, `a[0]` and `a[0][0]` all have the same address. In this particular case, they also have the same size: 1 single byte. `char **b = (char **)a;` is meaningless: `(char **)` is the type of the address of a pointer to `char`, yet there is no pointer in memory where `a` lives, there is not even enough space for a pointer. – chqrlie Aug 12 '23 at 12:50
You declared a two-dimensional array

char a[1][1];

The elements of the array a have the type char[1].

In an expression such as *a, the array designator a is implicitly converted to a pointer of type char ( * )[1] to its first element. Dereferencing that pointer yields the first (and, according to the declaration, only) element, of type char[1]. Dereferencing a second time, as in **a, the expression of type char[1] is again converted to a pointer, of type char *, to its first element. Thus the expression **a yields the first character of the first (only) "row" of the array, which has type char and the value 'q' according to the assignment

a[0][0] = 'q';

According to the C Standard, the expression a[0][0] is evaluated like *( *( a + 0 ) + 0 ), which is the same as **a.

The C Standard (6.5.2.1 Array subscripting):

2 A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

So this statement

a[0][0] = 'q';

may be rewritten like

**a = 'q';

In this call of printf

printf("a: %p\n", a);

which is better written as

printf("a: %p\n", ( void * )a);

the expression a, of type char[1][1], is implicitly converted to a pointer to its first element (of type char[1]); the pointer has type char ( * )[1] and yields the starting address of the memory occupied by the array.

In this call of printf

printf("*a: %p\n", *a);

which again should be rewritten as

printf("*a: %p\n", ( void * )*a);

the expression *a, of array type char[1], is implicitly converted to a pointer of type char * to its first element and yields the same address, because the elements a[0] and a[0][0] are at the same address.

This call of printf

printf("**a: %p\n", **a);

is incorrect because a character (the type of the expression **a) is outputted as an address.

Vlad from Moscow
  • *character (the type of the expression **a) is outputted as an address* more precisely: a value of type `char`, the type of `**a`, is promoted to `int` and passed as an argument where `printf` expects a `void *` for format `%p`. The behavior is undefined. The output is unpredictable and anything else can happen. `0x71` happens to be the ASCII code for `'q'` expressed in hexadecimal, but this behavior is not guaranteed. Some other output could be produced on a different system, with different compiler settings, at some other time, or anything else could happen. – chqrlie Aug 12 '23 at 10:08
  • char * is a pointer to char: it points to a single byte
  • char ** is a pointer to a pointer to char: it points to a pointer, typically 32 or 64 bits wide
  • char[][5] is an array type that decays to a pointer to char[5], i.e. char (*)[5]: it points to an array of 5 bytes

Always remember that a pointer is typed (pointer to ...), so pointer + 1 advances the address by the size of the pointed-to type.

Yves Daoust