
After reading this: 2D array and pointer in C - how to access elements?

I'm still confused about the 'Ptr = *data' not being a de-reference.

Relevant code from that question:

int data[4][3] = { {23,55,50},{45,38,55},{70,43,45},{34,46,60}};
int *Ptr;

Ptr = *data;   //  This is not a de-reference.  Ptr is not 23.  Why?

I don't understand this on a conceptual level. I see the ' * ' and I think de-reference.

Saying Ptr = *(data + 0), doesn't help, to me it still means de-reference and we end up de-referencing the address 'data'.

Can anyone shed some light and help me understand?

Thank you for your time.

FiveBlue
  • C does not have 2D arrays. `data` is a 1D array, where each element is a 1D array of ints. – stark Mar 27 '21 at 14:38
  • What makes you think `*data` is not a dereference? – tadman Mar 27 '21 at 14:39
  • @stark I think you mean it has two forms of 2D arrays, one where it's contiguous, another where it's an array of pointers to other arrays. – tadman Mar 27 '21 at 14:39
  • @tadman, although I'm not sure I would agree with stark that C arrays of arrays should not be characterized as "2D arrays", I do generally take the position that arrays of pointers should not be characterized that way. – John Bollinger Mar 27 '21 at 14:48
  • @JohnBollinger Although there's an infinite amount of room for pedantry here, `a[x][y]` is arguably a 2D array structure regardless of the implementation details. – tadman Mar 27 '21 at 15:33

4 Answers


data is a two-dimensional array and decays to an int (*)[3] pointer. *data is a three-element int array and decays to an int * pointer.

Ptr holds the address of the first element of that int[3] array, and the value of that element is 23.

You need to dereference this pointer to get the integer 23:

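#include <stdio.h>

int main(void)
{
    int data[4][3] = { {23,55,50},{45,38,55},{70,43,45},{34,46,60}};
    int *ptr;

    ptr = *data;   //  *data is an int[3]; it decays to &data[0][0], so ptr holds an address, not 23

    printf("%d\n", *ptr);   //  one more dereference reads the int: prints 23

    return 0;
}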
0___________

*data is technically a dereference, but, as we will see, it does not actually cause an access to the object it references.

data is an array of arrays. When an array is used in an expression, it is automatically converted to a pointer to its first element, except when it is used as the operand of sizeof or unary & or is a string literal used to initialize an array. In *data, this conversion effectively changes data to &data[0], which is a pointer to the first array of data.

Since &data[0] is a pointer to the first array, *data is that array. So, we have dereferenced &data[0] to get the array it references.

However, since *data is an array, it is also automatically converted to a pointer to its first element. So, instead of *data causing an access to the array, it is changed to &(*data)[0]. With the first conversion included, that is &(*&data[0])[0]. Then, removing the self-canceling *&, that is &(data[0])[0] or just &data[0][0].

So *data is &data[0][0], the address of the first element of the first array of data.

Eric Postpischil
  • `it does not actually cause an access to the object it references` It does. But it references a 3-element int array. That dereferenced array, when used as a pointer, decays to `int *`, so an additional dereference is needed. – 0___________ Mar 27 '21 at 14:52

Unless it is the operand of the sizeof or unary & operator, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of T" will be converted, or "decay", to an expression of type "pointer to T" and the value of the expression will be the address of the first element in the array.

Let's draw out the array with its values, showing the address of each element. For this diagram we're assuming 4-byte ints and the addresses are made up out of thin air:

Address      Value    Expression                  
-------      +----+   ----------
 0x4000      | 23 |   data[0][0] 
             + -- +
 0x4004      | 55 |   data[0][1]
             + -- +
 0x4008      | 50 |   data[0][2]
             +----+
 0x400c      | 45 |   data[1][0]
             + -- +
 0x4010      | 38 |   data[1][1]
             + -- +
 0x4014      | 55 |   data[1][2]
             +----+
 0x4018      | 70 |   data[2][0]
             + -- +
 0x401c      | 43 |   data[2][1]
             + -- +
 0x4020      | 45 |   data[2][2]
             +----+
 0x4024      | 34 |   data[3][0]
             + -- +
 0x4028      | 46 |   data[3][1]
             + -- +
 0x402c      | 60 |   data[3][2]
             +----+

Now we can talk about the types of various expressions.

Let's start with the expression data. As declared, the expression has type "4-element array of 3-element array of int" (int [4][3]). Unless it is the operand of the sizeof or unary & operators, it will "decay" to an expression of type "pointer to 3-element array of int" (int (*)[3]) and will evaluate to the address 0x4000.

Next, we look at the expression *data. We already know that data decays to type "pointer to 3-element array of int" (int (*)[3]), so by dereferencing it, we get an expression of type "3-element array of int" (int [3]). Unless this expression is the operand of the sizeof or unary & operators, it will "decay" to type "pointer to int" (int *) and it will also evaluate to the address 0x4000.

Now let's look at the expressions data[0], data[1], data[2], and data[3]. Remember that the array subscript expression a[i] is exactly equivalent to the expression *(a + i) - given a starting address a, offset i elements (not bytes!) from that address and dereference the result. Since data points to a 3-element array of int, data + 1 will point to the next 3-element array of int. So if data == 0x4000, then data + 1 == 0x400c, data + 2 == 0x4018, and data + 3 == 0x4024. Since data[i] == *(data + i), the type of each data[i] will be int [3] (just like *data above, which is exactly equivalent to data[0] since *data == *(data + 0) == data[0]).

Hopefully each data[i][j] is obvious at this point - each data[i] has type int [3], which "decays" to int *, and data[i][j] == *(data[i] + j).

A few other expressions of note:

  • &data has type "pointer to 4-element array of 3-element array of int", or int (*)[4][3], and its value is also 0x4000 because the address of an array is the same as the address of its first element.

  • Similarly, &data[i] has type "pointer to 3-element array of int" (int (*)[3]); like the expression above, the address of the array is the same as the address of the first element, so &data[0] == data[0] == 0x4000, &data[1] == data[1] == 0x400c, etc.

To summarize:

Expression        Type          "Decays" to        Value
----------        ----          -----------        ------
      data        int [4][3]    int (*)[3]         0x4000
     *data        int [3]       int *              0x4000
     &data        int (*)[4][3] n/a                0x4000
   data[i]        int [3]       int *              0x4000, 0x400c, 0x4018, 0x4024
  *data[i]        int           n/a                23, 45, 70, 34
  &data[i]        int (*)[3]    n/a                0x4000, 0x400c, 0x4018, 0x4024
data[i][j]        int           n/a                23, 55, 50, ...           
John Bode

I am posting another example, a complete C program, that may help to see the logic behind this.

First, some details

I'm still confused about the 'Ptr = *data' not being a de-reference.

Well, it is a de-reference: a unary * applied to an expression is a de-reference in C. BUT *data is not an int: it is an int[3], which decays to an int* holding the address of the first element of the array, and that pointer points to an int whose value is 23. One level of indirection is still missing in your code.

      Ptr  is int*
     *data is int[3], which decays to the address &data[0][0] (the value stored in Ptr)
    **data is 23 (and so is *Ptr)

The key to understanding this is the line

    Ptr = *data;

That line compiles cleanly. Compare it with what gcc says when the assignment is written as Ptr = &data instead:

toninho@DSK-2009:~/projects/dsp$ gcc -o tptr -Wall -Wextra -std=c17 tptr.c

tptr.c: In function ‘main’:
    tptr.c:19:9: warning: assignment to ‘int *’ from\
 incompatible pointer type ‘int (*)[4][3]’ [-Wincompatible-pointer-types]
    19 |     Ptr = &data;
      |         ^

The Microsoft compiler says:

1>------ Build started: Project: sop-0328-a, Configuration: Debug Win32 ------
1>ptr.c
1>C:\Users\toninho\source\repos\sop-0328-a\sop-0328-a\ptr.c(19,20):
    warning C4047: '=': 'int *' differs in levels of indirection from 'int (*)[4][3]'
1>sop-0328-a.vcxproj -> C:\Users\toninho\source\repos\sop-0328-a\Debug\sop-0328-a.exe
1>Done building project "sop-0328-a.vcxproj".
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========

The clang compiler says:

C:\Users\toninho\source\repos\sop-0328-a\sop-0328-a>clang -Wall ptr.c
ptr.c:19:13: warning: incompatible pointer types assigning
 to 'int *' from 'int (*)[4][3]'
      [-Wincompatible-pointer-types]
    Ptr = &data;
            ^ ~~~~~
1 warning generated.

And it is the same. You should always enable all warnings, on all compilers.

I don't understand this on a conceptual level. I see the ' * ' and I think de-reference.

Saying Ptr = *(data + 0), doesn't help, to me it still means de-reference and we end up de-referencing the address 'data'.

Can anyone shed some light and help me understand?

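As I said, it is a de-reference. The thing is that *data is an int[3], not an int; it decays to an int*, so Ptr = *data stores an address and needs no cast. The warnings shown above come from Ptr = &data, which assigns an int (*)[4][3] to an int*. Without compiler warnings enabled this goes unnoticed. Professionals run into it all the time and sometimes get surprised too. To store &data in an int* you would have to write a cast: Ptr = (int*) &data;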

I believe the program below can show it better. Adding 0 sure made no difference. You would need to add another *.

Please see the code below and ask back if it is still not clear.

Note that at the end of the code the lines

    pointer = (int*) &many;
    printf("Using a cast 'pointer' now points to '(int*) many' \
and its value is %d\n", *pointer);

use a cast and print

Using a cast 'pointer' now points to '(int*) many' and its value is 23

and now the compiler is happy and the output is the expected 23.

Example

The output of the program is

'one' is an int at address      00EFFDB4
'many'is int[4][3] at address   00EFFD7C
'pointer' is int* at address    00EFFDC0

'pointer' now points to 'one' and its value is  00EFFDB4
'pointer' now points to 'many' and its value is 00EFFD7C

'many' is a C array, a pointer pointing to the address  00EFFD7C ( &many[0][0] )
'many' content, an address, is  00EFFD7C ( *many )

Starting at this location --- 00EFFD7C ( *many ) --- we have the values of the array
First value is 23 ( **many ),
Second value is 55 ( *(1 + *many)... )
Using a cast 'pointer' now points to '(int*) many' and its value is 23

And you can see the addresses on the debugger screen, along with the actual types of the variables (program stopped at line #23):

[image: debugger screenshot, not reproduced]

The code

#include <stdio.h>

int main(void)
{
    int*    pointer = NULL;
    int     one = 1;
    int     many[4][3] =
    {
        {23,55,50},
        {45,38,55},
        {70,43,45},
        {34,46,60}
    };
    printf("'one' is an int at address\t%p\n", (void *)&one);
    printf("'many'is int[4][3] at address\t%p\n", (void *)&many);
    printf("'pointer' is int* at address\t%p\n\n", (void *)&pointer);
    pointer = &one;
    printf("'pointer' now points to 'one' and its value is\t%p\n", (void *)pointer);
    pointer = &many;    /* deliberately incompatible: int (*)[4][3] to int*, triggers the warnings */
    printf("'pointer' now points to 'many' and its value is\t%p\n", (void *)pointer);
    printf("\n'many' is a C array, a pointer pointing to the address\t%p ( &many[0][0] )\n", (void *)&many[0][0]);
    printf("'many' content, an address, is\t%p ( *many )\n", (void *)*many);
    printf("\nStarting at this location --- %p ( *many ) --- we have the values of the array\n\
First value is %d ( **many ),\n\
Second value is %d ( *(1 + *many)... )\n", (void *)*many, **many, *(1 + *many) );

    pointer = (int*) &many;
    printf("Using a cast 'pointer' now points to '(int*) many' \
and its value is %d\n", *pointer);

    return 0;

}
arfneto