What is the output of the following program written in C?

Is it 2 0 or 0 2, and why?

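#include <stdio.h>

int main(void)
{
    int arr[]={2,3,4};  // array of three ints
    char *p;
    p=(char *)arr;      // view the array's storage one byte at a time

    printf("%d\n",*p);
    printf("%p\n",p);
    p=p+1;              // advance by one byte, not one int

    printf("%d\n",*p);
    printf("%p\n",p);

    return 0;
}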
dbush
  • Where is this code from? it looks like an interview or test question. Did you test the program and now you can't explain the output to yourself? – Iharob Al Asimi Aug 10 '18 at 12:13
  • What is the output on your machine? The real question would be `2 0` or `0 0`. Why should it ever print `0 2`? – Gerhardh Aug 10 '18 at 13:09
  • Why can't you execute it and see for yourself? Don't you understand the result you are getting, or what? – Lundin Aug 10 '18 at 13:11
  • @Lundin, that's not good advice for undefined or implementation-defined behaviour such as this. The results aren't specified by the language and depend on the size and layout of `char` and `int`. – Toby Speight Aug 10 '18 at 13:21
  • @Gerhardh, `0 2` is a likely result on MC6800 platforms (possibly also on VAX; my memory is slightly hazy there). – Toby Speight Aug 10 '18 at 13:23
  • @Gerhardh Why is 68000 relevant? I know that's 32-bit. I mentioned 6800, where `int` is usually 16 bits and `char` 8. – Toby Speight Aug 10 '18 at 14:57
  • @Gerhardh: 68000 is a 16 bit CPU and typically C on that arch used 16 bit `int`. Only 68020 and higher used 32 bit `int` on some platforms (typically Unix systems). – too honest for this site Aug 10 '18 at 15:22
  • @TobySpeight sorry, I messed up the CPUs. – Gerhardh Aug 10 '18 at 15:32

2 Answers

6

The result depends on the endianness of your system as well as the size of an int (it also depends on the number of bits in a byte, but for now we'll assume it's 8).

Endianness dictates the ordering of bytes in multi-byte types such as integers. x86-based processors are little-endian, meaning that the least significant byte is stored first (at the lowest address), while some other architectures are big-endian, meaning that the most significant byte is stored first.

For example, for a variable of type int with the value 2, and assuming an int is 32 bits wide, the memory on a big-endian system looks like this:

-----------------
| 0 | 0 | 0 | 2 |
-----------------

While on a little-endian system it looks like this:

-----------------
| 2 | 0 | 0 | 0 |
-----------------

Next, consider what happens when you point a char * at an int (or at an element of an int array). Normally, reading an object through a pointer to a different type is a strict aliasing violation, which invokes undefined behavior. However, the C standard makes an exception for character types so that you can access the bytes of an object's representation. So in this case it's allowed.

When you do this:

p=(char *)arr;

It causes p to point to the first byte of the first member of the array arr.

On big endian systems:

-----
| . | p
-----
  |
  v
-------------------------------------------------
| 0 | 0 | 0 | 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

On little endian:

-----
| . | p
-----
  |
  v
-------------------------------------------------
| 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

So when you read the value of *p you'll get 0 on big endian systems and 2 on little endian systems.

When you then perform p=p+1, you increase the address p points to by 1 character, i.e. 1 byte, so now it looks like this:

Big endian:

-----
| . | p
-----
  |----
      v
-------------------------------------------------
| 0 | 0 | 0 | 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

Little endian:

-----
| . | p
-----
  |----
      v
-------------------------------------------------
| 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

Now *p is 0 on both big-endian and little-endian systems. This assumes, however, that an int is 32 bits wide. If an int is 16 bits wide, it instead looks like this:

Big endian:

-----
| . | p
-----
  |----
      v
-------------------------
| 0 | 2 | 0 | 3 | 0 | 4 | arr
-------------------------
|arr[0] |arr[1] |arr[2] |
-------------------------

Little endian:

-----
| . | p
-----
  |----
      v
-------------------------
| 2 | 0 | 3 | 0 | 4 | 0 | arr
-------------------------
|arr[0] |arr[1] |arr[2] |
-------------------------

In this case *p is 2 on big endian systems and 0 on little endian systems after incrementing.

dbush
  • Technically, the answer also depends on `CHAR_BIT` - easily forgotten if you only work with small-char systems... For example, with `char` and `int` both 16 bits wide, `2 3` is another possible result. – Toby Speight Aug 10 '18 at 15:00
1

This

int arr[]={2,3,4};

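is laid out in memory as shown below on a little-endian system (assuming a 32-bit int). On a big-endian system the bytes within each element are stored in the opposite order, so the output may differ.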

    arr[2]      arr[1]  |-------------------- arr[0] -------------------|
 -----------------------------------------------------------------------
|     4      |    3     | 0000 0000 | 0000 0000 | 0000 0000 | 0000 0010 |
 -----------------------------------------------------------------------
    0x108       0x104       0x103       0x102       0x101       0x100
                                                                 arr
MSB                                                                  LSB

(assuming the base address of arr is 0x100)

Now when you do

char *p;
p=(char *)arr;

Here p is a char pointer and arr is cast to char *, so p accesses memory one byte at a time. Initially it points to the byte at address 0x100.

When the statement

printf("%d\n",*p);

executes, it prints the byte stored at address 0x100, which is 2.

And next when you do

p=p+1;

the pointer p advances by one byte, so it now points to address 0x101. When the statement printf("%d\n",*p); executes, it prints the byte stored at 0x101, which is 0.

Also, when using %p you should cast the pointer argument to void *, because %p expects an argument of type void *:

printf("%p\n",(void*)p);
Achal