51

I am learning the C programming language and have just started learning arrays with pointers. I have a problem with this question: I expected the output to be 5, but it is 2. Can anyone please explain why?

#include <stdio.h>

int main(){
   int arr[] = {1, 2, 3, 4, 5};
   char *ptr = (char *) arr;
   printf("%d", *(ptr+4));
   return 0;
}
Mayank Tiwari
  • 12
    If anyone votes down this question then please mention your comment, it is very tough for me, hopefully not for others.... :) – Mayank Tiwari Jul 02 '13 at 11:07
  • 12
    You have an "int" array, but a "char" pointer. – Lucas Jul 02 '13 at 11:10
  • 7
    +1 on your announcement :) – Dayal rai Jul 02 '13 at 11:10
  • 11
    By the way, you would have expected your output to be 5, not 4: `*ptr` points to the first element of the array, which is 1, four elements after it you have 5. 5 would have been the output if you had defined `int *ptr = (int *) arr;` – Antonio Jul 02 '13 at 12:34
  • 1
    I would suggest changing the title of this question to something more useful. This question may get asked again in the future. I'd suggest "My char pointer points to invalid value after being cast from int*" or something like that. I'd edit the question, but there are too many upvotes and I don't want to mess with such a popular question. – Dariusz Jul 03 '13 at 07:58
  • 1
    @Dariusz What about "Char pointer returns unexpected value after being cast from int*"? – Andreas Fester Jul 03 '13 at 08:24
  • @Dariusz, I totally agree with you, but generally when people run into a problem they just ask about it, rather than thinking about how to ask it effectively...... :) – Mayank Tiwari Jul 03 '13 at 08:27
  • @user2320537 well I know that, and mind that I didn't rage at you or anything. But since this question has received such recognition, I think you ought to rethink the title to help other people who have the same problem. – Dariusz Jul 03 '13 at 08:40
  • @Dariusz, sure sir, i will remember this from next time, and thanks for this invaluable suggestion... :) – Mayank Tiwari Jul 03 '13 at 08:46
  • @user2320537 ... you still haven't changed the topic! You can edit the post! – Dariusz Jul 03 '13 at 08:56

7 Answers

81

Assuming a little-endian architecture where an int is 32 bits (4 bytes), the individual bytes of int arr[] look like this (least significant byte at the lower address; all values in hex):

|01 00 00 00|02 00 00 00|03 00 00 00|04 00 00 00|05 00 00 00

char *ptr = (char *) arr;

Now ptr points to the first byte. Since you have cast to char*, the memory is treated as a char array from here on:

|1|0|0|0|2|0|0|0|3|0|0|0|4|0|0|0|5|0|0|0
 ^
 +-- ptr

Then, *(ptr+4) accesses the fifth element of the char array and returns the corresponding char value:

|1|0|0|0|2|0|0|0|3|0|0|0|4|0|0|0|5|0|0|0
         ^
         +-- *(ptr + 4) = 2

Hence, printf() prints 2.

On a Big Endian system, the order of the bytes within each int is reversed, resulting in

|0|0|0|1|0|0|0|2|0|0|0|3|0|0|0|4|0|0|0|5
         ^
         +-- *(ptr + 4) = 0
Andreas Fester
  • 3
    You mixed up your endianness: little endian has the lowest byte in the lowest memory location. – Some programmer dude Jul 02 '13 at 11:34
  • 3
    Your description of how `printf()` handles this case is incorrect. It's getting the value of the 5th character as an integer, not a pointer to same. Also, as Joachim says, your endianness is flipped. – Hasturkun Jul 02 '13 at 11:44
13

It's because the size of char is one, and the size of int is four on your platform. This means that adding 4 to ptr makes the result point to the first byte of the second entry in the int array.

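If you compiled this on a big-endian system, the byte at ptr+4 would be the most significant byte of arr[1], so you would print 0 instead. (33554432, i.e. 0x02000000, is what the little-endian byte sequence 02 00 00 00 would mean if it were read as a whole big-endian int.)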

Some programmer dude
  • 1
    I thought that endian only applied to floating point. Are you suggesting that integers behave in the same way? I'm not so sure. – Bathsheba Jul 02 '13 at 11:15
  • 5
    Wherever there is a type that uses more than 1 byte there is a question of what order to store the bytes in. – Joe Jul 02 '13 at 11:16
  • 9
    @Bathsheba Endianess is *definitely* an integer problem. – Some programmer dude Jul 02 '13 at 11:17
  • 1
    @Joachim Pileborg; +1 then and my lunchtime reading plans changed: see http://en.wikipedia.org/wiki/Endianness – Bathsheba Jul 02 '13 at 11:19
  • @JoachimPileborg On a big endian system, would this not also result in 2 since printf() interprets the bytes also in the other direction? – Andreas Fester Jul 02 '13 at 11:24
  • @Andreas It does interpret the _bytes_ in the reverse direction, not the _bits_ in each byte. `0x0004` becomes `0x4000`, not `0x2000`. – Matthieu Rouget Jul 02 '13 at 11:26
  • 1
    @MatthieuRouget Right, what I meant is 2 in LE is `00 00 00 02`, and `02 00 00 00` in BE - but still, `02 00 00 00` would be passed to printf(), which should again interpret it as 2 on a BE machine ... – Andreas Fester Jul 02 '13 at 11:29
  • @Andreas: Yes OK, you are right. If the array is big endian in the first place, it should also be 2 with %d. I misunderstood your comment. – Matthieu Rouget Jul 02 '13 at 11:32
  • @Andreas No, an `int` with value 2 is `02 00 00 00` in little endian. The lowest byte comes at the lowest address. – Some programmer dude Jul 02 '13 at 11:32
  • 2
    @JoachimPileborg +1 - I remember that I **always** mess them up :D – Andreas Fester Jul 02 '13 at 12:02
  • @Andreas I do that as well, especially since I started my career on a big-endian platform. Little-endian just feels... wrong! :) – Some programmer dude Jul 02 '13 at 12:04
  • 3
    ... and, I think that you were right in the first place: On big endian, the result should be zero, since you are accessing the fifth byte as `char` and this byte is zero... sorry for that ... – Andreas Fester Jul 02 '13 at 12:18
  • @JoachimPileborg: So will the output be 0 or 33554432 on big endian machine? –  Jun 13 '14 at 12:56
3
int main(){
 int arr[] = {1,2,3,4,5};
 char *ptr = (char *) arr;
 printf("%d",*(ptr+4));
 return 0;
}

Each element of arr has size sizeof(int) (which may be 4 on your implementation).

Since ptr is a pointer to char, pointer arithmetic makes ptr + 4 point 4 bytes after &arr[0], which may be &arr[1].

In memory, it looks something like:

Address | 0 1 2 3 | 4 5 6 7 | ...
Value   |  arr[0] |  arr[1] | ...
md5
2

On a typical 32-bit platform, int is four times the size of char. When you add 4 to ptr, you advance it by 4 times the size of what ptr points to (a char, so 4 bytes in total). That happens to be the address of the second element in the int array.

On a platform where int is eight times the size of char (some 64-bit platforms, although most keep a 4-byte int), your output would be very different.

To cut a long story short, your code is not portable, (also see Joachim Pileborg's answer re endianness) but amusing to unpick.

Bathsheba
  • 1
    `int` storage size is implementation dependent. On most 64 bits platforms, `sizeof(int)` is 4. See http://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows for more infos. – Matthieu Rouget Jul 02 '13 at 11:53
2

What you are doing is definitely not recommended in production code, but it is great for understanding pointers, casts, etc. while learning, so your example is a good one. So, why do you get 2? Your array is an array of ints, whose size depends on your architecture (in your case, sizeof(int) is 4). You define ptr as a char pointer, and char has a size of 1 byte. Pointer arithmetic (that's what you do when you write ptr+4) works in units of the size of the object the pointer references, in your case chars. Thus ptr+4 is 4 bytes away from the beginning of your array, and therefore at the start of the 2nd element of your int array. That is it. Try ptr+5; you should get 0.

ondrejdee
1

Since you are converting int* to char*, ptr[0] = 1, ptr[4] = 2, ptr[8] = 3, ptr[12] = 4, ptr[16] = 5 and all others are equal to 0. ptr+4 points to the element at index 4 (the fifth element) of the ptr array. So the result is 2.

dijkstra
  • that depends on the endianness of your system. You didn't explain why it works like that and why it's correct on your machine. – Joe Jul 02 '13 at 11:22
1
int main(){
 int arr[] = {1,2,3,4,5};
 char *ptr = (char *) arr;
 printf("%d",*(ptr+4));
 return 0;
}

Imagine arr is stored at the address 100 (a totally made-up address). So you have: arr[0] is stored at the address 100. arr[1] is stored at the address 104 (there is a +4 because of the type int, assuming a 4-byte int). arr[2] is stored at the address 108. arr[3] is stored at the address 112. And so on.

Now you're doing char *ptr = (char *) arr;, so ptr = 100 (the same as arr). The next statement is interesting, especially the second argument of printf: *(ptr+4). Keep in mind that ptr = 100. So ptr + 4 = 104, the same address as arr[1]! So it will read the first byte of arr[1], which holds 2 on a little-endian machine.

nouney