
I wrote this small program to test my understanding, but I don't understand some of the facts behind it. I am working on a 64-bit little-endian machine, so any pointer is 8 bytes. Given this code:

#include<stdio.h>
int main(){
    char *c = (char *)0x12345678889;
    long a = 1;
    int b = (long)(c-a);
    /* int cc = (int)(c-a); gives compiler error */
    printf("val = %x and b = %x", c-a,b);
    return 0;
}
Output
val = 45678888 and b = 45678888

Say the starting address is 100. Then the char* would be stored in memory as 100->89, 101->88 ... 105->12, and bytes 106 and 107 would be unused. Is this assumption of mine correct in the first place? Since int and long are 4 bytes in a 64 bit machine, the conversion will start from 100, 101, 102 and 103 and will consider only these bytes. So 45678889 - 1 = 45678888. Is my understanding correct? Finally, I don't understand why the commented line gives a compiler error. The compiler implicitly converted the value in the line above it. Why not in the commented line?

dandan78
CHID
  • "int and long are 4 bytes in a 64 bit machine" -- not necessarily. On mine, `sizeof(long) == 8`. – Fred Foo Sep 11 '13 at 16:23
  • A `char*` is an address, which on most machines is `4 bytes` so saying `char *c = (char *) 0x12345678889;` doesn't make sense. I believe the compiler will ignore the trailing `889`. – Scotty Bauer Sep 11 '13 at 16:24
  • Also, if you have a decent compiler, then you'll get an explanation for the error. My GCC says "cast from pointer to integer of different size", which is pretty descriptive. – Fred Foo Sep 11 '13 at 16:25
  • @ScottyBauer: on a 64-bit machine a `char*` is 8 bytes. Otherwise, there wouldn't be much 64-bit about it. – Fred Foo Sep 11 '13 at 16:25
  • Note that you can build and run an app as 32-bit on a 64-bit machine. Just because the hardware is 64-bit it does not mean that the pointer in an application is 64-bit. – Mark Wilkins Sep 11 '13 at 16:27
  • @larsmans Thank you. I guess that was the wrong assumption I made. Since long is 8 bytes on my machine, the compiler didn't throw an error, is that it? So in general, can we assume that casting from a larger data type to a smaller data type is not allowed in C? – CHID Sep 11 '13 at 16:28
  • @CHID: casting from a pointer to a smaller datatype is unsafe and senseless. When storing pointers in integers, always use `(u)intptr_t` from `<stdint.h>`. – Fred Foo Sep 11 '13 at 16:29
  • @CHID, not quite. The compiler will stop you from casting a pointer to a smaller integral type (`c-a` is a pointer, due to pointer arithmetic). But you could cast a `long` to an `int` (and get nothing but a warning). It's not safe either way. – StoryTeller - Unslander Monica Sep 11 '13 at 16:31
  • @larsmans @StoryTeller - Thank you guys. Just one more question. Since `int b = (long)(c-a); ` worked, is it the case that the compiler converted the long (8 bytes) to an int (4 bytes) by truncating the higher-order 4 bytes? I now understand that it is unsafe, but I am asking just for learning purposes. – CHID Sep 11 '13 at 16:34
  • One more thing to keep in mind is that `long` is 8 bytes on Unix, but 4 bytes on Windows. (http://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows) – firefrorefiddle Sep 11 '13 at 16:42
  • @AndreyT: The standard allows subtracting integers from pointers. – interjay Sep 11 '13 at 16:54
  • @AndreyT: C11 6.5.6.3 says binary subtraction is fine if "the left hand operand is a pointer to a complete object type and the right operand has integer type", obviously necessary for pointer arithmetic. – Crowman Sep 11 '13 at 16:54
  • @Paul Griffiths: Sorry, epic brainglitch on my part. Of course, it is perfectly legal. I don't know what I was thinking. – AnT stands with Russia Sep 11 '13 at 16:59

1 Answer

First, you cannot know a priori how values are stored; it depends on the machine's endianness. Bytes can be stored from least significant to most significant, or the reverse. Your machine stores them from low to high (little endian).

Second, on many 64-bit machines an int is 4 bytes and a long is 8 bytes. Your program computes c-a, which is 0x12345678888 of type char *, converts it to a long with value 0x12345678888, and then silently truncates it when assigning it to the int (in C, a long converts to an int without a cast). On typical implementations only the lowest 4 bytes survive.

The commented-out line is diagnosed because c-a is of type char *, and a pointer cannot be silently converted to an int (this is considered much more dangerous than long to int). You did convert it explicitly with a cast, but the compiler still tells you that this is dangerous too. Note that these are usually warnings, not errors (it may depend on the compiler and its options).

Jean-Baptiste Yunès
  • Why should a long be silently converted to an int? A long is 8 bytes, right? Why doesn't the compiler store all 8 bytes in the long variable? – CHID Sep 11 '13 at 16:41
  • That's the way C is defined. longs can be assigned to ints. In that case the assignment truncates the value to the lowest bytes (here the lowest 4 bytes 0x45678888). The compiler stores the value in the long variable (a temporary one) but you assigned it to an int. – Jean-Baptiste Yunès Sep 11 '13 at 16:53
  • @AndreyT No, `c-a` is correct! It is pointer arithmetic, which is well defined in C. It computes a new pointer value relative to a given one: it moves `a` steps back from pointer `c`. As `c` is a `char *`, the length of a step is the length of a `char`. – Jean-Baptiste Yunès Sep 11 '13 at 16:56
  • @Jean-Baptiste Yunès: Sorry, epic brainglitch on my part. Of course, it is perfectly legal. I don't know what I was thinking. – AnT stands with Russia Sep 11 '13 at 16:59
  • @Jean-BaptisteYunès Thanks man. Now I am pretty clear about assigning to int and printing it. But the direct print `printf("%x",c-a)` should have produced the full 8 bytes, right? Why did it truncate the top 4 bytes? I did not cast it anywhere. Even when I print it like this, `printf("%x",(char *)(c-a));`, it prints only 4 bytes. Why is that? Again, it's all for my learning purposes, regardless of whether it is senseless or not – CHID Sep 11 '13 at 17:03
  • @CHID `%x` prints any value as an `int`. OK, `c-a` is a long, but you silently and implicitly convert it to an `int`. If you want to see it as a long you must at least use `%lx` (a warning may occur), but you should know that a pointer must be printed with `%p` (no warning, correct value). – Jean-Baptiste Yunès Sep 11 '13 at 17:17
  • @Jean-Baptiste Yunès: I'm not sure why you say `c-a` is `long`. `c-a` is a `char *`, which has nothing to do with `long`. Also, specifying `%x` format for a pointer argument will not "silently convert it to `int`". Instead it will result in undefined behavior, which in practice usually performs a *reinterpretation* of the corresponding parameter, not a *conversion*. On top of that, specifying format that implies argument of different size can easily derail/misalign the variadic argument scan performed by `printf`, causing further output to become completely garbled. – AnT stands with Russia Sep 11 '13 at 21:33
  • @AndreyT right, `c-a` is not long, it's `char *`, but on 64-bit architectures it's almost the same (LP64 model). I meant that both are 64-bit values; longs are signed and pointers unsigned. I also simplified my answer for the variadic part (this is technically difficult), but you are right again: it is a reinterpretation of memory content (something like dereferencing a pointer after a pointer type conversion, or union usage). – Jean-Baptiste Yunès Sep 14 '13 at 09:17