12

I saw some usage of (void*) in printf().

If I want to print a variable's address, can I do it like this:

int a = 19;
printf("%d", &a);
  1. I think, &a is a's address which is just an integer, right?
  2. Many articles I read use something like this:

    printf("%p", (void*)&a);
    

  3. What does %p stand for? (A pointer?)
  4. Why use (void*)? Can't I use (int)&a instead?
Mateen Ulhaq
Alcott
  • 2
    Does `sizeof(void*) == sizeof(int)` on every platform? – Keith Layne Sep 03 '11 at 03:18
  • 5
    @keith: Absolutely not. The code in this question is a classic case of making unwarranted assumptions about pointer and int length. – Nicholas Knight Sep 03 '11 at 03:21
  • @keith.layne - I can't think of a system where `sizeof(void*)` would be different from `sizeof(int*)` (of course `sizeof(int)` can be, and often is, different). – Hogan Sep 03 '11 at 03:22
  • @keith, yes, I tried, they are equal. – Alcott Sep 03 '11 at 03:24
  • 1
    @Alcott: Which does not, by any stretch of the imagination, mean they will be equal anywhere else. As soon as you try to run your code on most 64-bit systems, it will break. – Nicholas Knight Sep 03 '11 at 03:26
  • Why are people upvoting @Nicholas' comments? They are equal on 64-bits too. – Blindy Sep 03 '11 at 03:32
  • @Blindy: No, they are not. Different 64-bit platforms won't even behave the same. See, for example, http://en.wikipedia.org/wiki/64-bit#64-bit_data_models – Nicholas Knight Sep 03 '11 at 03:38
  • @Nicholas, why will they be different on different platforms? I assume that a pointer is just an integer, right? – Alcott Sep 03 '11 at 03:39
  • @Alcott: You assume wrong. Read the page I linked to. *Thoroughly*. Pointers and integers have no inherent relationship. – Nicholas Knight Sep 03 '11 at 03:40
  • 2
    @Alcott: No, pointers are *not* integers. Pointers are pointers. Integers are integers. – Keith Thompson Sep 03 '11 at 03:56
  • 3
    Even if different pointer types are the same *size*, they're different types. `printf` requires the exact right argument types matching the format string. Other types *may* work, depending on implementation details, but it's undefined behavior. – R.. GitHub STOP HELPING ICE Sep 03 '11 at 04:46

4 Answers

18

Pointers are not numbers. They are often internally represented that way, but they are conceptually distinct.

void* is designed to be a generic pointer type. Any pointer value (other than a function pointer) may be converted to void* and back again without loss of information. This typically means that void* is at least as big as other pointer types.
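The round-trip guarantee can be sketched as follows. This is only an illustration; the helper name round_trip_through_void is hypothetical and not part of any library.

```c
#include <assert.h>

/* Hypothetical helper: converts an int* to void* and back again. */
static int *round_trip_through_void(int *p)
{
    void *generic = p;   /* object pointer to void*: implicit in C, no cast needed */
    int *back = generic; /* void* back to int*: also implicit in C */
    assert(back == p);   /* the round trip must compare equal to the original */
    return back;
}
```

Calling round_trip_through_void(&a) for some int a should hand back a pointer that still points at a. The implicit conversions work here because the compiler knows the target type of each assignment, which is exactly what it lacks for the variadic arguments of printf.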

printf's "%p" format requires an argument of type void*. That's why an int* should be cast to void* in that context. (There's no implicit conversion because it's a variadic function; there's no declared parameter, so the compiler doesn't know what to convert it to.)

Sloppy practices like printing pointers with "%d", or passing an int* to printf with a "%p" format, are things that you can probably get away with on most current systems, but they render your code non-portable. (Note that it's common on 64-bit systems for void* and int to be different sizes, so printing pointers with "%d" is really non-portable, not just theoretically.)

Incidentally, the output format for "%p" is implementation-defined. Hexadecimal is common (in upper or lower case, with or without a leading "0x" or "0X"), but it's not the only possibility. All you can count on is that, assuming a reasonable implementation, it will be a reasonable way to represent a pointer value in human-readable form (and that scanf will understand the output of printf).
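A small sketch of that printf/scanf relationship, using snprintf and sscanf on a buffer. The helper name reparse_pointer and the 64-byte buffer size are assumptions made for this example, not anything the standard prescribes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats p with "%p", then parses the text back with "%p". */
static void *reparse_pointer(void *p)
{
    char text[64];        /* assumed large enough; checked below */
    void *parsed = NULL;

    int n = snprintf(text, sizeof text, "%p", p);
    assert(n > 0 && (size_t)n < sizeof text);

    /* sscanf's "%p" expects a void** argument and accepts what "%p" printed. */
    int matched = sscanf(text, "%p", &parsed);
    assert(matched == 1);

    return parsed;
}
```

Within a single run of the program, reparse_pointer((void*)&a) should compare equal to (void*)&a, whatever the text in between happens to look like.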

The article you read is entirely correct. The correct way to print an int* value is

printf("%p", (void*)&a);

Don't take the lazy way out; it's not at all difficult to get it right.

Suggested reading: Section 4 of the comp.lang.c FAQ. (Further suggested reading: All the other sections.)

EDIT:

In response to Alcott's question:

There is still one thing I don't quite understand. int a = 10; int *p = &a;, so p's value is a's address in mem, right? If right, then p's value will range from 0 to 2^32-1 (if cpu is 32-bit), and an integer is 4-byte on 32-bit OS, right? then What's the difference between the p's value and an integer? Can p's value go out of the range?

The difference is that they're of different types.

Assume a system on which int, int*, void*, and float are all 32 bits (this is typical for current 32-bit systems). Does the fact that float is 32 bits imply that its range is 0 to 2^32-1? Or -2^31 to 2^31-1? Certainly not; the range of float (assuming IEEE representation) is approximately -3.40282e+38 to +3.40282e+38, with widely varying resolution across the range, plus exotic values like negative zero, subnormalized numbers, denormalized numbers, infinities, and NaNs (Not-a-Number). int and float are both 32 bits, and you can take the 32 bits of a float object and treat it as an int representation, but the result won't have any straightforward relationship to the value of the float. The second low-order bit of an int, for example, has a specific meaning; it contributes 0 to the value if it's 0, and 2 to the value if it's 1; the corresponding bit of a float has a meaning, but it's quite different (it contributes a value that depends on the value of the exponent).
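To make the float comparison concrete, here is a minimal sketch that copies the bytes of a float into a 32-bit unsigned integer. The helper name float_bits is hypothetical, and the code assumes float is 32 bits wide, which the language does not guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterprets the bytes of a float as a uint32_t. */
static uint32_t float_bits(float f)
{
    uint32_t bits;

    /* Assumption: float occupies exactly 32 bits on this implementation. */
    assert(sizeof f == sizeof bits);

    /* memcpy copies the object representation without converting the value. */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an implementation that uses IEEE 754 single precision, float_bits(1.0f) yields 0x3F800000, which as an integer has nothing obvious to do with the value 1. Same 32 bits, different type, different meaning.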

The situation with pointers is quite similar. A pointer value has a meaning: it's the address of some object (or any of several other things, but we'll set that aside for now). On most current systems, interpreting the bits of a pointer object as if it were an integer gives you something that makes sense on the machine level. But the language itself does not guarantee, or even hint, that that's the case.

Pointers are not numbers.

A concrete example: some years ago, I ran across some code that tried to compute the difference in bytes between two addresses by casting to integers. It was something like this:

unsigned char *p0;
unsigned char *p1;
long difference = (unsigned long)p1 - (unsigned long)p0;

If you assume that pointers are just numbers, representing addresses in a linear monolithic address space, then this code makes sense. But that assumption is not supported by the language. And in fact, there was a system on which that code was intended to run (the Cray T90) on which it simply would not have worked. The T90 had 64-bit pointers pointing to 64-bit words. Byte pointers were synthesized in software by storing an offset in the 3 high-order bits of a pointer object. Subtracting two pointers in the above manner, if they both had 0 offsets, would give you the number of words, not bytes, between the addresses. And if they had non-0 offsets, it would give you meaningless garbage. (Conversion from a pointer to an integer would just copy the bits; it could have done the work to give you a meaningful byte index, but it didn't.)

The solution was simple: drop the casts and use pointer arithmetic:

long difference = p1 - p0;
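Strictly speaking, the result of subtracting two pointers has type ptrdiff_t (from stddef.h), and the subtraction is only defined when both pointers point into the same array object or one past its end. A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: distance in bytes from p0 to p1.
   Both pointers must point into (or one past the end of) the same array. */
static ptrdiff_t byte_distance(const unsigned char *p0, const unsigned char *p1)
{
    assert(p0 != NULL && p1 != NULL);
    return p1 - p0; /* pointer arithmetic; no integer casts involved */
}
```

Because the operands are pointers to unsigned char, the result counts bytes. A ptrdiff_t value can be printed with the "%td" format.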

Other addressing schemes are possible. For example, an address might consist of a descriptor that (perhaps indirectly) references a block of memory, plus an offset within that block.

You can assume that addresses are just numbers, that the address space is linear and monolithic, that all pointers are the same size and have the same representation, that a pointer can be safely converted to int, or to long, and back again without loss of information. And the code you write based on those assumptions will probably work on most current systems. But it's entirely possible that some future systems will again use a different memory model, and your code will break.

If you avoid making any assumptions beyond what the language actually guarantees, your code will be far more future-proof. And even leaving portability issues aside, it will probably be cleaner.

Keith Thompson
  • There is still one thing I don't quite understand. "int a = 10; int *p = &a;", so p's value is a's address in mem, right? If right, then p's value will range from 0 to 2^32-1 (if cpu is 32-bit), and an integer is 4-byte on 32-bit OS, right? then What's the difference between the p's value and an integer? Can p's value go out of the range? – Alcott Sep 04 '11 at 12:46
  • 7
    People should learn to program on 16 bit DOS, where `sizeof(int)` is 2, the value of a pointer either a 16bit int or 2 16bit ints, where 4096 different pointers can point to the exact same address and where function pointer and data pointer can be of different sizes. That would teach 'em. ;-) – Patrick Schlüter Sep 04 '11 at 19:30
  • @PatrickSchlüter good lord that was a trip down nightmare alley. *Never* do I want to return to that (but it certainly reminds me daily just how good we have it now). – WhozCraig Aug 06 '16 at 04:17
5

So much insanity present here...

%p is generally the correct format specifier to use if you just want to print out a representation of the pointer. Never, ever use %d.

The length of an int and the length of a pointer (void* or otherwise) have no relationship. Most data models on i386 just happen to have 32-bit ints AND 32-bit pointers -- other platforms, including x86-64, are not the same! (This is also historically known as "all the world's a VAX syndrome".) http://en.wikipedia.org/wiki/64-bit#64-bit_data_models

If for some reason you want to hold a memory address in an integral variable, use the right types! intptr_t and uintptr_t. They're in stdint.h. See http://en.wikipedia.org/wiki/Stdint.h#Integers_wide_enough_to_hold_pointers
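A rough sketch of what that looks like in practice. The helper names are made up for this example; note also that uintptr_t is an optional type in the C standard, although common platforms provide it. PRIxPTR comes from inttypes.h.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: stores an address in an integer wide enough to hold it. */
static uintptr_t address_as_integer(const void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: writes the address as hex text using the matching macro. */
static int format_address(char *out, size_t cap, const void *p)
{
    assert(out != NULL && cap > 0);
    /* PRIxPTR expands to the right conversion specifier for uintptr_t. */
    return snprintf(out, cap, "0x%" PRIxPTR, (uintptr_t)p);
}
```

Converting the integer back with (void*) gives a pointer that compares equal to the original. What the integer value itself looks like is still up to the implementation.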

Nicholas Knight
1

Although the vast majority of C implementations store pointers to all kinds of objects using the same representation, the C Standard does not require that all implementations do so, nor does it even provide any means by which a program which would exploit commonality of representations could test whether an implementation follows the common practice and refuse to run if an implementation doesn't.

If on some particular platform, an int* held a word address, while both char* and void* combine a word address with a word that identifies a byte within a word, passing an int* to a function that is expecting to retrieve a variadic argument of type char* or void* would result in that function trying to fetch more data from the stack (a word address plus the supplemental word) than had been pushed (just the word address). This could cause the system to malfunction in unpredictable ways.
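The retrieval side can be sketched with a small variadic function. The name last_pointer is hypothetical and only serves to show how the callee, not the caller, decides what type each argument is read as.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: returns the last of `count` void* arguments,
   or NULL when count is 0. */
static void *last_pointer(int count, ...)
{
    va_list ap;
    void *last = NULL;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* The callee names the type here; nothing checks what was actually passed. */
        last = va_arg(ap, void *);
    }
    va_end(ap);
    return last;
}
```

A portable call casts each argument, as in last_pointer(2, (void*)&a, (void*)&b). Passing &a uncast would hand the function an int* where it reads a void*, which is the mismatch described above.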

Many compilers for commonplace platforms that use the same representation for all pointers will process an action which passes a non-void pointer precisely the same way as they would process an action which casts the pointer to void* before passing it. They thus have no reason to care about whether the pointer type that is passed as a variadic argument will precisely match the pointer type expected by the recipient. Although the Standard could have specified that such implementations which would have no reason to care about pointer types should behave as though the pointers were cast to void*, the authors of the C89 Standard avoided describing anything which wouldn't be common to all conforming compilers. The Standard's terminology for a construct that 99% of implementations should process identically, but 1% might process unpredictably, is "Undefined Behavior". Implementations may, and often should, extend the semantics of the language by specifying how they will treat such constructs, but that's a Quality of Implementation issue outside the Standard's jurisdiction.

supercat
0

In C void * is an un-typed pointer. void does not mean void... it means anything. Thus casting to void * would be the same as casting to "pointer" in another language.

Using (int *)&a should work too... but the stylistic point of saying (void *) is to say -- I don't care about the type -- just that it is a pointer.

Note: It is possible for an implementation of C to cause this construct to fail and still meet the requirements of the standards. I don't know of any such implementations, but it is possible.

Hogan
  • 1
    It doesn't work if `void*` and `int*` have different representations. – Keith Thompson Jan 22 '15 at 18:21
  • You mean if `void *` is a different size than `int *`? What doesn't work? – Hogan Jan 22 '15 at 18:39
  • I mean what I said: if `void*` and `int*` have different *representations*, even if they have the same size. On most modern systems, all pointers have the same size and representation -- but the C standard doesn't guarantee that. The conversion from `int*` to `void*` might be non-trivial. `printf("%p", ...)` expects an argument of type `void*`; if you pass it an `int*` instead, it will not be converted, and `printf` will treat your `int*` object *as if* it were a `void*` object. [N1570](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) 7.21.6.1 paragraph 9: the behavior is undefined – Keith Thompson Jan 22 '15 at 18:51
  • I agree, I wasn't intending to say `void *` is portable or a good idea, just trying to explain (poorly) the concept. – Hogan Jan 22 '15 at 19:02
  • Your answer implies that `int a; printf("%p", &a);` is safe. It isn't. – Keith Thompson Jan 22 '15 at 19:06
  • Better, but I disagree with the way you've chosen to express the point. There's no good reason *not* to use the cast. With the cast (which costs essentially nothing), it's portable; without it, it's *not quite* portable. See my answer. – Keith Thompson Jan 22 '15 at 19:16