
Question: If two pointers compare equal, are their integer-converted values also equal?

For example:

void *ptr1 = //...
void *ptr2 = //...
printf("%d", ptr1 == ptr2); //prints 1

Does it mean that (intptr_t) ptr1 == (intptr_t) ptr2 is also 1?

From a pragmatic point of view, that should be right. But consider what the Standard specifies at 7.20.1.4(p1):

The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:

    intptr_t

This does not rule out an implementation converting the same pointer to different integer values (depending on some weird circumstances), as long as converting those values back yields pointers that compare equal to the original.

So, I think the answer is no: the integer-converted values of pointers that compare equal are not necessarily equal to each other.
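The kind of experiment described here can be sketched as follows. This is only an illustration, not a proof of anything: the helper names `pointers_equal`, `converted_equal`, and `report_experiment` are hypothetical, and whatever `converted_equal` returns only describes the implementation it runs on, since the Standard leaves the conversion implementation-defined.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: 1 if the two pointers compare equal, else 0. */
int pointers_equal(void *a, void *b) {
    return a == b;
}

/* Hypothetical helper: 1 if the integer conversions compare equal, else 0.
   The result is implementation-defined; the Standard does not promise 1
   even when pointers_equal(a, b) is 1. */
int converted_equal(void *a, void *b) {
    return (intptr_t)a == (intptr_t)b;
}

/* Reaches the same array element in two different ways and prints
   both comparisons, so the two results can be checked side by side. */
void report_experiment(void) {
    int arr[4] = {0};
    void *ptr1 = &arr[2];
    void *ptr2 = (char *)arr + 2 * sizeof arr[0];

    printf("pointers equal:  %d\n", pointers_equal(ptr1, ptr2));
    printf("integers equal:  %d\n", converted_equal(ptr1, ptr2));
}
```

Calling `report_experiment()` from `main` shows what one particular compiler and target do; it says nothing about what a conforming implementation is required to do.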

Some Name
  • I can't imagine how casting a value in the same way twice could ever give a different result. But I can't say I have proof of that anywhere. – yhyrcanus May 20 '19 at 21:21
  • It seems like this would be an obvious no-op that the compiler would disregard, but just speculation – Rorschach May 20 '19 at 21:22
  • @yhyrcanus I ran a few experiments and the integers indeed had the same value, but I wanted to make sure if it is conforming to count on that. – Some Name May 20 '19 at 21:26
  • In a segmented memory architecture like an Intel 80286, it's possible for two pointers to the same object to have different values. – Lee Daniel Crocker May 20 '19 at 22:05

3 Answers


Your analysis is correct. Other than allowing conversions to and from integers at §6.3.2.3, the standard doesn't say how that conversion should behave. Granted, there is a "round trip" requirement on intptr_t, but it doesn't prevent several distinct integer values from round-tripping to the same pointer, with the compiler choosing one or another based on some constraint or requirement.

So indeed, the C standard doesn't require (intptr_t) ptr1 == (intptr_t) ptr2 to hold.
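If code needs to know whether two saved `intptr_t` values refer to the same pointer, one approach that leans only on the round-trip requirement is to convert both back to `void *` and compare the pointers rather than the integers. A minimal sketch, with the hypothetical helper names `save_pointer` and `same_pointer`:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts a pointer to intptr_t and checks the
   round-trip property from 7.20.1.4 before returning the integer. */
intptr_t save_pointer(void *p) {
    intptr_t n = (intptr_t)p;
    assert((void *)n == p); /* guaranteed for valid pointers to void */
    return n;
}

/* Hypothetical helper: compares two saved values as pointers.
   Comparing a == b directly would rely on the conversion producing a
   unique integer per pointer, which the Standard does not promise. */
int same_pointer(intptr_t a, intptr_t b) {
    return (void *)a == (void *)b;
}
```

The design choice is simply to do the comparison in the domain where the Standard defines it (pointer equality, 6.5.9) instead of in the integer domain.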

StoryTeller - Unslander Monica
  • I suppose we cannot even infer that if `(intptr_t) ptr == 0` then `ptr == NULL`, in spite of the fact that a null pointer constant is defined as an integer constant expression with value `0`. – Some Name May 20 '19 at 21:32
  • Not really related to this question, but I think it is necessary to explicitly convert any pointer type other than `void *` to `void *` before converting to `intptr_t`, isn't it? – Some Name May 20 '19 at 22:53
  • It depends on what you mean by "necessary", @SomeName. It is to `void *` specifically that the round-tripping guarantee applies. Conversions of other pointer types to `intptr_t` have implementation-defined behavior, or possibly even undefined behavior, but they are *allowed*, so converting to `void *` first is not required in that sense. In practice, however, as a quality of implementation matter, you can usually rely on round-tripping any object pointer type through `intptr_t`. – John Bollinger May 20 '19 at 23:33
  • @JohnBollinger I meant that if an implementation supports `intptr_t`, then round-tripping pointers to object types is possible as `obj_t * --> void * --> intptr_t --> void * --> obj_t *` without any additional implementation-defined assumptions. – Some Name May 21 '19 at 00:58
  • @SomeName [It is indeed necessary](https://stackoverflow.com/q/45352226) to convert through `void *`. – Alex Shpilkin May 24 '21 at 23:51

In almost all implementations, two pointers are equal if and only if their representations are equal, but the standard doesn't guarantee that.

The fact that ptr1 == ptr2 doesn't imply that ptr1 and ptr2 have the same representation. N1570 6.5.9 paragraph 6:

Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

For example, suppose a pointer is represented as a two-part entity, with the first part identifying a segment of memory, and the second part a byte offset within that segment. If two segments can overlap, then there can be two different pointer representations for the same memory address. The two pointers would compare as equal (and the generated code would likely have to do some extra work to make that happen), but if conversion to intptr_t just copies the representation then (intptr_t)ptr1 != (intptr_t)ptr2.

(It's also possible that the pointer-to-integer conversion could normalize the representation.)
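The two-part pointer scenario can be modeled in ordinary C. The sketch below is a toy simulation, not how any particular compiler works: the `far_ptr` type, the `segment * 16 + offset` address formula with 20-bit wraparound, and all function names are assumptions chosen only to make overlapping segments concrete.

```c
#include <stdint.h>

/* Toy two-part pointer: a segment and a byte offset within it.
   Assumption: address = seg * 16 + off, wrapped to 20 bits, so
   neighboring segments overlap. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

/* The memory location the toy pointer actually designates. */
uint32_t linear_address(far_ptr p) {
    return (((uint32_t)p.seg << 4) + p.off) & 0xFFFFFu;
}

/* Models ==: must do the extra work of comparing locations. */
int far_equal(far_ptr a, far_ptr b) {
    return linear_address(a) == linear_address(b);
}

/* Models a conversion that just copies the representation. */
uint32_t to_int_raw(far_ptr p) {
    return ((uint32_t)p.seg << 16) | p.off;
}

/* Models a conversion that normalizes first (offset kept below 16). */
uint32_t to_int_normalized(far_ptr p) {
    uint32_t lin = linear_address(p);
    return ((lin >> 4) << 16) | (lin & 0xFu);
}
```

With `a = {0x1000, 0x0010}` and `b = {0x1001, 0x0000}`, both designate location 0x10010, so `far_equal(a, b)` is true, yet `to_int_raw` gives 0x10000010 for one and 0x10010000 for the other. `to_int_normalized` gives 0x10010000 for both, which corresponds to the normalizing conversion mentioned above.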

This possibility is why == and != are well defined for pointers to different objects, while applying the relational operators (<, <=, >, >=) to such pointers has undefined behavior. The equality operators have to determine whether the two pointers point to the same location, but the relational operators are allowed to compare only the offsets and ignore the base portion (assuming that each object is in a single segment). In practice, almost all modern systems have a monolithic address space, and the equality and relational operators work consistently even though the standard doesn't require them to do so.

Keith Thompson
  • *"or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space"* -- whoa, this is legal? That's very surprising. – S.S. Anne Aug 24 '19 at 16:36
  • @JL2210: It's legal because it would be impractical to disallow it. Given `int x; int y; int *p0 = &x + 1; int *p1 = &y; if (p0 == p1) ...`, if `y` happens to immediately follow `x` in memory, then `p0` and `p1` are going to point to the same memory location. Making `p0 == p1` yield a false result would require unreasonable extra work. – Keith Thompson Aug 24 '19 at 19:29

An implementation in which the size of a pointer is between that of two integer types (e.g. segmented-mode 80386, where pointers were 48 bits) might process something like:

uintptr_t my_uintptr = (uintptr_t)myptr;

by storing myptr into the first 48 bits of my_uintptr and leaving the remaining bits holding arbitrary values, provided that a later conversion myptr = (void*)my_uintptr; ignores the value of those bits.

Since there's no guarantee that repeated conversions of the same pointer to uintptr_t will yield the same value, there's likewise no guarantee in the case where the pointers being converted compare equal despite having been produced by different means.
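A small simulation makes the argument concrete. Everything here is a hypothetical model rather than the behavior of a real 80386 compiler: `ptr48` stands in for a 48-bit pointer, the pointer is assumed to occupy the low 48 bits of the integer, and the `junk` parameter stands in for whatever arbitrary bits the conversion happens to leave in the remaining 16.

```c
#include <stdint.h>

#define PTR_BITS 48
#define PTR_MASK ((UINT64_C(1) << PTR_BITS) - 1)

/* Model of a 48-bit pointer; only the low 48 bits are meaningful. */
typedef struct {
    uint64_t bits;
} ptr48;

/* Models pointer equality on the 48 meaningful bits. */
int ptr48_equal(ptr48 a, ptr48 b) {
    return (a.bits & PTR_MASK) == (b.bits & PTR_MASK);
}

/* Models pointer-to-integer conversion that leaves the upper 16 bits
   holding arbitrary values (supplied here as `junk`). */
uint64_t to_uintptr(ptr48 p, uint16_t junk) {
    return (p.bits & PTR_MASK) | ((uint64_t)junk << PTR_BITS);
}

/* Models integer-to-pointer conversion that ignores the upper bits,
   which is what keeps the round-trip guarantee intact. */
ptr48 from_uintptr(uint64_t n) {
    ptr48 p = { n & PTR_MASK };
    return p;
}
```

Converting the same `ptr48` twice with different `junk` values yields two different integers, yet `from_uintptr` maps both back to a pointer that compares equal to the original, so the round-trip requirement is met while integer equality is not.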

If, however, an implementation documents the storage formats for pointers and integers, and documents how conversions are performed, and if there is no way that an implementation could behave in a fashion consistent with that documentation without upholding stronger semantic guarantees, then the implementation should be expected to uphold such guarantees. I don't think the Standard requires that implementations behave in a fashion consistent with their documentation as a condition of conformance, but the notion that quality implementations should be expected to behave as documented should be sufficiently self-evident that the Standard shouldn't need to require it.

supercat
  • This is a *great* non-tortured example of where such an assumption might fall down. – Ben Zotto May 20 '19 at 21:42
  • @BenZotto: Hardware platforms where there would be any reason for the assumption to fail are rare. Implementations which describe how things are stored, but then behave in a fashion inconsistent with that, are in practice a bigger issue. Personally, I would think anyone who is making a bona fide effort to produce a quality implementation would seek to avoid such inconsistencies without regard for whether the Standard forbids them, but not everyone feels that way. – supercat May 20 '19 at 22:22
  • Oh, absolutely, and I agree with you 100%. I appreciated that you made that point in your answer, too. But I was moved to the comment simply because in reading the question it seemed an extremely silly language-lawyer hair to split and you neatly provided a real example of a theoretical pitfall with it. (In practice, of course an implementation would almost certainly have to go out of its way to be wacky like that!) – Ben Zotto May 20 '19 at 22:45
  • @BenZotto: I don't know if any 80386 compilers that used 48-bit pointers supported 64-bit integer types, or if they simply didn't define uintptr_t, since by the time C99 added 64-bit types, almost everything on x86 used "flat" 32-bit mode with a single segment for everything. Further, I would think a quality compiler for that mode should offer an option to explicitly clear the upper bits of a uintptr_t without regard for whether the Standard requires it, even if it also offers an option to leave such bits indeterminate. – supercat May 20 '19 at 22:53
  • @AlexShpilkin: The Standard doesn't "disallow" any such thing except in *strictly conforming* programs. The question of whether to support such constructs in non-portable programs is left as a "quality of implementation" issue outside its jurisdiction. What's unfortunate is that the authors of clang and gcc refuse to acknowledge that the Standard makes no attempt to forbid low quality implementations, and thus conformance is not in and of itself a measure of quality. – supercat May 25 '21 at 00:07