33

The working draft of the standard N4659 says:

[basic.compound]
If two objects are pointer-interconvertible, then they have the same address

and then notes that

An array object and its first element are not pointer-interconvertible, even though they have the same address

What is the rationale for making an array object and its first element non-pointer-interconvertible? More generally, what is the rationale for distinguishing the notion of pointer-interconvertibility from the notion of having the same address? Isn't there a contradiction in there somewhere?

It would appear that given this sequence of statements

int a[10];

void* p1 = static_cast<void*>(&a[0]);
void* p2 = static_cast<void*>(&a);

int* i1 = static_cast<int*>(p1);
int* i2 = static_cast<int*>(p2);

we have p1 == p2; however, i1 is well defined, whereas using i2 would result in UB.
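A minimal sketch of the part of this that is uncontroversial, assuming nothing beyond the statements above (the function names `same_address` and `read_through_i1` are made up for illustration). It only compares the two `void*` values and reads through `i1`; it deliberately never dereferences anything like `i2`.

```cpp
#include <cassert>

// True if a pointer to the array and a pointer to its first element
// represent the same address once both are converted to void*.
bool same_address() {
    int a[10] = {};
    void* p1 = static_cast<void*>(&a[0]);  // pointer to the first element
    void* p2 = static_cast<void*>(&a);     // pointer to the whole array
    return p1 == p2;
}

// The well-defined round trip: int* -> void* -> int*.
int read_through_i1() {
    int a[10] = {7};
    void* p1 = static_cast<void*>(&a[0]);
    int* i1 = static_cast<int*>(p1);
    assert(i1 == &a[0]);  // the original pointer value is recovered
    return *i1;
}
```

Equal addresses are all the comparison establishes; whether the pointer obtained from `p2` may be used as an `int*` is exactly what the question asks.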

Deduplicator
n. m. could be an AI
  • Could you link to the relevant draft please? n4296 (which is the draft I have bookmarked) doesn't include "pointer-interconvertible". – Martin Bonner supports Monica Dec 21 '17 at 11:40
  • 2
    @MartinBonner [This](https://timsong-cpp.github.io/cppwp/n4659/basic.compound#4) and [this](https://timsong-cpp.github.io/cppwp/basic.compound#4) – Passer By Dec 21 '17 at 11:43
  • 2
    @Someprogrammerdude An array is not a pointer, but nor is the first element of an array a pointer (in general). I *guess* that "pointer-interconvertible" is about standardizing when you can cast between base and derived pointers through static casts to `void*` and back (and when you can't). – Martin Bonner supports Monica Dec 21 '17 at 11:43
  • Relevant: https://stackoverflow.com/questions/47653305/is-there-a-semantic-difference-between-the-return-value-of-placement-new-and-t – Passer By Dec 21 '17 at 11:45
  • @Someprogrammerdude a pointer to an array represents the address of that array. A pointer to the first element of said array represents the address of the first element. The two pointers represent the same address, but they are not convertible to each other. – n. m. could be an AI Dec 21 '17 at 11:46
  • @MartinBonner done. – n. m. could be an AI Dec 21 '17 at 11:52
  • I think there is little benefit to make such codes defined, and the less the rules are, the happier the compiler/optimizer will be. – xskxzr Dec 21 '17 at 13:00
  • @xskxzr Why define it for the first member of a struct then? What's the practical difference between that and the first element of an array? – n. m. could be an AI Dec 21 '17 at 13:22
  • 1
    I find this paragraph about static_cast(void*) also obscurantist [expr.static.cast]: ". Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible (6.9.2) with a, the result is a pointer to b. **Otherwise, the pointer value is unchanged by the conversion.** ". Is the pointer value a valid pointer value if it points to the right address? – Oliv Dec 21 '17 at 13:26
  • Given the first member of a struct, we can use the cast to access its enclosing struct, thus access other members. But this is unnecessary for an array element. We can just take its address and do pointer arithmetic to access other elements. – xskxzr Dec 21 '17 at 13:33
  • 2
    If you read the comments below [this question](https://stackoverflow.com/questions/47616508/what-is-the-rationale-for-limitations-on-pointer-arithmetic-or-comparison) you will read that the C++ memory model has, and is still, mostly influenced by Boehm, who sell a garbage collector library. Since I read this comment, I suspect that inconsistencies in the C++ memory model result from the influence of its interest and not for rational reasons. – Oliv Dec 21 '17 at 13:38
  • 1
    @xskxzr What if the array is the first member of a struct and we have a pointer to the first element of the array and want to access that struct? – n. m. could be an AI Dec 21 '17 at 13:41
  • I think this is rare in practice... This is the reason why I say "little benefit" rather than "no benefit". – xskxzr Dec 21 '17 at 13:44
  • 1
    Maybe the answer is that standard as code, after having been modified a few time by many different poeple, finish to look like a soap where nobody know anymore the rational behind this floating maggot! – Oliv Dec 21 '17 at 20:08
  • 1
    @Oliv [Sells, you say](http://www.hboehm.info/gc/license.txt)? – T.C. Dec 22 '17 at 09:07
  • @T.C. Sorry, I do not associate the idea of "selling" to "money" since I have worked as a researcher in a public research center! I associate it to the concept of value. For example, there is this (almost iso) morphism money/{material,services,etc...}, impact-factor/{researcher,post-doc,phd student,...}, manager-usefulness-perception/employees and so on. No matter the dimension on which is evaluated the value. I suppose you are close to, or a commitee member? Questions as this one are recurring. They never get a good answer. Is the commitee still working on the object/memory model? – Oliv Dec 22 '17 at 09:44
  • "The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T. For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values" [basic.types]/4 so at the end of the day, none of that matters. – curiousguy Jun 07 '18 at 01:50

4 Answers

28

There are apparently existing implementations that optimize based on this. Consider:

struct A {
    double x[4];
    int n;
};

void g(double* p);

int f() {
    A a { {}, 42 };
    g(&a.x[1]);
    return a.n; // optimized to return 42;
                // valid only if you can't validly obtain &a.n from &a.x[1]
}

Given p = &a.x[1];, g might attempt to obtain access to a.n by reinterpret_cast<A*>(reinterpret_cast<double(*)[4]>(p - 1))->n. If the inner cast successfully yielded a pointer to a.x, then the outer cast will yield a pointer to a, giving the class member access defined behavior and thus outlawing the optimization.
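For contrast, here is a sketch of the complementary case, in which the callee receives a pointer to the whole array member rather than to one of its elements. It is an illustration under stated assumptions, not part of the optimizer discussion itself: `read_n_via_array` and `f_array` are hypothetical names, and the cast is only meaningful if the argument really points to the `x` member of an `A`.

```cpp
#include <cassert>
#include <type_traits>

struct A {
    double x[4];
    int n;
};

static_assert(std::is_standard_layout<A>::value,
              "the first-member rule relies on A being standard-layout");

// Hypothetical callee that is handed a pointer to the array member itself.
int read_n_via_array(double (*px)[4]) {
    // a.x is the first non-static data member of the standard-layout class A,
    // so the array object a.x and the enclosing A are pointer-interconvertible
    // and this cast yields a pointer to the A object.
    A* pa = reinterpret_cast<A*>(px);
    return pa->n;
}

int f_array() {
    A a{{}, 42};
    return read_n_via_array(&a.x);  // passes the array, not an element
}
```

Here the callee can legitimately reach `a.n`, so a compiler could not assume `a.n` is untouched across the call; the element-pointer version above is the one where that assumption is allowed.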

T.C.
  • They optimize even without `restrict` on `g`'s argument? Oh, `restrict` isn't in C++ except as compiler extensions. Nevermind... – jxh Dec 21 '17 at 23:51
  • Would this still hold if `x` were `int[4]` ? – M.M Dec 22 '17 at 07:26
  • @M.M I'm not currently seeing why not. – T.C. Dec 22 '17 at 09:10
  • 1
    @T.C. I guess it optimises because the standard gives an explicit license to optimise, and not the other way around. It won't be able to optimise if x wasn't an array. My question is, why the standard draws a line between arrays and non-arrays? It seems completely arbitrary. – n. m. could be an AI Dec 22 '17 at 10:01
  • @n.m. [The optimizer was there first](https://groups.google.com/a/isocpp.org/d/msg/std-proposals/gN-_7CJ58G4/4JG3i4S25z8J), and the wording written to (partially) accommodate it. – T.C. Dec 22 '17 at 11:04
  • 3
    Interesting. What compiler does this? I didn't find any on godbolt.org. – n. m. could be an AI Dec 23 '17 at 07:25
  • What does the inner cast actually yield? N4659 8.2.9/13 does define the behaviour of `reinterpret_cast` when there is NOT a pointer-interconvertible object at the location. The definition is "the pointer value is unchanged by the conversion", and surely the only possible meaning of that is that the result of the cast does point to the same byte in memory that the cast's argument pointed to – M.M Jan 25 '18 at 22:31
  • To forestall any strict aliasing argument about the double array; imagine `g` did `((A *)((char *)p - sizeof(double)))->n` . What is the result of the cast to `A *` ? – M.M Jan 25 '18 at 22:34
  • 3
    @M.M Just because it represents the same address ("points to the same byte") doesn't mean that it has the same pointer value in the abstract machine. The result of the inner cast is a pointer of type "pointer to array of 4 `double`" with the value "pointer to the first element of `a.x`" and therefore the result of the outer cast is a pointer of type "pointer to `A`" with the value "pointer to the first element of `a.x`", and since it does not actually point to an `A` object, has undefined behavior when the class member access expression is used to access a non-static data member of `A`. – T.C. Jan 25 '18 at 23:12
  • 2
    Are you saying it is no longer well-defined to inspect any object (other than character arrays) by iterating over it with `unsigned char *` ? – M.M Jan 25 '18 at 23:49
  • 1
    @M.M See [core issue 1701](https://wg21.link/CWG1701). This part has never been properly specified in the standard, so it's not really meaningful to evaluate how it would work in the cleaned-up pointer model. When it is eventually specified, presumably the range of permissible pointer arithmetic would need to be limited to the memory reachable through the original pointer (just like `launder`) to permit the optimization at issue. – T.C. Jan 26 '18 at 00:03
  • 3
    @T.C. in light of your last comment, would it be right to say that the core rationale (which this question is asking about) is so that pointers to array elements can't "escape" their array. Compare with `void h(double(*)[4]); h(&a.x);` - presumably the optimization is no longer possible, since `h` might cast its argument to `A *`; with that cast being correct because pointer to standard-layout struct is interconvertible with pointer to its first element. – M.M Jan 26 '18 at 00:21
  • 1
    @T.C. "_In C++, malloc has never created an object because [intro.object]/1 says that an object is *only* created by [list does not mention union access]._" Are these guys seriously saying that changing the active member of a union never created an object? Was any use of a union illegal in C++? Or rather, can we take that std text as a joke? – curiousguy Jun 07 '18 at 02:37
  • This example doesn't match the OP. &a.x[1] is not the first member of a.x, thus it's not relevant. The compiler could do the optimization here even if a.x[0] was pointer-interconvertible with a.x. – Luke Aug 22 '23 at 23:49
  • @Luke You (and `g`) can trivially obtain `&a.x[0]` when given `&a.x[1]`. That's just pointer arithmetic. – T.C. Aug 23 '23 at 04:53
  • @T.C. Yes, but it is needlessly confusing given the context of the OP, in my opinion. What is it about x[1] that makes this a better answer than x[0]? – Luke Aug 24 '23 at 15:16
3

More generally, what is the rationale for distinguishing the notion of pointer-interconvertibility from the notion of having the same address?

It is hard if not impossible to answer why certain decisions are made by the standard, but this is my take.

Logically, pointers point to objects, not addresses. Addresses are the value representations of pointers. The distinction is particularly important when reusing the space of an object containing const members:

struct S {
    const int i;
};

S s = {42};
auto ps = &s;
new (ps) S{420};
foo(ps->i);  // UB, requires std::launder

That a pointer with the same value representation can be used as if it were the same pointer should be thought of as the special case instead of the other way round.

Practically, the standard tries to place as few restrictions as possible on implementations. Pointer-interconvertibility is the condition under which pointers may be converted with reinterpret_cast and yield the correct result. Since reinterpret_cast is meant to compile to nothing, it also means the pointers share the same value representation. Because that places more restrictions on implementations, the guarantee won't be given without compelling reasons.
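As a rough sketch of the `std::launder` fix that the comment in the snippet above alludes to, assuming the C++17 rules discussed in this thread (`read_after_reuse` is a name made up for illustration):

```cpp
#include <cassert>
#include <new>  // placement new and std::launder (C++17)

struct S {
    const int i;
};

int read_after_reuse() {
    S s = {42};
    S* ps = &s;
    new (ps) S{420};  // a new S object now occupies s's storage

    // ps still carries the value "pointer to the old object". Because S has a
    // const member, the C++17 wording does not let ps refer to the new object
    // automatically; std::launder produces a pointer to the object that
    // currently lives at that address.
    return std::launder(ps)->i;
}
```

The address never changes here; only the pointer value in the abstract machine does, which is the distinction this answer draws.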

Passer By
  • 1
    I don't quite see how this is relevant to the question in hand. When two different objects reside at the same address at different times, using a pointer to one as if it was a pointer to another conflicts with some reasonable optimisations. But we have an array and its element, i.e. an object and its subobject that sits at the beginning of the object. Why does this work for a struct and its first member, but not for an array and its first element? What is the conceptual difference here? What difficulties would ensue if it was allowed also for arrays? – n. m. could be an AI Dec 21 '17 at 16:02
  • @n.m. My point is there might be nothing fundamental. Pointer interconvertibility is given only when compelling reasons arise since it is both logically weird (breaks abstraction), and it might restrict implementations. It isn't that there is reason not to, it's because there isn't reason to. – Passer By Dec 21 '17 at 16:06
  • 2
    I don't see how the reasoning that applies to struct subobjects doesn't also apply to array subobjects for purposes of pointer convertibility. There must be something that applies to one and not the other but I don't see what it is. – n. m. could be an AI Dec 21 '17 at 17:06
  • @n.m. Well, C supports struct subobjects, and legacy C code in C++ assumes it works. – Yakk - Adam Nevraumont Dec 21 '17 at 17:53
  • @n.m. There is also a similar limitation for the *common initialization sequence* that seems unexplainable (the concept used to allow or not to read a value within a non active member of a union). That applies to struct but not to arrays! – Oliv Dec 21 '17 at 19:06
  • @Yakk I can't find anything similar in the C standard, can you quote chapter and verse? – n. m. could be an AI Dec 21 '17 at 20:14
  • 4
    It'd be nice if there were a rationale document that explained why all these rules exist – M.M Dec 22 '17 at 07:24
  • "_Logically, pointers points to objects, not addresses_" That contradicts "pointers have trivial type", but OK – curiousguy Jul 27 '18 at 03:34
  • "_logically weird (breaks abstraction),_" C/C++ is "high level assembly", there is no "abstraction" – curiousguy Jul 28 '18 at 04:50
  • @curiousguy Quite untrue, `bool f(int x) { return x + 1 > x; }` gets constant folded to `true`. – Passer By Jul 28 '18 at 07:03
  • C/C++ don't give you access to signed 2 complement operations. Well, unless you use volatile. – curiousguy Jul 28 '18 at 07:10
  • @n.1.8e9-where's-my-sharem. Re "I don't see how the reasoning that applies to struct subobjects doesn't also apply to array subobjects for purposes of pointer convertibility." That is the core issue, and after reading the discussion in https://groups.google.com/a/isocpp.org/g/std-proposals/c/gN-_7CJ58G4/m/4JG3i4S25z8J, I think the committee originally **didn't want to allow either one.** It needed pointing out that "Without support for this, **struct sockaddr does not work**" for them to allow interconvertibility with the first class member. So now we have this inconsistency. – Peter - Reinstate Monica Nov 14 '21 at 16:47
2

Because the committee wants to make clear that an array is a low-level concept and not a first-class object: you cannot return an array or assign to it, for example. Pointer-interconvertibility is meant to be a concept between objects of the same level: only standard-layout classes or unions.

The concept is seldom used in the whole draft: in [expr.static.cast], where it appears as a special case; in [class.mem], where a note says that for standard-layout classes, pointers to an object and its first subobject are interconvertible; in [class.union], where pointers to the union and its non-static data members are also declared interconvertible; and in [ptr.launder].

That last occurrence separates two use cases: either pointers are interconvertible, or one element is an array. This is stated in a remark and not in a note, as it is in [basic.compound], which makes it clearer that pointer-interconvertibility deliberately does not cover arrays.
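The two cases that [class.mem] and [class.union] do cover can be sketched as follows. This is only an illustration: the types `Header` and `Number` and both function names are hypothetical and not taken from the draft.

```cpp
#include <cassert>
#include <type_traits>

struct Header {   // hypothetical standard-layout class
    int tag;
    double payload;
};

union Number {    // hypothetical union
    int i;
    float f;
};

static_assert(std::is_standard_layout<Header>::value,
              "the first-member guarantee applies to standard-layout classes");

// Class object <-> first non-static data member.
int tag_through_struct_pointer() {
    Header h{7, 1.5};
    int* first = reinterpret_cast<int*>(&h);         // points to h.tag
    Header* back = reinterpret_cast<Header*>(first); // points to h again
    assert(back == &h);
    return *first;
}

// Union object <-> one of its non-static data members.
int int_through_union_pointer() {
    Number n{5};                             // initializes n.i, the active member
    int* pi = reinterpret_cast<int*>(&n);    // points to n.i
    return *pi;
}
```

No comparable guarantee is spelled out for an array and its first element, which is the asymmetry the question is about.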

Serge Ballesta
  • "allows an implementation to have different representions for object pointers and array pointers" Why a pointer to an array can be converted to a pointer to its struct *super* object then? (If it's the first member of course) – n. m. could be an AI Dec 21 '17 at 17:12
  • @n.m.: After some reflection, I think that a pointer to an array cannot be convertible to a pointer to its super object because an array is not a standard layout object. My opinion is that the first element of the array is pointer convertible to the super object but none to the array. Of course when converted to a byte pointer (char pointer before C++17) all 3 will be converted to the same pointer because the first byte of an array is the first byte of its first element. A pointer to an array can then be converted to a pointer to its first element (via a char* or reinterpret_cast), ... – Serge Ballesta Dec 21 '17 at 19:24
  • ... but nothing guarantees that the pointer to the array and the pointers to the objects (first element and super object) have same representation. A pointer to an object can be static_casted to a pointer to a subclass and back, but the subclass object can be at a different address. – Serge Ballesta Dec 21 '17 at 19:27
  • An array and its first element have **the same address**. It's guaranteed by the standard. No one is talking about a subclass object of the first element of the array. Only about the first element itself. There is a guarantee that an array is convertible to its standard layout **super** object, so either these pointers must have the same representation, or the difference in the representation doesn't matter. – n. m. could be an AI Dec 21 '17 at 19:44
  • 1
    Furthermore, `struct {int i;}` and its first element are pointer-convertible even though corresponding pointers are very explicitly allowed to have different representation. – n. m. could be an AI Dec 21 '17 at 20:19
  • @n.m.: I must acknowledge that I have no evidence for it, so it is only an *opinion*. As such even if I have really appreciated the discussion in comments it has nothing to do in a answer. Edited. Many thanks for the feedback.. – Serge Ballesta Dec 22 '17 at 07:16
1

Having read this section of the Standard closely, my understanding is that two objects are pointer-interconvertible, as the name suggests, if

  1. They are “interconnected” through their class definition (note that the pointer-interconvertible concept is defined for a class object and its first non-static data member).

  2. They have the same address. But because their types are different, we need to “convert” their pointers' types using the reinterpret_cast operator.

For the array object mentioned in the question, the array and its first element have no interconnectivity in terms of class definition, and we don’t need to convert their pointer types to be able to work with them. They just have the same address.
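One way to read the "no conversion needed" remark, sketched under the assumption that it refers to the ordinary array-to-pointer conversion (the function name is made up for illustration): a pointer to the first element can be obtained from a pointer to the array by dereferencing it, with no reinterpret_cast involved.

```cpp
#include <cassert>

int first_via_array_pointer() {
    int a[10] = {3, 1, 4};
    int (*pa)[10] = &a;           // pointer to the whole array
    int* first = *pa;             // *pa is the array; it decays to &a[0]
    int* also_first = &(*pa)[0];  // the same element, spelled out explicitly
    assert(first == also_first);
    return *first + first[2];     // reads a[0] and a[2]
}
```

This goes from the array to its element through the language's own array rules, so pointer-interconvertibility never enters the picture.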

F14