
How can the object representation be accessed? To answer this, I split it into two questions:

1. How to get a pointer to object representation?

I cannot find any way in the standard to get a pointer to an object representation. It is often proposed to get it this way:

some_type obj{};
const unsigned char * rep = reinterpret_cast<const unsigned char*>(&obj);

Nevertheless, it is not said in the standard that an object and its object-representation are pointer-interconvertible. Why is this code allowed by the standard?

2. Can we consider that the object-representation is initialized when the object is initialized?

some_type obj{};
const unsigned char * rep = reinterpret_cast<const unsigned char*>(&obj);
unsigned char x = rep[0] + rep[1];

Here obj is initialized with empty braces. How does the compiler interpret rep[0]: is it an indeterminate value, or does it depend on which bytes of memory were initialized during the initialization of obj?

Oliv
  • I'd say this is _undefined behavior_, or at least _indeterminate_. – user0042 Oct 13 '17 at 17:44
  • @Ron I am really speaking to people who have read the standard. – Oliv Oct 13 '17 at 17:46
  • _@Oliv_ I'm pretty sure @ron is familiar with the important parts of it. – user0042 Oct 13 '17 at 17:48
  • @user0042 I do not know how you can say that. What I can tell you is that I know the standard almost by heart! – Oliv Oct 13 '17 at 17:49
  • The "object representation" is an implementation defined binary format that is fairly meaningless when looked at as an array of characters. – Galik Oct 13 '17 at 17:50
  • This is not the definition of object-representation: "The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T)" [basic.types] – Oliv Oct 13 '17 at 17:51
  • Here is the [object representation](http://eel.is/c++draft/basic.types#4) definition in the standard. – Ron Oct 13 '17 at 17:52
  • @Ron I do know what I am talking about. – Oliv Oct 13 '17 at 17:52
  • The reason the standard explicitly allows this is so that we can do low level stuff like sending information over a network or storing numbers on disk. But you have to follow aliasing rules to get it right. – Galik Oct 13 '17 at 17:54
  • @Ron So now read *pointer-interconvertible* in [basic.compound] then [expr.static.cast] then [dcl.init]. – Oliv Oct 13 '17 at 17:56
  • @Oliv I think you meant "_now_". Will do. – Ron Oct 13 '17 at 17:56
  • @Galik I know the use case. What I want is something that certifies that compilers are not going to break it in the near future. – Oliv Oct 13 '17 at 17:57
  • @Ron now, + you should know – Oliv Oct 13 '17 at 17:57
  • The standard explicitly states that all pointer types are convertible to `char*`, `unsigned char*` and `std::byte*`. – Galik Oct 13 '17 at 17:58
  • @Galik It is indeed said that with reinterpret_cast the pointer value will not be changed. That does not mean that the pointer will point to the object representation. Read [expr.static.cast] carefully. – Oliv Oct 13 '17 at 18:00
  • @Galik It is also said that we can perform this before or after object life-time but this is out of scope. – Oliv Oct 13 '17 at 18:01
  • Not sure why people want to close this question! – curiousguy Oct 17 '17 at 23:39
  • @curiousguy 3 people voted to close this question in the 5 minutes that followed its posting: a user with high reputation, who had overestimated his knowledge and has since deleted his erroneous comment, and then certainly 2 others who followed him without thinking. I had a hard time explaining how this subject was more complex than people without deep knowledge of the standard might think. Unfortunately, being unaware of what we don't know is common to all of humanity. – Oliv Oct 18 '17 at 06:45
  • This problem is addressed by [P1839](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2019/p1839r1.pdf). – xskxzr Nov 20 '19 at 11:37
  • @xskxzr In the second paragraph of §7.1 of P1839: "The sequence is considered to be an array of N T..." That is a typo, no? Shouldn't it be "The sequence is considered to be an array of N char/unsigned char/std::byte"? – Oliv Nov 21 '19 at 19:15

1 Answer


1) Your approach works:

Working with const pointers ensures that constness is not cast away:

5.2.10/2 The reinterpret_cast operator shall not cast away constness.

The pointer conversion is safe, because char has no stricter alignment requirement than some_type, so you may convert rep back to a some_type*:

5.2.10/7 An object pointer can be explicitly converted to an object pointer of a different type. (...) Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value.
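The round trip described in 5.2.10/7 can be sketched as follows. This is only an illustration: `some_type` here is a hypothetical stand-in for the question's type, and `round_trip` is a name invented for this example. It shows that converting back yields the original pointer; it does not settle what the intermediate pointer points to.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's some_type.
struct some_type {
    int a;
    char b;
};

// Converts a pointer to const unsigned char* and back again.
const some_type* round_trip(const some_type* original) {
    const unsigned char* rep = reinterpret_cast<const unsigned char*>(original);
    // The conversion goes through void*, so the address itself is unchanged.
    assert(static_cast<const void*>(rep) == static_cast<const void*>(original));
    // unsigned char has the weakest alignment requirement, so 5.2.10/7
    // guarantees that converting back yields the original pointer value.
    return reinterpret_cast<const some_type*>(rep);
}
```

For an object `obj` of type `some_type`, `round_trip(&obj) == &obj` is expected to hold.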

Edit: In my understanding, there is no doubt about inter-convertibility between the pointer to an object and the pointer to its representation:

1.8/6: Unless an object is a bit-field or a base class subobject of zero size, the address of that object is the address of the first byte it occupies.

3.9/4: The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T).

I understand "taken up" to be a synonym of "occupies". Note also that the & operator guarantees that:

5.3.1/3: (...) if the type of the expression is T, the result has type “pointer to T” and is a prvalue that is the address of the designated object

2) The object representation is initialized with the object:

This follows from the definition of the value representation, taken together with the memory model and the object lifecycle.

However, your example is more complex:

  • rep[0] may nevertheless hold an indeterminate value if it consists solely of padding bits. That would be the case if some_type has no data members: the object still has a size of at least 1, but its value representation is empty.
  • rep[1] is undefined behavior if sizeof(some_type) < 2, because it dereferences a pointer past the last element of an array.
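One way to stay clear of both pitfalls is to copy the bytes out with `std::memcpy` and to bound every access by `sizeof`. The following is a minimal sketch, assuming a trivially copyable type; `padded_type`, `copy_bytes`, `from_bytes` and `demo_byte_copy` are hypothetical names, and the values of any padding bytes in the copy remain unspecified.

```cpp
#include <array>
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical type: many ABIs insert padding after c, but that is an assumption.
struct padded_type {
    char c;
    int i;
};

// Copies exactly sizeof(T) bytes of obj into a separate array, so no
// access can run past the end of the object.
template <typename T>
std::array<unsigned char, sizeof(T)> copy_bytes(const T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copies are only specified for trivially copyable types");
    std::array<unsigned char, sizeof(T)> bytes{};
    std::memcpy(bytes.data(), &obj, sizeof(T));
    return bytes;
}

// Rebuilds an object from bytes previously produced by copy_bytes.
template <typename T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copies are only specified for trivially copyable types");
    T obj;
    std::memcpy(&obj, bytes.data(), sizeof(T));
    return obj;
}

// Usage sketch: the value survives a trip through its bytes, whatever
// the padding bytes happen to contain.
void demo_byte_copy() {
    padded_type original{'x', 42};
    padded_type copy = from_bytes<padded_type>(copy_bytes(original));
    assert(copy.c == original.c && copy.i == original.i);
}
```

Only the members are compared after the round trip, because the padding bytes carry no value and should not be relied on.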

3) What is the object representation (in plain language)?

Let's take a simple example:

class some_other_type {
    int a;
    std::string s;
};

There is an ambiguity when speaking about the memory occupied by an object:

  • is it only the fixed-size contiguous memory corresponding to its type (i.e. an int, some size_t for the string's length and some pointer to the chars in the string, as it would be done in C)?
  • or is it all the values stored in memory for the object, including values stored in memory allocated somewhere else (e.g. also the bytes required to store the value of our string)?

The object representation corresponds to the first part. For objects that are not trivially copyable, the object representation is not self-sufficient (i.e. in our example, the bytes stored in the string are not necessarily part of the object representation).

The value representation corresponds to the second part (and would include the bytes required to store the value of the string).

In plain words, this means that the address of an object is the address of its representation, but the object representation may contain padding and may not be sufficient to hold all the data that belongs to the object.
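The distinction can be observed with a small sketch. It makes several assumptions: the members are made public (a `struct`) so they can be inspected, `inside_object` and `demo_out_of_line_storage` are hypothetical helper names, and the implementation is assumed to store a 1000-character string in a separate allocation, which is typical but not guaranteed by the standard.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Same members as in the example above, made public for inspection.
struct some_other_type {
    int a;
    std::string s;
};

// Returns true when p lies within the sizeof(some_other_type) bytes of obj.
bool inside_object(const some_other_type& obj, const void* p) {
    const void* first = &obj;
    const void* last = &obj + 1;  // one past the end of the object
    // std::less gives a total order even for pointers into unrelated storage.
    std::less<const void*> before;
    return !before(p, first) && before(p, last);
}

// Usage sketch: the string member is part of the object, but its
// characters are assumed to live in memory allocated elsewhere.
void demo_out_of_line_storage() {
    some_other_type obj{1, std::string(1000, 'x')};
    assert(inside_object(obj, &obj.s));
    assert(!inside_object(obj, obj.s.data()));
}
```

`sizeof(some_other_type)` stays the same no matter how long the string is, which is why the characters cannot all be part of the object representation.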

Christophe
  • A pointer value can be "a pointer to an object", "a pointer past the end of", "null pointer value", "invalid pointer value". [basic.compound] My problem is that the standard specifies that the pointer value is a *pointer to object* only if the target and source types of the reinterpret_cast are pointer-interconvertible. Nevertheless, the object-representation and its object are not pointer-interconvertible. Read [basic.compound]. – Oliv Oct 13 '17 at 18:39
  • Let's say there is a hole in the standard and that 1) is OK. Do you have an idea for 2)? – Oliv Oct 13 '17 at 18:43
  • On 1) interesting remark indeed. This shows that your question would really deserve more up-votes. On 2) I have completed my answer. – Christophe Oct 13 '17 at 18:46
  • Thank you Christophe, I have posted a new question, for which I'd appreciate your expertise! https://stackoverflow.com/questions/46738487/can-one-get-a-pointer-to-a-complete-object-representation-element-from-a-pointer – Oliv Oct 13 '17 at 21:51
  • @Christophe I have been searching for the reason why the standard includes the following sentence: "For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values" (the same question as https://stackoverflow.com/q/12773640). Based on your answer, the `value representation` of a non-trivially-copyable object may contain bits that are not inside the `object representation`. Is this the reason why the committee introduced this sentence? Thank you. – user42768 Mar 18 '19 at 12:30
  • @user42768 yes, that’s it. I suppose it’s the reason behind the wording by the standard committee. – Christophe Mar 18 '19 at 12:42
  • @Christophe Thank you for your quick answer. For a base class subobject that has virtual methods, would you say that the most-derived object is part of the base class subobject's `value representation`? As, conceptually, the base class subobject would behave differently (would have a different value) than if it were itself a complete object? Also, would its vtable be also part of its `value representation`? – user42768 Mar 18 '19 at 12:50
  • @user42768 that’s an excellent question on its own. It would deserve a full answer, not just a small comment. Just tell me the link so that I don’t miss it :-) – Christophe Mar 18 '19 at 12:55
  • @Christophe Should I make a new question based on the questions in my previous comment and link it here? – user42768 Mar 18 '19 at 12:57
  • @user42768 Yes! The best way would be to ask it as a new question and link to this and the previous question to show that you have already done some research. Maybe you should also add the language-lawyer tag. – Christophe Mar 18 '19 at 14:15
  • @Christophe I posted a new question here: https://stackoverflow.com/q/55225535/3766405. Again, thank you for your help. – user42768 Mar 18 '19 at 16:07
  • _there is no doubt about inter-convertibility between the pointer to an object and the pointer to its representation_ You are very self-confident. – Language Lawyer Feb 12 '21 at 10:34
  • @LanguageLawyer thank you for your in-depth review. Could you be more specific about clues in the standard that weaken my confidence on that matter? – Christophe Feb 12 '21 at 12:54
  • In C++17 (and above), there is definition of [pointer-interconvertible](https://timsong-cpp.github.io/cppwp/n4659/basic.compound#def:pointer-interconvertible) which enumerates the cases when it takes place. I don't see it including "pointer to object representation". The author of the question writes about this: _it is not said in the standard that an object and its object-representation are pointer-interconvertible_ – Language Lawyer Feb 12 '21 at 12:58
  • @LanguageLawyer Thank you. Very interesting. I’ll definitely have a look at this. But at the time I wrote this answer, C++17 was not officially published as a standard (draft in March, published standard in December, my answer in October) and I probably based my claim on the text of C++14. (I’ll check whether the same wording was already used earlier.) – Christophe Feb 12 '21 at 15:06