
This is a common way to read the bytes of a trivially copyable object:

Object obj;
auto p = reinterpret_cast<char*>(&obj);
for(size_t i = 0; i < sizeof(obj); i++)
    consume(p[i]);

The problem isn't strict aliasing: char* is allowed to alias anything. The problem is this passage from [expr.add]:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j ≤ n; otherwise, the behavior is undefined.

Where hypothetical element refers to

A pointer past the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical element x[n] for this purpose

Which is to say, it is only legal if the arithmetic is on a pointer pointing at an array, and the result is still within its range.
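To make the rule concrete, here is a minimal sketch on a real array, where the precondition "P points to element x[i] of an array object" clearly holds. The function name sum_in_bounds and the array contents are made up for illustration.

```cpp
// Sums a real array while forming only pointer values that [expr.add] permits.
int sum_in_bounds() {
    int x[4] = {10, 20, 30, 40};  // n = 4
    int* p = x + 1;               // points to x[1], so i = 1
    int* end = p + 3;             // i + j = 4 = n: hypothetical element x[4],
                                  // fine to form, must not be dereferenced
    // int* bad = p + 4;          // i + j = 5 > n: undefined behavior

    int total = 0;
    for (int* it = x; it != end; ++it)  // every it lies in [x, x + n]
        total += *it;
    return total;
}
```

The question is whether the same reasoning applies when the pointer comes from reinterpret_cast<char*>(&obj) instead of from an actual array.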

However, there is clearly no char[sizeof(Object)] here. Can we do arithmetic on that pointer?

Note that a legal way to read the bytes of an object is to std::memcpy it. But if that is the only way, it raises the question: why allow char* aliasing if you can barely do anything with it?
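For reference, the std::memcpy route could look roughly like the sketch below. The Object definition and the helper names to_bytes and from_bytes are assumptions made up for illustration; only std::memcpy, std::array, and std::is_trivially_copyable come from the standard library.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for Object; any trivially copyable type works.
struct Object {
    int id;
    float weight;
};

// Copies the object representation into a real array of unsigned char.
// Indexing the returned array is ordinary array arithmetic.
template <class T>
std::array<unsigned char, sizeof(T)> to_bytes(const T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "to_bytes requires a trivially copyable type");
    std::array<unsigned char, sizeof(T)> bytes{};
    std::memcpy(bytes.data(), &obj, sizeof(T));
    return bytes;
}

// Rebuilds an object from bytes previously produced by to_bytes.
template <class T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "from_bytes requires a trivially copyable type");
    T obj;
    std::memcpy(&obj, bytes.data(), sizeof(T));
    return obj;
}
```

With this, the loop becomes for (unsigned char b : to_bytes(obj)) consume(b);, at the cost of an extra copy that compilers can often optimize away.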

Passer By
  • Why is lack-of-element-one-past-the-end a concern? Surely that's true for **all** uses of pointers and arrays? – Oliver Charlesworth Dec 15 '17 at 10:37
  • `unsigned char*` arithmetic, certainly. `char*` I don't think you can. – Bathsheba Dec 15 '17 at 10:37
  • A long time ago [in a galaxy far, far away] there was no `void*` type and `char*` was used instead. – KonstantinL Dec 15 '17 at 10:40
  • @OliverCharlesworth The passage is talking about you may only do arithmetic on the elements of an array, with the one past the end treated as a hypothetical element. I'm assuming this is what you're asking about? – Passer By Dec 15 '17 at 10:44
  • @Bathsheba Why is `unsigned char*` any different? – Passer By Dec 15 '17 at 10:45
  • I guess I'm just not sure I understand what you're seeing as a problem here. Your argument ends on "*there is clearly no char[sizeof(Object)] here, can we do arithmetic on that pointer?*", but I'm not sure why you see that as a concern. – Oliver Charlesworth Dec 15 '17 at 10:47
  • Whereas it should work in practice, OP asks if it is pedantically UB or not. – Jarod42 Dec 15 '17 at 10:49
  • @OliverCharlesworth Updated to clarify – Passer By Dec 15 '17 at 10:50
  • I'm still not sure why the "past the end" thing is the crux here. By definition, even "real" arrays don't have elements past the end! (If, however, you have a general concern about whether it's valid to treat an object as a `char` array, then I get that.) – Oliver Charlesworth Dec 15 '17 at 10:52
  • It seems you have invented the problem that does not exist really. E.g. you may write `char* p = 0x1234;` and use `p` in expressions (but not for dereferencing of course). – KonstantinL Dec 15 '17 at 10:53
  • @OliverCharlesworth The "past the end" thing is just to be complete. The problem is with there being no `char[]` and we're doing arithmetic on a `char*` – Passer By Dec 15 '17 at 10:54
  • @KonstantinL `char* p = 0x1234; p += 100;` is illegal by the passage. What do you mean by _"A pointer is not coupled so tight with an [potentially] underlying object."_? – Passer By Dec 15 '17 at 10:59
  • given that we're in pedantic mode, only trivially copyable and standard layout types are guaranteed to occupy contiguous bytes of storage (and to be memcopied), so you may clarify which *Object* types your question's about ... – Massimiliano Janes Dec 15 '17 at 11:17
  • `char* p = 0x1234; p += 100;` is not illegal as no `N` of `char[N]` defined. – KonstantinL Dec 15 '17 at 11:20
  • @KonstantinL It violates the precondition _`P` points to element `x[i]` of an array object_ – Passer By Dec 15 '17 at 11:21
  • Yep, _precondition_, not the rule. – KonstantinL Dec 15 '17 at 11:22
  • @geza That question is about fetching a value from a pointer, while this question is about just pointer arithmetic. – ivaigult Dec 15 '17 at 12:10
  • @ivaigult: no, that question is about arithmetic as well. – geza Dec 15 '17 at 12:11
  • @geza Yeah that's a dupe – Passer By Dec 15 '17 at 12:39

1 Answer


The pointer arithmetic should be legal according to the quoted passages. An Object instance obj can be viewed as a char[sizeof(Object)], that is, an array of n elements where n is sizeof(Object). The standard allows pointer arithmetic within the bounds of this array, plus one hypothetical element past its end. This follows from the less-than-or-equal signs in the

0 ≤ i + j ≤ n

expression.

Literally, reinterpret_cast<char*>(&obj) + sizeof(Object) is fine because it points to the hypothetical element x[j], where j = sizeof(Object), which is less than or equal to the size of the array, sizeof(Object).

So, the answer is yes.

Otherwise std::end for arrays would be UB.

ivaigult
  • Where exactly does it say the `obj` can be "viewed" as `char[sizeof(Object)]`? – Passer By Dec 15 '17 at 12:36
  • @PasserBy By "viewed" I meant "aliased". – ivaigult Dec 15 '17 at 12:41
  • __Where__ exactly is it said that that is allowed? – Passer By Dec 15 '17 at 12:42
  • @PasserBy [basic.types](https://timsong-cpp.github.io/cppwp/basic.types#def:object_representation) – ivaigult Dec 15 '17 at 12:45
  • `N` `unsigned char` is not equal to `unsigned char[N]`, even if they are right beside each other. Also, you should put that in the answer. – Passer By Dec 15 '17 at 12:46
  • There is also another paragraph _"For any object (other than a base-class subobject) of trivially copyable type `T`, whether or not the object holds a valid value of type `T`, the underlying bytes making up the object can be copied into an array of `char`, `unsigned char`, or `std​::​byte`"_. Which made no mention of directly reading the object's value representation as if it was an array. – Passer By Dec 15 '17 at 12:50