Is initializing a pointer with an arbitrary, literal, non-zero value defined?

Question

Consider:

struct T{};

int main() {
    T* p = (T*)0xDEADBEEF;
}

Using an invalid pointer is implementation-defined. Dereferencing it is undefined behavior. My question isn't about those.

My question is whether the mere initialization of p, as is, is defined.

If you think you already have all the information needed to answer this question (or if you found out that this is a duplicate), you need read no further. Following is some mumbling based on my findings:

The C standard (on which the C++ standard is based) says:

6.3.2.3 Pointers

5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

This hints that it may be implementation-defined.

The C++ Standard only defines (as far as I know) that any use of an invalid pointer value is implementation-defined. The footnote is of special importance, as it seems to suggest that merely copying such a value is already a use of the pointer. (Or does it mean the pointed-to value? I'm confused.)

6.7 Storage duration

4 When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values (6.9.2). Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.³⁷

37) Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.

This kind of agrees with the C standard. The problem is I'm not convinced this is an example of an invalid pointer value, as the standard clearly says this type of value is caused by the end of the duration of storage for that address being reached (which clearly never happened).

There are also many instances of pointer arithmetic that are Undefined Behaviour™, but clearly no arithmetic or manipulation on the value of the pointer is being done here. This is merely an initialization.

Possible duplicate of [Is initializing a pointer declarator with an invalid pointer undefined behavior?](https://stackoverflow.com/questions/50546512/is-initializing-a-pointer-declarator-with-an-invalid-pointer-undefined-behavior) — Richard Critten, May 27 '18 at 01:51
@RichardCritten Although my question is really close to that one (it actually arose from a discussion with the OP on that question!), there's a slight difference: that question already establishes that the value being assigned/initialized from is an *invalid pointer value*, as per the wording in the standard. The **literal** being used in this question is quite an important detail, and it is not clear (at least to me) whether it should be treated the same way. — Not a real meerkat, May 27 '18 at 02:15
Also, the answer on that question states that the **lvalue-to-rvalue** conversion constitutes the use of the pointer (which I agree). In this question, however, there's no such conversion. — Not a real meerkat, May 27 '18 at 02:19
Copying an invalid pointer is implementation defined, but in ISO standards notes are not normative. They're on a par with examples. Sometimes wrong. — Cheers and hth. - Alf, May 27 '18 at 02:22
Are you exploring this from a purely theoretical/language lawyer point of view? Are you concerned about it in an application use case? — R Sahu, May 27 '18 at 02:47
@RSahu Purely in a theoretical point of view. I don't have specific use-cases for this. — Not a real meerkat, May 27 '18 at 03:13
The equivalent of that C standard paragraph would be [**\[expr.reinterpret.cast/5\]**](http://eel.is/c++draft/expr.reinterpret.cast#5): "A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined." I *think* the initialization you show is implementation-defined. — Igor Tandetnik, May 27 '18 at 03:13
The result of doing the initialisation (e.g. the value that the pointer will contain) is implementation defined, as is the result of obtaining that value (including when copying its value, or comparing it with other values or variables). Dereferencing it gives undefined behaviour. — Peter, May 27 '18 at 03:28
The text of the C standard is entirely irrelevant to this; Would recommend deleting it from your question. — M.M, May 27 '18 at 04:51
"Using an invalid pointer in any way is implementation-defined" is not true either. — M.M, May 27 '18 at 04:57
@RichardCritten IMO that is not a duplicate: initializing from a cast integer constant is substantially different from initializing from a freed but formerly-valid value — M.M, May 27 '18 at 10:30
@M.M Barring guaranteed copy elision, I don't see how they are different. They are both invalid pointers with their value being used. — Passer By, May 27 '18 at 11:38
@M.M ""Using an invalid pointer in any way is implementation-defined" is not true either" - Better? — Not a real meerkat, May 27 '18 at 13:05
@M.M About the text of the C Standard: I disagree. Surely it could be irrelevant to an answer, but on the question it shows my rationale (That may not necessarily be true. It's why I'm asking). — Not a real meerkat, May 27 '18 at 13:07
@PasserBy as I said, nothing in this question says that the value `(T*)0xDEADBEEF` is invalid. In the end, I'm simply looking for normative text that says it is, I guess. That's the whole point of the question. — Not a real meerkat, May 27 '18 at 13:13

score 4 · Accepted Answer · edited Jun 20 '20 at 09:12


Your C-style cast is performing a reinterpret_cast; casting an arbitrary integer value like this is implementation-defined:

8.5.1.10 Reinterpret cast

5 A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in [basic.stc.dynamic.safety], the result of such a conversion will not be a safely-derived pointer value. — end note ]
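To make the two halves of that paragraph concrete, here is a minimal sketch. The function names are hypothetical, and it assumes the implementation provides `std::uintptr_t` (an optional type). The pointer-to-integer-to-pointer round trip is the part the quoted wording guarantees; the conversion from an arbitrary integer only yields whatever pointer value the implementation's mapping produces.

```cpp
#include <cassert>
#include <cstdint>

struct T {};

// Guaranteed by the quoted paragraph: pointer -> sufficiently large integer
// -> same pointer type yields the original pointer value.
bool round_trip_preserves(T* original) {
    auto as_integer = reinterpret_cast<std::uintptr_t>(original);
    T* restored = reinterpret_cast<T*>(as_integer);
    return restored == original;
}

// Not covered by the round-trip guarantee: the integer did not come from a
// pointer, so the resulting pointer value is implementation-defined.
// The result is never dereferenced here.
T* from_arbitrary_integer() {
    return reinterpret_cast<T*>(static_cast<std::uintptr_t>(0xDEADBEEF));
}

// Hypothetical driver exercising both conversions.
void demonstrate() {
    T object;
    assert(round_trip_preserves(&object)); // holds on every conforming implementation
    T* p = from_arbitrary_integer();       // same initialization as in the question
    (void)p;                               // silence unused-variable warnings
}
```

The C-style cast `(T*)0xDEADBEEF` from the question performs the same conversion as the `reinterpret_cast` in `from_arbitrary_integer`.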

If the result is (deemed by the implementation to be) an invalid pointer value, then it’s probably again implementation-defined what happens when it is stored in a variable, but that’s less clear:

6.6.4 Storage duration

4 When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
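As a rough illustration of the three categories in that paragraph, the sketch below creates an invalid pointer value the way the standard describes (by ending the storage duration) and marks each kind of use. The function name is hypothetical, and the problematic lines are deliberately left commented out.

```cpp
struct T { int value = 0; };

// Returns true once the dangling pointer has been overwritten with a null pointer.
bool lifetime_example() {
    T* p = new T;
    delete p;            // storage ends here: p now holds an invalid pointer value

    // int v = p->value; // indirection through p: undefined behavior
    // delete p;         // passing p to a deallocation function: undefined behavior
    // T* q = p;         // any other use, such as copying: implementation-defined

    p = nullptr;         // overwriting p does not read the invalid value
    return p == nullptr;
}
```

Whether a pointer produced by casting an integer falls under this paragraph at all is exactly the part that remains unclear.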

answered May 27 '18 at 03:14

Davis Herring

[basic.stc]/4 covers use of an invalid pointer value. Also you could mention [basic.stc.dynamic.safety]/4, which covers that it's implementation-defined whether there is *strict pointer safety* (in which this pointer is always invalid) or *relaxed pointer safety* (in which it may or may not be invalid) – M.M May 27 '18 at 05:05
@M.M: I added the link, which seemed extra at first given that I had just answered that in the inspiring question. As for strict pointer safety, that applies only to objects of dynamic storage duration that have not been marked reachable. – Davis Herring May 27 '18 at 05:25
@M.M I believe pointer safety is used for implementations with GC, which is to say, none. – Passer By May 27 '18 at 11:34
@DavisHerring The "(Deemed by the implementation to be)" did it for me. If the result of the cast is implementation defined, then surely its validity also is. I guess that is what I was missing. Thanks a lot! – Not a real meerkat May 27 '18 at 13:16

R Sahu · Answer 2 · 2018-05-28T04:39:42.323

Speaking from my experience with desktop (UNIX, Linux, Windows) applications, I think you can assign any value to a pointer. If you don't dereference the pointer, these systems don't cause strange behavior when you assign such values to a pointer.

The following example shows one mechanism I have seen for saving pointers to disk and restoring them from disk.

Let's take a simplified view of the connections between faces, edges, and vertices of a CAD model.

struct Face;
struct Edge;
struct Vertex;

struct Face
{
   std::vector<Edge*> edges;
};

struct Edge
{
   std::vector<Face*> faces;
   Vertex* start;
   Vertex* end;
};

struct Vertex
{
   std::vector<Edge*> edges;
   double x;
   double y;
   double z;
};

and you have a face in the XY plane.

       E3
V4 +--------+ V3
   |        |
E4 |    F   | E2
   |        |
   +--------+
 V1     E1   V2

Such a face could be saved to disk using the following format.

0 Face 4 $1 $2 $3 $4
1 Edge 1 $0 $5 $6
2 Edge 1 $0 $6 $7 
3 Edge 1 $0 $7 $8
4 Edge 1 $0 $8 $5
5 Vertex 2 $1 $4 0 0 0 
6 Vertex 2 $2 $1 10 0 0 
7 Vertex 2 $3 $2 10 10 0 
8 Vertex 2 $4 $3 0 10 0

where the first number on each line is the object's index in an array of objects, and each field with a $ prefix is a pointer to the item at that index.

When that information is read from disk, there are two passes to restore the objects to a usable state. In the first pass, the indices are stored in place of pointers. In the second pass, the indices are converted to pointers. In the first pass, the numbers [0 - 8] are stored where pointers are expected. Only after the second pass will the pointers point to the appropriate objects.
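The two passes could be sketched roughly as follows. This is a simplified illustration, not the actual CAD code: the helper names `index_as_pointer` and `resolve` are hypothetical, only vertices are modeled, and indices refer to a plain vertex array rather than the mixed object table in the file format above. It also assumes that an integer stored in a pointer can be read back unchanged, which is implementation-defined rather than guaranteed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vertex {
    double x, y, z;
};

struct Edge {
    Vertex* start;
    Vertex* end;
};

// Pass 1: park a file index in the slot where a pointer will eventually live.
// The integer-to-pointer mapping is implementation-defined; the result must
// not be dereferenced until pass 2 has run.
inline Vertex* index_as_pointer(std::uintptr_t index) {
    return reinterpret_cast<Vertex*>(index);
}

// Pass 2: replace each parked index with the address of the object at that index.
// Assumes the cast back to an integer recovers the index stored in pass 1.
void resolve(Edge& edge, std::vector<Vertex>& vertices) {
    auto fix = [&vertices](Vertex*& slot) {
        auto index = reinterpret_cast<std::uintptr_t>(slot);
        assert(index < vertices.size()); // catch corrupt or unparked slots
        slot = &vertices[index];
    };
    fix(edge.start);
    fix(edge.end);
}
```

A more portable variant would keep the indices in a separate integer field (or a union) during the first pass, at the cost of extra storage or bookkeeping.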

Long story short, the pointer member variables are assigned values that are obviously not valid pointers, but the mechanism works flawlessly.

Whether that would be a problem for other platforms, I cannot comment. I have no experience to fall back on.

There are processors where pointers are validated when loaded into a register, before they are used to access memory. (Example: 80286 in protected segmented mode.) Such processors would fault upon merely loading an invalid pointer value in a register. — Raymond Chen, May 27 '18 at 03:51
@M.M, I thought it did. *If you don't dereference the pointer, these systems don't cause strange behavior when you assign such values to a pointer.* — R Sahu, May 27 '18 at 06:37
The question is about what the defined behaviour is according to the standard; not about any particular system or class of systems — M.M, May 27 '18 at 10:29
I upvoted it. While it doesn't answer the question *per se*, it shows a nice "bonus" gained from implementation defined behavior (As opposed to undefined behavior): If you know your target compiler/architecture, it can be perfectly fine to rely on it. — Not a real meerkat, May 27 '18 at 13:21
