"is not required" == undefined behavior?

Question

My question is mainly about terminology and how to interpret the standard.

The result of comparing unequal pointers to objects is defined in terms of a partial order consistent with the following rules:

(4.1) If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.

(4.2) If two pointers point to different non-static data members of the same object, or to subobjects of such members, recursively, the pointer to the later declared member is required to compare greater provided the two members have the same access control ([class.access]), neither member is a subobject of zero size, and their class is not a union.

(4.3) Otherwise, neither pointer is required to compare greater than the other.

I am little confused as how to interpret (4.3). Does that mean that this

#include <iostream>
int main() {
    int x;
    int y;
    std::cout << (&x < &y);
    std::cout << (&x < &y);
}

is...

valid C++ code and the output is either 11 or 00.
invalid code, because it has undefined behaivour

?

In other words, I know that (4.3) does apply here, but I am not sure about the implications. When the standard says "it can be either A or B" is this the same as saying "it is undefined" ?

I'm going to guess this means indeterminate. i.e. the same expression could give different results in the same program. If it was UB, I *feel* like the wording would say so :) — cigien, Jul 28 '20 at 22:14
@cigien didnt consider that before, I modified the example slightly — 463035818_is_not_an_ai, Jul 28 '20 at 22:16
There is a third option: valid C++ code, but the output can be any combination of 0/1 and 0/1. I'm not saying that's how it should be interpreted, just that it's a plausible interpretation (more so than UB, IMO) that's missing. — user4815162342, Jul 28 '20 at 22:24
The wording has changed in different editions of the standard. C++11 says "Other pointer comparisons are unspecified". C++17 says "Otherwise, neither pointer compares greater than the other." The cited wording "Otherwise, neither pointer is required to compare greater than the other." is from a very recent draft. (C explicitly says the behavior is undefined. This is of course not directly relevant to C++, but it suggests the C++ committee deliberately decided not to say that.) — Keith Thompson, Jul 28 '20 at 22:25
The git commit that added the current wording, dated 2018-03-31, says "[expr.rel] Clarify auxiliary partial ordering (#1977)". Hopefully someone can find out what "#1977" refers to. — Keith Thompson, Jul 28 '20 at 22:28
Found it. "#1977" refers to [this pull request](https://github.com/cplusplus/draft/pull/1977), which was merged with [this pull request](https://github.com/cplusplus/draft/issues/1938). (Drafts of the ISO C++ standard are maintained here: https://github.com/cplusplus/draft ) I'm still not sure there's enough information there to answer the question. — Keith Thompson, Jul 28 '20 at 22:47
*[This is](https://devblogs.microsoft.com/oldnewthing/20200728-00/?p=104012) one of the reasons why the C and C++ programming languages do not specify the result of comparisons between pointers that do not point to elements of the same array* — GSerg, Jul 29 '20 at 06:54
In C, this comparison is actually undefined per C17 6.5.8p5. — dbush, Jul 29 '20 at 14:31

score 26 · Accepted Answer · answered Jul 28 '20 at 23:02

26

The wording has changed in various editions of the C++ standard, and in the recent draft cited in the question. (See my comments on the question for the gory details.)

C++11 says:

Other pointer comparisons are unspecified.

C++17 says:

Otherwise, neither pointer compares greater than the other.

The latest draft, cited in the question, says:

Otherwise, neither pointer is required to compare greater than the other.

That change was made in response to an issue saying ""compares greater" term is needlessly confusing".

If you look at the surrounding context in the draft standard, it's clear that in the remaining cases the result is unspecified. Quoting from [expr.rel] (text in italics is my summary):

The result of comparing unequal pointers to objects is defined in terms of a partial order consistent with the following rules:

[pointers to elements of the same array]

[pointers to members of the same object]

[remaining cases] Otherwise, neither pointer is required to compare greater than the other.

If two operands p and q compare equal, p<=q and p>=q both yield true and p<q and p>q both yield false. Otherwise, if a pointer p compares greater than a pointer q, p>=q, p>q, q<=p, and q<p all yield true and p<=q, p<q, q>=p, and q>p all yield false. Otherwise, the result of each of the operators is unspecified.

So the result of the < operator in such cases is unspecified, but it does not have undefined behavior. It can be either true or false, but I don't believe it's required to be consistent. The program's output could be any of 00, 01, 10, or 11.

answered Jul 28 '20 at 23:02

Keith Thompson

254,901
44
429
631

1

That partial order really does a terrible job of explaining the comparison mechanics. In a partial order, if there's no order relation between elements `p` and `q`, then `pq`, `p<=q`, and `p>=q` are all false. Defining a partial order sets up an expectation that comparisons will behave that way, and the text only says it behaves differently at the very end. – user2357112 Jul 29 '20 at 07:29
@user2357112 it bascially just says that either you get the specified results, or the result is unspecified – 463035818_is_not_an_ai Jul 29 '20 at 10:28
@idclev463035818: I understand what the standard says, but they've chosen a needlessly confusing way to say it. – user2357112 Jul 29 '20 at 11:07
1

@user2357112supportsMonica: IMHO, the Standard should predefine a macro to indicate what if anything is guaranteed about comparisons between unrelated pointers, with possibilities including "anything might happen", "any such comparison may independently yield 0 or 1", "any pair of unrelated pointers may be ranked in either order, not necessarily transitively", or "comparisons among arbitrary pointers yield a transitive ranking". What level of guarantees would be best would vary depending upon the target platform and what a programmer is trying to do. – supercat Jul 29 '20 at 21:09
@user2357112supportsMonica "if there's no order relation between elements p and q" then p op q has not been given in the context--it has not been given as "false". A partial order is one where p op q is not given for every p-q pair. – philipxy Jul 29 '20 at 21:54
1

@philipxy: A [partial order](https://mathworld.wolfram.com/PartialOrder.html) is a standard mathematical concept that doesn't match your description. A partial order is a binary relation that is antisymmetric, transitive, and (depending on your definition) either symmetric or antisymmetric. For example, divisibility is a partial order on positive integers. Note that the truth value of "x is divisible by y" is defined for all positive integers x and y, even if neither is divisible by the other. "2 is divisible by 3" and "3 is divisible by 2" are both false, not "not given". – user2357112 Jul 29 '20 at 22:19
@user2357112supportsMonica Yes it is a standard mathematical concept & it is just as I said in my last comment. Read your own reference. The case where every possible input pair is defined is a special case of a partial order. (Also you are not clear about what you mean by "x is divisible by y", but it's moot.) – philipxy Jul 29 '20 at 22:41
(bleh, meant to say "either reflexive or irreflexive" after antisymmetric and transitive) – user2357112 Jul 29 '20 at 23:08
@user2357112supportsMonica: The problem is there really isn't a term to describe the concept described by the Standard: a not-necessarily-transitive ordering relation achieved by taking a partial ordering, but mapping"unordered" to a non-deterministic union of "a is greater", "b is greater", or "both are equal". Such a relation would be adequate to ensure that it's not possible for two objects to be strongly ordered with a particular one first (which would match what memmove-style functions would need to know), without having to know or care if they were strongly ordered the other way. – supercat Jul 30 '20 at 17:18
Why does it not specify that the result has to be consistent? This is begging for shenanigans when two unrelated pointers are in an ordered data-structure. – Jann Poppinga Aug 07 '20 at 11:02

cigien · Answer 2 · 2020-07-28T22:41:57.333

7

For the provided code, this case applies:

(4.3) Otherwise, neither pointer is required to compare greater than the other.

There is no mention of UB, and so a strict reading of "neither is required" suggests that the result of the comparison could be different every time it's evaluated.

This means the program could validly output any of the following results:

edited Jul 28 '20 at 22:41

answered Jul 28 '20 at 22:18

cigien

57,834
11
73
112

3

You'd be hard pressed to find a compiler that produces the 01 and 10 cases, though. – Jeffrey Jul 28 '20 at 22:49
1

@Jeffrey I don't doubt it. Still, a conforming compiler *could* :) If it went out of its way I guess. – cigien Jul 28 '20 at 22:51
2

@Jeffrey The compiler bundled with DS9K probably does. – eerorika Jul 28 '20 at 22:52
3

(DS9K = DeathStation 9000, a hypothetical computer which is as unhelpful as possible when it comes to undefined behaviour. On a DS9K, undefined behaviour shoots a death ray or causes global warming or makes demons fly out of the user's nose.) – user253751 Jul 29 '20 at 11:07
@eerorika: I think people overestimate the range of programs that DS9K would process usefully. In fact, a proper DS9K would behave in meaningless fashion with any source file that doesn't exercise at least some of the translation limits listed in the Standard. – supercat Jul 30 '20 at 17:20
@supercat Standard says that implementation limits are allowed. Some minimum limits are listed, but are not required to be satisfied for conformance. Conclusion: Implementation that only accepts source files with length of one character are allowed, is conforming. – eerorika Jul 30 '20 at 17:25
@eerorika: If an implementation is conforming, then for each and every translation limit there must be at least one (possibly contrived and useless) source file which exercises that limit, and which the implementation would process as described by the Standard. The ability to process any useful programs is thus a "quality of implementation" issue--a quirk the authors of the Standard recognized in the published Rationale document. – supercat Jul 30 '20 at 18:47

score 4 · Answer 3 · edited Jul 29 '20 at 08:38

valid C++ code

Yes.

Nowhere does the standard say that this is UB or ill-formed, and neither is this case lacking a rule describing the behaviour because the quoted 4.3 applies.

and the output is either 11 or 00

I'm not sure that 10 or 01 are technically guaranteed to not be output ¹.

Given that neither pointer is required to compare greater than the other, the result of the comparison can be either true of false. There appears to not be an explicit requirement for the result to be the same for each invocation on same operands in this case.

¹ But I consider this unlikely in practice. I also think that leaving such possibility open is not intentional. Rather, the intention is to allow for deterministic, but not necessarily total order.

P.S.

auto comp = std::less<>;

std::cout << comp(&x, &y);
std::cout << comp(&x, &y);

would be guaranteed to be either 11 or 00 because std::less (like its friends) is guaranteed to impose a strict total order for pointers.

In the case where both pointers are produced by identical expressions, it would be very weird for the comparisons to behave differently, but if e.g. `x` and `z` were pointers that were derived differently but compare equal (e.g. one was computed as a pointer to an object, and the other was computed as pointer just past an object that happens to immediately precede that object in memory), then for `x — supercat, Jul 29 '20 at 17:17

Remy Lebeau · Answer 4 · 2020-07-28T22:27:18.130

2

x and y are not part of the same array, per (4.1). And they are not members of the same object, per (4.2). So, you fall into (4.3), which means if you try to compare them to each other, the result of the comparison is indeterminate, it could be true or false. If it were undefined behavior instead, the standard would likely state that explicitly.

edited Jul 28 '20 at 22:27

answered Jul 28 '20 at 22:18

Remy Lebeau

555,201
31
458
770

added clarification to the question. I know that 4.3. applies, but not sure if "the result is either true or false" implies that I am not allowed to use the result – 463035818_is_not_an_ai Jul 28 '20 at 22:21
Of course not, because (4.3) doesn't say what the result will be, so you can't reliably use it for anything at all. – Remy Lebeau Jul 28 '20 at 22:23
The result not being useful is not the same as not being allowed to use it though :) – cigien Jul 28 '20 at 22:25
3

I think "indeterminate" is what I was looking for. If I understand this answer correctly https://stackoverflow.com/a/11369012/4117728 I can use the result, but it is indeterminate... – 463035818_is_not_an_ai Jul 28 '20 at 22:25
2

@idclev463035818 *"If an indeterminate value is produced by an evaluation, the behavior is undefined*" (except in cases following that rule which don't apply to this). That said, I don't think standard specifies this to be indeterminate. Only case that standard specifies as indeterminate (that I found) is *"When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced "*. – eerorika Jul 28 '20 at 22:31
1

Yes, I think *indeterminate* is the wrong terminology here. Maybe it should be *unspecified* but I'm not sure. – cigien Jul 28 '20 at 23:28

"is not required" == undefined behavior?

4 Answers4