
For subtraction of pointers i and j to elements of the same array object the note in [expr.add#5] reads:

[ Note: If the value i−j is not in the range of representable values of type std::ptrdiff_t, the behavior is undefined. — end note ]

But given [support.types.layout#2], which states that (emphasis mine):

  1. The type ptrdiff_t is an implementation-defined signed integer type that can hold the difference of two subscripts in an array object, as described in [expr.add].

Is it even possible for the result of i-j not to be in the range of representable values of ptrdiff_t?

PS: I apologize if my question is caused by my poor understanding of the English language.

EDIT: Related: Why is the maximum size of an array "too large"?

jotik
  • 1
    On many popular architectures (most <=32 bit platforms) it would be rather difficult and expensive to provide ptrdiff_t that can always hold `i-j` (and they do not in fact provide such ptrdiff_t). The intent of the standard is not to make stuff difficult and expensive, or to make most existing implementations non-conforming, but rather the opposite. So yeah, it "can hold the difference"... when it can. – n. m. could be an AI Mar 20 '18 at 09:41
  • Yes, the second quote says that - if `i` and `j` are valid indices for the same array, that a `ptrdiff_t` can represent the result of `i - j`. The first quote amounts to the reverse requirement - that `i - j` must also be able to be represented in a `ptrdiff_t` or the behaviour is undefined (I'd argue the first note is redundant given the presence of the second, but it probably reduces opportunities for language lawyers to find obscure exploitable loopholes in the language). – Peter Mar 20 '18 at 09:42
  • One thing that is implicit is that as far as I understand problems can only occur if it is an array of a type with sizeof=1 (like char). (Or is there some corner case for sizeof=2 as well?) – Hans Olsson Mar 26 '18 at 10:30

2 Answers


Is it even possible for the result of i-j not to be in the range of representable values of ptrdiff_t?

Yes, but it's unlikely.

In fact, [support.types.layout]/2 does not say much, except that the actual rules for pointer subtraction and ptrdiff_t are defined in [expr.add]. So let us look at that section.

[expr.add]/5

When two pointers to elements of the same array object are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header.

First of all, note that the case where i and j are subscripts into different arrays is not considered. This allows us to treat i-j like P-Q, where P is a pointer to the array element at subscript i and Q is a pointer to the element of the same array at subscript j. Indeed, subtracting two pointers to elements of different arrays is undefined behavior:

[expr.add]/5

If the expressions P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j ; otherwise, the behavior is undefined.
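To make the quoted rule concrete, here is a minimal sketch. The helper name `subscript_difference` and its parameters are made up for illustration; they do not come from the standard. It only shows that subtracting two pointers into the same array gives the subscript difference, and that the result has type std::ptrdiff_t:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper for illustration: returns P - Q, where P points to x[i]
// and Q points to x[j] of the same array of n elements.
// i == n or j == n (one past the end) is allowed for subtraction.
inline std::ptrdiff_t subscript_difference(const int* x, std::size_t n,
                                           std::size_t i, std::size_t j) {
    assert(i <= n && j <= n);  // both pointers must stay within the same array
    const int* P = x + i;
    const int* Q = x + j;
    // The result type of pointer subtraction is std::ptrdiff_t.
    static_assert(std::is_same<decltype(P - Q), std::ptrdiff_t>::value,
                  "pointer subtraction yields std::ptrdiff_t");
    return P - Q;  // has the value i - j, which may be negative
}
```

For a small array such as `int x[8]`, calling it with i = 5 and j = 2 gives 3, and swapping the two subscripts gives -3. The undefined case from the note only comes into play once the array is large enough that i - j no longer fits in std::ptrdiff_t.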

In conclusion, with the notation defined previously, i-j and P-Q are defined to have the same value, with the latter being of type std::ptrdiff_t. But nothing is said about whether this type can actually hold such a value. This question can, however, be answered with the help of std::numeric_limits; in particular, one can detect whether an array some_array is too big for std::ptrdiff_t to hold all index differences:

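#include <cstddef>  // std::ptrdiff_t
#include <limits>   // std::numeric_limits

// some_array is assumed to be an array declared elsewhere.
static_assert(std::numeric_limits<std::ptrdiff_t>::max() > sizeof(some_array)/sizeof(some_array[0]),
    "some_array is too big, subtracting its first and one-past-the-end element indexes "
    "or pointers would lead to undefined behavior as per [expr.add]/5."
);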

Now, on usual targets, this would normally not happen, as sizeof(std::ptrdiff_t) == sizeof(void*), which means an array would need to be stupidly big for ptrdiff_t to overflow. But there is no guarantee of that.
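As a rough illustration of when the check can fail, the sketch below generalizes it to an arbitrary signed difference type. The name `differences_fit` is hypothetical, and `std::int16_t` merely stands in for the ptrdiff_t of a small 16-bit target; it is an assumption for the example, not a statement about any particular implementation:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if every subscript difference in an array of
// `count` elements is representable in the signed type Diff. The largest
// difference is one-past-the-end minus first, which equals `count`.
template <typename Diff>
constexpr bool differences_fit(std::uintmax_t count) {
    return count <=
           static_cast<std::uintmax_t>(std::numeric_limits<Diff>::max());
}

// A 16-bit signed difference type tops out at 32767, so a 40000-element
// char array would already be too large for it.
static_assert(differences_fit<std::int16_t>(32767), "fits exactly");
static_assert(!differences_fit<std::int16_t>(40000), "does not fit");
```

With `Diff = std::ptrdiff_t` this is the same test as the static_assert above, except that it uses `<=` and so also accepts an array of exactly max() elements.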

YSC
  • In other words, if `i` and `j` are pointers to elements of the same array, the result of `i-j` is always in the range of representable values of type `std::ptrdiff_t`? – jotik Mar 20 '18 at 09:36
  • 2
    No, support.types.layout doesn't say any such thing. It says "... as described in expr.add" and expr.add describes a case where i-j is not representable. – n. m. could be an AI Mar 20 '18 at 09:54
  • 4
    Interesting aspect about: Assuming size_t and ptrdiff_t having the same size, as the former is unsigned, but the latter is signed, it could be used to define arrays larger than what a pointer difference could hold. What does the standard say about such? If *any* difference *must* be representable, we'd have to conclude (possibly without actually being mentioned) that array sizes must not exceed `std::numeric_limits<std::ptrdiff_t>::max()`, no matter if `std::size_t` is capable to hold such values or not... – Aconcagua Mar 20 '18 at 10:02
  • 1
    note: `std::size_t` as subscript index was an error anyway. We might see a standard `std::index` defined as a signed integer with `sizeof(std::index) == sizeof(void*)` someday. – YSC Mar 20 '18 at 10:04
  • @Aconcagua That's always been my personal interpretation of those quotes. – Bob__ Mar 20 '18 at 10:04
  • There are always numeric_limits. – n. m. could be an AI Mar 20 '18 at 10:11
  • @YSC Then imagine we are on modern 64-bit linux with sizeof(unsigned long) == sizeof(size_t) == 8, would I then have to conclude that unsigned long as array subscript is an error, too? Or older 32-bit OS, sizeof(size_t) == sizeof(unsigned int) == 4, then I wouldn't even be able to use unsigned int as subscript - getting to microcontrollers, we might end up in not being able to use *any* unsigned subscript at all... – Aconcagua Mar 20 '18 at 10:12
  • @Aconcagua if it must be representable, then the note in expr.add/5 makes no sense, as it describes an impossible condition. Notes are not normative, but still if we detect that our interpretation of the standard makes a non-normative part meaningless, we must assume that perhaps our interpretation is not the only possible one and may in fact be incorrect. Google *PTRDIFF_MAX problems* to see that there are in fact different interpretations assumed by actual implementations. – n. m. could be an AI Mar 20 '18 at 10:23
  • 1
    @YSC If you happen to know, could you cite where the standard prohibits using size_t as array subscript? I can't agree based on my *personal* reasoning, as exceeding array bounds is already UB anyway and with the limitation of array size to what a ptrdiff_t can hold, there is no further need for such limitation... – Aconcagua Mar 20 '18 at 10:32
  • @Aconcagua no it does not. But there is discussion about using another type than `std::size_t` as subscript type for the standard library in future versions. – YSC Mar 20 '18 at 10:33
  • 1
    @n.m. I.e. such large arrays *are* allowed, but calculating pointer differences within them *does* lead to UB, *if* the distances between elements are too large? – Aconcagua Mar 20 '18 at 10:51
  • 3
    @Aconcagua It would be rather impractical to disallow "large" arrays on e.g. 16-bit platforms. You would have a choice between 64K bytes of data memory but only 32K bytes max array size, or a wider than 16 bit ptrdiff_t, both of which are undesirable. So UB it is. – n. m. could be an AI Mar 20 '18 at 11:40
  • 2
    This doesn't really answer the question – M.M Mar 23 '18 at 11:21
  • 1
    Why would you write `sizeof(decltype(some_array[0])` instead of `sizeof some_array[0]` – M.M Mar 24 '18 at 03:21
  • @M.M I got a new keyboard and I'm excited to write superfluous words :D all jokes aside, that was a mistake. – YSC Mar 26 '18 at 07:56
  • *"Now, on usual target, this would not happen as sizeof(std::ptrdiff_t) == sizeof(void * )."* This seems not to consider the sign bit, while the previous snippet does. – Bob__ Mar 26 '18 at 08:46
  • @Bob__ it just means `ptrdiff_t` can hold really large numbers on usual targets (32 or 64 bits), not that it can hold any `void*` difference (remember, you can only subtract pointers to elements of the same array). – YSC Mar 26 '18 at 08:50
  • For some bizarre reason, C11 no longer allows a 16-bit `ptrdiff_t`, even on a freestanding implementation where the total amount of storage is less than 32K. – supercat Sep 15 '18 at 01:49

I think it is a bug in the wording.

The rule in [expr.add] is inherited from the same rule for pointer subtraction in the C standard. In the C standard, ptrdiff_t is not required to hold any difference of two subscripts in an array object.

The rule in [support.types.layout] comes from Core Language Issue 1122. It added direct definitions for std::size_t and std::ptrdiff_t, which was supposed to solve the problem of circular definition. I don't see any reason (at least none mentioned in any official document) to make std::ptrdiff_t hold any difference of two subscripts in an array object. I guess it just uses an improper definition to solve the circular definition issue.

As further evidence, [diff.library] does not mention any difference between std::ptrdiff_t in C++ and ptrdiff_t in C. Since ptrdiff_t has no such constraint in C, std::ptrdiff_t should not have such a constraint in C++ either.

xskxzr
  • 1
    It might also be worth noting that the C standard explicitly lists the undefined behavior caused by pointer subtraction (if the result does not fit in a `ptrdiff_t`) in the informative Annex J under "J.2 Undefined behavior". – jotik Mar 23 '18 at 16:19