I am not a language-lawyer, so I don't know how to answer OP's question, except that a plain reading of the standard does not describe the entire picture.
While the standard says that comparing unrelated pointers yields undefined results, the behaviour of a standards-compliant C compiler is much more restricted.
The first sentence in the section concerning pointer comparison is
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to.
and for a very good reason.
If we examine the possibilities how the pointer comparison code may be used, we find that unless the compiler can determine which objects the compared pointers belong to at compile time, all pointers in the same address space must compare arithmetically, according to the addresses they refer to.
(If we prove that a standards-compliant C compiler is required by the standard to provide specific results when a plain reading of the C standard itself says the results are undefined, is such code standards-compliant or not? I don't know. I only know such code works in practice.)
A literal interpretation of the standard may lead to one believing that there is absolutely no way of determining whether a pointer refers to an array element or not. In particular, observing
int is_within(const char *arr, const size_t len, const char *ptr)
{
return (ptr >= arr) && (ptr < (arr + len));
}
a standards compliant C compiler could decide that because comparison between unrelated pointers is undefined, it is justified in optimizing the above function into
int is_within(const char *arr, const size_t len, const char *ptr)
{
if (size)
return ptr != (arr + len);
else
return 0;
}
which returns 1 for pointers within array const char arr[len]
, and zero at the element just past the end of the array, just like the standard requires; and 1 for all undefined cases.
The problem in that line of thinking arises when a caller, in a separate compilation unit, does e.g.
char buffer[1024];
char *p = buffer + 768;
if (is_within(buffer, (sizeof buffer) / 2, p)) {
/* bug */
} else {
/* correct */
}
Obviously, if the is_within()
function was declared static
(or static inline
), the compiler could examine all call chains that end up in is_within()
, and produce correct code.
However, when is_within()
is in a separate compilation unit compared to its callers, the compiler can no longer make such assumptions: it simply does not, and cannot know, the object boundaries beforehand. Instead, the only way it can be implemented by a standards-compliant C compiler, is to rely on the addresses the pointers refer to, blindly; something like
int is_within(const char *arr, const size_t len, const char *ptr)
{
const uintptr_t start = POINTER_TO_UINTPTR(arr);
const uintptr_t limit = POINTER_TO_UINTPTR(arr + len);
const uintptr_t thing = POINTER_TO_UINTPTR(ptr);
return (thing >= start) && (thing < limit);
}
where the POINTER_TO_UINTPTR()
would be a compiler-internal macro or function, that converts the pointer losslessly to an unsigned integer value (with the intent that there would be a corresponding UINTPTR_TO_POINTER()
that could recover the exact same pointer from the unsigned integer value), without consideration for any optimizations or rules allowed by the C standard.
So, if we assume that the code is compiled in a separate compilation unit to its users, the compiler is forced to generate code that provides more quarantees than a simple reading of the C standard would indicate.
In particular, if arr
and ptr
are in the same address space, the C compiler must generate code that compares the addresses the pointers point to, even if the C standard says that comparison of unrelated pointers yields undefined results; simply because it is at least theoretically possible for an array of objects to occupy any subregion of the address space. The compiler just cannot make assumptions that break conforming C code later on.
In the GNU obstack implementation, the obstacks all exist in the same address space (because of how they are obtained from the OS/kernel). The code assumes that the pointers supplied to it refer to these objects. Although the code does return an error if it detects that a pointer is invalid, it does not guarantee it always detects invalid pointers; thus, we can ignore the invalid pointer cases, and simply assume that because all obstacks are from the same address space, so are all the user-supplied pointers.
There are many architectures with multiple address spaces. x86 with a segmented memory model is one of these. Many microcontrollers have Harvard architecture, with separate address spaces for code and data. Some microcontrollers have a separate address space (different machine instructions) for accessing RAM and flash memory (but capable of executing from both), and so on.
It is even possible for there to be an architecture where each pointer has not only its memory address, but some kind of unique object ID associated with it. This is nothing special; it just means that on such an architecture, each object has their own address space.