10

Comparing pointers with a relational operator (e.g. <, <=, >= or >) is only defined by the C standard when both pointers point into the same aggregate object (struct, array or union). In practice this means that a comparison of the form

if (start_object <= my_pointer && my_pointer < end_object+1) {

can be turned into

if (1) {

by an optimising compiler. Despite this, in K&R, section 8.7, "Example—A Storage Allocator", the authors make comparisons similar to the one above. They excuse this by saying:

There is still one assumption, however, that pointers to different blocks returned by sbrk can be meaningfully compared. This is not guaranteed by the standard, which permits pointer comparisons only within an array. Thus this version of malloc is portable only among machines for which general pointer comparison is meaningful.

Furthermore, it appears the implementation of malloc used in glibc does the same thing!

What's worse (and the reason I stumbled across this to begin with) is that for a school assignment I'm supposed to implement a rudimentary malloc-like function, and the instructions require us to use the K&R code, but with the sbrk call replaced by a call to mmap!

While comparing pointers from different sbrk calls is probably undefined, it is also only slightly dubious, since you have some mental intuition that the returned pointers should come from roughly the same region of memory. Pointers returned by different mmap calls have, as far as I understand, no guarantee of being even remotely near each other, and consolidating/merging memory blocks across mmap calls should be highly illegal (glibc appears to avoid this, merging only memory returned by sbrk or within individual mmap pages, never across them), yet the assignment requires exactly that.
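For context, here is a minimal sketch of what the assignment seems to ask for. The helper name morecore_mmap is mine (it is not in K&R), and MAP_ANONYMOUS may need a feature-test macro such as _DEFAULT_SOURCE on glibc:

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>

    /* Hypothetical stand-in for K&R's morecore(): get a fresh block with
       mmap instead of growing the break with sbrk. Each call may return
       memory anywhere in the address space, so blocks from different
       calls need not be adjacent or even ordered. */
    void *morecore_mmap(size_t nbytes)
    {
        void *p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return (p == MAP_FAILED) ? NULL : p;
    }

    int main(void)
    {
        void *a = morecore_mmap(4096);
        void *b = morecore_mmap(4096);
        printf("%p %p\n", a, b); /* typically nowhere near each other */
        return 0;
    }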

Question: could someone shine some light on

  1. Whether or not comparing pointers from different calls to sbrk may be optimised away, and
  2. If so, what glibc does that lets them get away with it.
kqr
  • Note that glibc *does* call `mmap()` rather than `sbrk` for requests larger than `mmap_threshold`. – abligh Nov 25 '16 at 17:18
  • 2
    I've taken the liberty of adding the `language-lawyer` tag, as this concerns (in my view) a rather subtle point about pointer comparisons in C. – abligh Nov 25 '16 at 17:39
  • For what it's worth, the situation with equality operator comparisons appears to be even weirder: http://stackoverflow.com/questions/40821585/equality-comparison-of-pointers-to-different-objects – abligh Nov 26 '16 at 18:40

4 Answers

6

The language-lawyer answer is (I believe) to be found in §6.5.8.5 of the C99 standard (or more precisely in ISO/IEC 9899:TC3 Committee Draft — September 7, 2007 WG14/N1256, which is nearly identical, but I don't have the original to hand), which says the following with regard to relational operators (i.e. <, <=, >, >=):

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.

(the C11 text is identical or near identical)
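To illustrate which comparisons the quoted paragraph blesses, here is a small sketch of my own (not from the standard itself):

    int main(void)
    {
        int arr[10];
        struct { int a, b; } obj;
        int x, y;

        int d1 = &arr[2] < &arr[7]; /* same array: defined, yields 1 */
        int d2 = arr + 10 > arr;    /* one past the end: defined, yields 1 */
        int d3 = &obj.a < &obj.b;   /* members of one struct: defined, yields 1 */

        /* By contrast, &x < &y would compare pointers to unrelated
           objects: "In all other cases, the behavior is undefined." */
        (void)x; (void)y;

        return d1 + d2 + d3; /* 3 */
    }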

The quoted paragraph at first seems unhelpful, or at least suggests that the implementations each exploit undefined behaviour. I think, however, you can either rationalise the behaviour or use a workaround.

The C pointers concerned are either going to be NULL, or derived from taking the address of an object with &, or from pointer arithmetic, or from the result of some function. In the case at hand, they are derived from the results of the sbrk or mmap system calls. What do these system calls really return? At a register level, they return an integer of the same size as uintptr_t (or intptr_t). It is in fact the system call interface which casts them to a pointer. Since casts between pointers and uintptr_t (or intptr_t) are, by definition of those types, bijective, we could cast the pointers to uintptr_t (for instance) and compare those instead, which imposes a well-order relation on the pointers. This in essence ensures that every comparison is well defined, and gives other useful properties such as a < b and b < c implying a < c. (I also can't choose an entirely arbitrary order, as it would need to satisfy the other requirements of C99 §6.5.8.5, which pretty much leaves intptr_t and uintptr_t as candidates.)

We can exploit this and write the (arguably better):

if ((uintptr_t)start_object <= (uintptr_t)my_pointer && (uintptr_t)my_pointer < (uintptr_t)(end_object+1)) {

There is a nit here. You'll note I cast to uintptr_t and not intptr_t. Why was that the right choice? In fact, why did I not choose some rather bizarre ordering, such as reversing the bits before comparing? The assumption here is that I'm choosing the same ordering as the kernel; specifically, that my definition of < (given by the ordering) is such that the start and end of any allocated memory block always satisfy start < end. On all modern platforms I know of, there is no 'wrap around' (e.g. the kernel will not allocate 32-bit memory starting at 0xffff8000 and ending at 0x00007fff), though note that similar wrap-around has been exploited in the past.
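To make that concrete, here is a self-contained sketch of the range check as a helper; the function name ptr_in_block is mine, not part of any standard API:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Sketch: test whether p lies in [start, start + size) by comparing
       integer representations rather than pointers, sidestepping the
       undefined relational comparison of unrelated pointers. Assumes the
       kernel's address ordering matches uintptr_t ordering and that the
       block does not wrap around the top of the address space. */
    bool ptr_in_block(const void *start, size_t size, const void *p)
    {
        uintptr_t s = (uintptr_t)start;
        uintptr_t q = (uintptr_t)p;
        return s <= q && q < s + size;
    }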

The C standard specifies that pointer comparisons give undefined results under many circumstances. However, here you are building your own pointers out of integers returned by system calls. You can therefore either compare the integers, or compare the pointers by casting them back to integers (exploiting the bijective nature of the cast). If you merely compare the pointers, you rely on the C compiler's implementation of pointer comparison being sane, which it almost certainly is, but which is not guaranteed.

Are the possibilities I mention so obscure that they can be discounted? Nope; let's find a platform where they might matter: the 8086. It's possible to imagine an 8086 compilation model where every pointer is a 'far' pointer (i.e. contains a segment register value). Pointer comparison could do a < or > on the segment part and, only if the segments are equal, a < or > on the offset. This would be entirely legitimate so long as all the structures in C99 §6.5.8.5 are in the same segment. But it won't work as one might expect between segments, as 1000:1234 (which refers to the same memory address as 1010:1134) will appear smaller than 1010:0123. mmap here might well return results in different segments. Similarly, one could think of another memory model where the segment register is actually a selector, and pointer comparison uses a processor instruction that compares memory addresses and aborts if an invalid selector or an offset outside a segment is used.
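A quick arithmetic sketch of that 8086 example (in real mode, the linear address of segment:offset is segment * 16 + offset):

    #include <stdint.h>
    #include <stdio.h>

    /* Linear (physical) address of an 8086 real-mode segment:offset pair. */
    uint32_t linear(uint16_t seg, uint16_t off)
    {
        return (uint32_t)seg * 16u + off;
    }

    int main(void)
    {
        /* 1000:1234 and 1010:1134 name the same byte (0x11234)... */
        printf("%#x %#x\n", (unsigned)linear(0x1000, 0x1234),
                            (unsigned)linear(0x1010, 0x1134));
        /* ...yet a segment-then-offset comparison ranks 1000:1234 below
           1010:0123, whose linear address (0x10223) is actually lower. */
        printf("%#x\n", (unsigned)linear(0x1010, 0x0123));
        return 0;
    }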

You ask two specific questions:

  1. Whether or not comparing pointers from different calls to sbrk may be optimised away, and

  2. If so, what glibc does that lets them get away with it.

In the formulation given above, where start_object etc. are actually void *, the comparison may well do what you want, but it is not guaranteed to (and may be optimised away), as the behaviour is undefined. A cast to uintptr_t would guarantee that it does what you want, provided the kernel uses the same well ordering as implied by the cast.

In answer to the second question, glibc is relying on a behaviour of the C compiler which is technically not required, but is very likely (per the above).

Note also (at least in the K&R in front of me) that the line you quote doesn't exist in the code. The caveat is in relation to the comparison, with <, of header * pointers which may derive from separate sbrk() calls (as far as I can see, comparison of void * pointers with < is always UB).

abligh
  • *If you merely compare the pointers, you rely on the C compiler's implementation of pointer comparison being sane, which it almost certainly is, but is not guaranteed.* I don't follow, why would comparing two `char*` not result in a well-defined comparison between two memory addresses? That is not compiler dependent, it's a basic property of the C language framework. – Byte Lab Nov 25 '16 at 17:43
  • @DIMMSum not as far as the standard is concerned. Per the standard, if you get two `char *` pointers derived other than per the quoted paragraph (perhaps they are the returns from two `malloc()` calls or they are addresses of separate variables), the compiler is entitled to produce a comparison result which is (e.g.) the least significant bit of the number of seconds past the hour, or to cause a `SEGV`. The result is undefined. Of course *most* compilers (perhaps every one you can find) operate as if the pointers are integers. – abligh Nov 25 '16 at 17:47
  • *If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values.* The region of memory allocated by `sbrk` can be considered an array of contiguous bytes. Therefore, `char *`'s at various points of that region are pointing to "structure members" inside of an aggregate object, so comparison between pointers is well defined. – Byte Lab Nov 25 '16 at 17:51
  • 2
    @DIMMSum firstly memory allocated with `sbrk()` is not an `array of contiguous bytes` as per the C standard - it's simply an integer cast to a pointer; but comparing those pointers with `<` is UB because from C's point of view, it's not an array. Secondly, that doesn't address the issue with `mmap`, where addresses returned from two `mmap` calls are certainly not necessarily contiguous. Per the C standard, you have no way of knowing that comparing such pointers with `<` will not cause UB. Of course in any sane compiler, they won't. – abligh Nov 25 '16 at 17:57
  • *firstly memory allocated with sbrk() is not an array of contiguous bytes as per the C standard - it's simply an integer cast to a pointer; but comparing those pointers with < is UB because from C's point of view, it's not an array.* ...yes it is. It is a contiguous sequence of bytes in memory. That is the definition of an array. I agree that comparing addresses returned from `mmap` is undefined, but the same is not true of `sbrk`. – Byte Lab Nov 25 '16 at 18:00
  • 1
    @DIMMSum: I don't think so per the standard. I think you may be confusing arrays and pointers. The C compiler can't know at compile time that the returns from `sbrk` if interpreted as pointers point to something that behaves like an array (let alone is one); therefore I don't believe it's entitled to assume it's an 'aggregate object'. I added the `language-lawyer` tag as I think this point is quite subtle. – abligh Nov 25 '16 at 18:03
  • Wraparound of an object from a high unsigned address to a low unsigned address passes through the zero (`NULL`) address. The kilobytes or megabytes starting at 0 are usually intentionally unmapped in order to trigger SIGSEGVs on accesses there, so no allocator can return memory spanning that no-man's-land. – Iwillnotexist Idonotexist Nov 25 '16 at 18:31
  • 2
    @IwillnotexistIdonotexist sure, but the C standard does not guarantee `NULL` (or pointer `0`) is translated to integer 0. It could be (e.g.) `MAXINT`. See http://stackoverflow.com/questions/9894013/is-null-always-zero-in-c (accepted answer). – abligh Nov 25 '16 at 18:35
  • I'm well aware of that, but the practicality of having a `NULL` pointer be represented by all-zero bits is so immense, and the disadvantages of not doing that are so glaring, that system implementors almost invariably choose to do so, much like they choose binary over ternary. Doing otherwise would obsolete `.bss` sections, for instance, since you could no longer rely on COWed and/or `memset()`'ed zero pages to back statically-initialized-to-zero objects. – Iwillnotexist Idonotexist Nov 25 '16 at 18:48
  • 2
    @IwillnotexistIdonotexist absolutely, and similarly with having universally comparable pointer types. However, the question starts by referencing the C standard (which does not mandate either behaviour) then quotes a K&R proviso specifically handling the outlier case. I believe the C standard *does* mandate binary over ternary though :-) – abligh Nov 25 '16 at 19:02
  • @abligh Do I interpret this answer as "any two pointers on any two systems can be compared, although perhaps not meaningfully, by first casting them to `uintptr_t`"? Thus this being a general way of sidestepping the standard's lack of definition for comparing pointers across different objects? – kqr Nov 26 '16 at 09:11
  • 1
    @kqr yes, casting to `uintptr_t` is guaranteed to be defined on any system that supports `uintptr_t`, will impose a well order relation (which will give you comparability), and will ensure comparison is not UB. Whether it's the correct well order relation is up for debate (the only other likely candidate being `intptr_t`), but I would bet it is correct. Note `uintptr_t` is technically optional in C99 on non-XSI compatible systems: http://pubs.opengroup.org/onlinepubs/000095399/basedefs/stdint.h.html (so systems where pointers are incomparable could still exist, but you can detect them) – abligh Nov 26 '16 at 09:20
  • 1
    @kqr also note that showing sufficient awareness to use `uintptr_t` with comparisons, and referring to a C standard, is all that (or more than) would reasonably be required in any commercial or academic setting. If you find sufficiently bizarre a machine it doesn't work, you'd get an error at compile time and go look at the line in question, rather than get UB. – abligh Nov 26 '16 at 09:21
  • 1
    @abligh Thank you! That's just what I hoped for. Your answer has been above and beyond what I expected to get. Huge thanks! You're Stack Overflow at its best. – kqr Nov 26 '16 at 10:09
  • @abligh: There are systems where the ordering imposed by casts through `uintptr_t` would be grossly incompatible with the ordering of pointers [e.g. hardware pointers hold word addresses, a `char*` combines a hardware pointer with something to identify the byte within a word, and the latter bits are placed higher in a `uintptr_t` than the former]. What should have happened would have been for the Standard to specify predefined macros via which implementations could either make, or decline to make, promises about the behavior of comparisons of unrelated objects. – supercat Nov 29 '16 at 15:46
  • @abligh: Unfortunately, that might have been interpreted as suggesting that implementations that couldn't make useful promises should be regarded as those which can, and the authors of the Standard weren't willing to say anything bad about unusual systems (Annex F implies that some systems' floating point is worse than others, but non-conformance to that wasn't seen as "weird"). – supercat Nov 29 '16 at 15:47
3

The answer is simple enough. The C library implementation is written with some knowledge of (or perhaps expectations for) how the C compiler will handle certain code that has undefined behaviour according to the official specification.

There are many examples I could give, but the fact that pointers actually refer to an address in the process's address space, and can be compared freely, is relied on by the C library implementation (at least by glibc) and also by many "real world" programs. While it is not guaranteed by the standard for strictly conforming programs, it is true for the vast majority of real-world architectures/compilers. Also, note footnote 67, regarding conversion of pointers to integers and back:

The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.

While this doesn't strictly give license to compare arbitrary pointers, it helps to understand how the rules are supposed to work: as a set of specific behaviours that are certain to be consistent across all platforms, rather than as a limit to what is permissible when the representation of a pointer is fully known and understood.
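A minimal sketch of the round trip that footnote 67 is talking about, and that the standard does bless (C99 7.18.1.4 guarantees a void pointer converted to uintptr_t and back compares equal to the original):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int x = 42;
        void *p = &x;

        uintptr_t n = (uintptr_t)p; /* pointer to integer... */
        void *q = (void *)n;        /* ...and back again */

        printf("%d\n", p == q); /* prints 1 */
        return 0;
    }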

You've remarked that:

if (start_object <= my_pointer && my_pointer < end_object+1) {

Can be turned into:

if (1) {

With the assumption (which you didn't state) that my_pointer is derived in some way from the value of start_object or the address of the object which it delimits, this is strictly true; but it's not an optimisation that compilers make in practice, except in the case of objects with static or automatic storage duration (i.e. objects which the compiler knows weren't dynamically allocated). An example of such a case is sketched below.
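For instance, here is a hypothetical sketch (mine, not from the answer) of the static-storage case, where the compiler can see the whole object and is entitled to fold the range check to a constant:

    #include <stdio.h>

    static char arr[64];

    int main(void)
    {
        char *p = arr + 10; /* provably derived from arr */

        /* Both operands point into the same array object, so the
           comparison is well defined; a compiler that tracks the
           derivation may fold the whole test to 1. */
        if (arr <= p && p < arr + 64) {
            puts("in range");
        }
        return 0;
    }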

davmac
2

Consider the fact that calls to sbrk are defined to increment or decrement the number of bytes allocated in some region (the heap) for the process, by the given incr parameter, relative to the current break address. sbrk is really just a wrapper around brk, which lets you adjust the current top of the heap. When you call brk(addr), you're telling the kernel to allocate space for your process all the way up to addr (or possibly to free the space between the previous, higher top of the heap and the new, lower address). sbrk(incr) is exactly equivalent when incr == new_top - original_top. Thus, to answer your questions:

  1. Because sbrk just adjusts the size of the heap (i.e. some contiguous region of memory) by incr bytes, comparing values returned by sbrk is just a comparison of points within some contiguous region of memory. That is exactly equivalent to comparing points in an array, and so it is a well-defined operation according to the C standard. Therefore, pointer comparisons around sbrk calls cannot legitimately be optimised away.

  2. glibc doesn't do anything special to "get away with it": it just assumes that the above holds true (which it does). In fact, when glibc checks the state of a chunk of memory that was allocated with mmap, it explicitly verifies that the mmap'd memory is outside the range allocated with sbrk.

Edit: Something I want to make clearer about my answer: the key here is that there is no undefined behavior! sbrk is defined to allocate bytes in some contiguous region of memory, which is itself an 'object' as specified by the C standard. Therefore, comparison of pointers within that 'object' is a completely sane and well-defined operation. The assumption here is not that glibc is taking advantage of undefined pointer comparison; it's that sbrk grows/shrinks memory in some contiguous region. A sketch of that contiguity is below.
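Here is a minimal sketch of the contiguity this answer relies on (on glibc you may need a feature-test macro such as _DEFAULT_SOURCE for sbrk to be declared, and note that POSIX has deprecated sbrk):

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Current break, then grow the heap by one page. Where sbrk is
           supported, the two break values delimit one contiguous block. */
        void *before = sbrk(0);
        if (sbrk(4096) == (void *)-1) {
            perror("sbrk");
            return 1;
        }
        void *after = sbrk(0);

        /* The claim above: before and after bound a contiguous region,
           so ordering them is meaningful, like pointers into one array. */
        printf("grew heap by %ld bytes\n",
               (long)((char *)after - (char *)before));
        return 0;
    }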

Byte Lab
  • As far as the C standard is concerned, that assertion could very well be optimised away to "false", since no pointer is allowed to be compared to the boundaries of an object it does not belong to. – kqr Nov 26 '16 at 12:25
1

The authors of the C Standard recognized that there are some segmented-memory hardware platforms where an attempt to perform a relational comparison between objects in different segments might behave strangely. Rather than saying that such platforms could not accommodate efficient C implementations, the authors of the Standard allow such implementations to do anything they see fit if an attempt is made to compare pointers to objects that might be in different segments.

For the Standard to have said that comparisons between disjoint objects should only exhibit strange behavior on those segmented-memory systems that can't efficiently yield consistent behavior would have been seen as implying that such systems were inferior to platforms where relational comparisons between arbitrary pointers yield a consistent ranking, and the authors of the Standard went out of their way to avoid such implications. Instead, they figured that since there was no reason for implementations targeting commonplace platforms to do anything weird with such comparisons, such implementations would handle them sensibly whether the Standard mandated it or not.

Unfortunately, some people who are more interested in making a compiler that conforms to the Standard than in making one that's useful have decided that any code which isn't written to accommodate the limitations of hardware that has been obsolete for decades should be considered "broken". They claim that their "optimizations" allow programs to be more efficient than would otherwise be possible, but in many cases the "efficiency" gains are only significant in cases where a compiler omits code which is actually necessary. If a programmer works around the compiler's limitations, the resulting code may end up being less efficient than if the compiler hadn't bothered with the "optimization" in the first place.

supercat