
(related to my previous question)

In Qt, the QMap documentation says:

The key type of a QMap must provide operator<() specifying a total order.

However, in qmap.h, they seem to use something similar to std::less to compare pointers:

/*
    QMap uses qMapLessThanKey() to compare keys. The default
    implementation uses operator<(). For pointer types,
    qMapLessThanKey() casts the pointers to integers before it
    compares them, because operator<() is undefined on pointers
    that come from different memory blocks. (In practice, this
    is only a problem when running a program such as
    BoundsChecker.)
*/

template <class Key> inline bool qMapLessThanKey(const Key &key1, const Key &key2)
{
    return key1 < key2;
}

template <class Ptr> inline bool qMapLessThanKey(const Ptr *key1, const Ptr *key2)
{
    Q_STATIC_ASSERT(sizeof(quintptr) == sizeof(const Ptr *));
    return quintptr(key1) < quintptr(key2);
}

They just cast the pointers to quintptr (the Qt version of uintptr_t, that is, an unsigned integer type capable of storing a pointer) and compare the results.

The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to a pointer to void, and the result will compare equal to the original pointer: uintptr_t

Do you think this implementation of qMapLessThanKey() on pointers is ok?

Of course, there is a total order on integral types. But I think this is not sufficient to conclude that this operation defines a total order on pointers.

I think that it is true only if p1 == p2 implies quintptr(p1) == quintptr(p2), which, AFAIK, is not specified.

As a counterexample of this condition, imagine a target using 40 bits for pointers; it could convert pointers to quintptr, setting the 40 lowest bits to the pointer address and leaving the 24 highest bits unchanged (random). This is sufficient to respect the convertibility between quintptr and pointers, but this does not define a total order for pointers.
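A minimal sketch of that hypothetical conversion (the names to_integer() and to_pointer() are made up, the 40-bit target is imaginary, and the "random" high bits are simulated with rand()), showing how it could satisfy the round-trip requirement while still breaking the ordering:

#include <cstdint>
#include <cstdlib>

// Imaginary target: addresses are 40 bits wide, the integer type is 64 bits.
// The low 40 bits carry the address; the high 24 bits are left unspecified.
std::uint64_t to_integer(const void *p)
{
    const std::uint64_t mask = (std::uint64_t(1) << 40) - 1;
    const std::uint64_t addr = reinterpret_cast<std::uint64_t>(p) & mask;
    const std::uint64_t junk = (std::uint64_t(std::rand()) & 0xFFFFFF) << 40;
    return junk | addr; // round-trippable, but not reproducible
}

const void *to_pointer(std::uint64_t v)
{
    // Only the low 40 bits matter, so converting back yields the
    // original pointer, as the standard requires...
    return reinterpret_cast<const void *>(v & ((std::uint64_t(1) << 40) - 1));
}

// ...yet p1 == p2 no longer implies to_integer(p1) == to_integer(p2), so
// to_integer(p1) < to_integer(p2) is not even a strict weak ordering.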

What do you think?

rom1v
  • Nice question. I think you've answered it yourself, though: The conversion from pointer to integer could produce a different value every time (imagine `to_int(void * p) { return to_int32(p) + rand() << 40; }`). – Kerrek SB Jun 03 '15 at 19:20
  • That's ok in theory, but does anyone know of such a platform? – marom Jun 03 '15 at 19:29
  • Speaking about your example: if a pointer is 40 bits but an unsigned long is 64 bits, then the assert will trigger. – NathanOliver Jun 03 '15 at 19:33
  • @NathanOliver Good point :) – rom1v Jun 03 '15 at 19:39
  • @marom compilers on x86 for real and protected mode used [segmented memory model](https://en.wikipedia.org/wiki/X86_memory_segmentation). In this context you have exactly the restrictions of the standard. – Christophe Jun 03 '15 at 19:47
  • @Christophe Like the far/near attributes of Windows 3.11 age? – marom Jun 03 '15 at 19:49
  • @marom yes, and msdos before... – Christophe Jun 03 '15 at 20:25
  • @Christophe But there aren't any (standard) C++ compilers left for 16 bit x86, are there? Nor indeed any OS. – Alan Stokes Jun 03 '15 at 20:43
  • @KerrekSB Conversion of a pointer to a large enough integer is defined in the sense that casting that integer back to a pointer gives the same pointer as the one you started with. You can't randomize that unless the compiler/runtime (in this case Hell++) keeps track of the values generated... – rubenvb Jun 04 '15 at 21:25
  • @rubenvb: The assumption here is that the pointer is narrower than the resulting integer type, so the conversion back reproduces the original pointer. – Kerrek SB Jun 04 '15 at 21:43
  • Also, for the very same reason, I would say that `qHash(const T*, uint)` is not guaranteed to work: `return qHash(reinterpret_cast<quintptr>(key), seed);`. – rom1v Jun 06 '15 at 12:21

2 Answers


The Standard guarantees that converting a pointer to an uintptr_t will yield a value of some unsigned type which, if cast to the original pointer type, will yield the original pointer. It also mandates that any pointer can be decomposed into a sequence of unsigned char values, and that using such a sequence of unsigned char values to construct a pointer will yield the original. Neither guarantee, however, would forbid an implementation from including padding bits within pointer types, nor would either guarantee require that the padding bits behave in any consistent fashion.

If code avoided storing pointers, and instead cast to uintptr_t every pointer returned from malloc, later casting those values back to pointers as required, then the resulting uintptr_t values would form a ranking. The ranking might not have any relationship to the order in which objects were created, nor to their arrangement in memory, but it would be a ranking. If any pointer gets converted to uintptr_t more than once, however, the resulting values might rank entirely independently.
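A sketch of that discipline, assuming only the round-trip guarantee (Handle and make_handle() are made-up names): convert each pointer exactly once, at allocation, and rank by the cached integer.

#include <cstdint>
#include <cstdlib>

// Each allocation is converted to uintptr_t exactly once; ordering uses
// only the cached value, so the ranking is arbitrary but consistent.
struct Handle
{
    std::uintptr_t key; // the one-and-only conversion, never recomputed

    void *ptr() const { return reinterpret_cast<void *>(key); }
};

Handle make_handle(std::size_t n)
{
    return Handle{reinterpret_cast<std::uintptr_t>(std::malloc(n))};
}

bool handle_less(const Handle &a, const Handle &b)
{
    return a.key < b.key; // total order on the cached integers
}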

supercat

I think that you can't assume that there is a total order on pointers. The guarantees given by the standard for pointer-to-integer conversions are rather limited:

5.2.10/4: A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined.

5.2.10/5: A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (...) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined.

From a practical point of view, most mainstream compilers will convert a pointer to an integer in a bitwise manner, and you'll have a total order.
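As an illustration, here is the cast-based comparison next to std::less, whose specializations for pointer types the standard does guarantee to yield a total order even where the built-in operator< does not (the function names are made up):

#include <cstdint>
#include <functional>

// On a flat-address-space platform the conversion is bitwise, so this...
bool less_by_cast(const int *a, const int *b)
{
    return reinterpret_cast<std::uintptr_t>(a)
         < reinterpret_cast<std::uintptr_t>(b);
}

// ...behaves like std::less, which is required to be a total order for
// pointer types, whatever the built-in < does.
bool less_portably(const int *a, const int *b)
{
    return std::less<const int *>()(a, b);
}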

The theoretical problem:

But this is not guaranteed. It might not work on past platforms (x86 real and protected mode), on exotic platforms (embedded systems?), and, who knows, on some future platforms.

Take the example of the segmented memory of the 8086: the real address is given by the combination of a segment (e.g., the DS register for the data segment, SS for the stack segment, ...) and an offset:

Segment:   XXXX YYYY YYYY YYYY 0000    16 bits shifted by 4 bits
Offset:    0000 ZZZZ ZZZZ ZZZZ ZZZZ    16 bits not shifted
           ------------------------
Address:   AAAA AAAA AAAA AAAA AAAA    20 bits address                      

Now imagine that the compiler converts the pointer to an integer by simply doing the address math and putting the 20 bits in the integer: you're safe and have a total order.

But another equally valid approach would be to store the segment in the 16 upper bits and the offset in the 16 lower bits. In fact, this approach would significantly facilitate and accelerate loading pointer values into CPU registers.

This approach is compliant with standard C++ requirements, but a single address could then be represented by many different pointers (up to 4096 segment:offset combinations): your total order is lost!
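A small numeric illustration, simulating the two encodings with plain integers (linear() and packed() are made-up helpers, not real far pointers):

#include <cstdint>
#include <iostream>

// Linear 20-bit address: segment * 16 + offset.
std::uint32_t linear(std::uint16_t seg, std::uint16_t off)
{
    return (std::uint32_t(seg) << 4) + off;
}

// Packed encoding: segment in the upper 16 bits, offset in the lower 16.
std::uint32_t packed(std::uint16_t seg, std::uint16_t off)
{
    return (std::uint32_t(seg) << 16) | off;
}

int main()
{
    // 0x1234:0x0005 and 0x1230:0x0045 denote the same physical address...
    std::cout << std::hex << linear(0x1234, 0x0005) << ' '
              << linear(0x1230, 0x0045) << '\n'; // 12345 12345
    // ...but their packed representations differ, so they compare unequal.
    std::cout << std::hex << packed(0x1234, 0x0005) << ' '
              << packed(0x1230, 0x0045) << '\n'; // 12340005 12300045
}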

**Are there alternatives for the order?**

One could imagine using pointer arithmetic. There are strong constraints on pointer arithmetic for elements of the same array:

5.7/6: When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements.

And subscripts are ordered.

An array can have at most size_t elements. So, naively, if sizeof(pointer) <= sizeof(size_t), one could assume that taking an arbitrary reference pointer and doing some pointer arithmetic would lead to a total order.

Unfortunately, here also, the standard is very prudent:

5.7/7: For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T is different from the cv-unqualified array element type, the behavior is undefined.

So pointer arithmetic won't do the trick for arbitrary pointers either. Again, going back to the segmented memory models helps to understand: arrays could be at most 65535 bytes so as to fit completely in one segment, but different arrays could use different segments, so pointer arithmetic wouldn't be reliable for a total order either.
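To make the restriction concrete:

#include <cstddef>

int arr[10];
int x, y;

// Well-defined: both pointers point into the same array object,
// and the result reflects the ordering of the subscripts.
std::ptrdiff_t ok = &arr[7] - &arr[2]; // 5

// Undefined behavior: x and y are not elements of the same array,
// so their difference cannot be used to rank arbitrary pointers.
// std::ptrdiff_t bad = &x - &y;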

Conclusion

There's a subtle note in the standard on the mapping between pointers and integral values:

It is intended to be unsurprising to those who know the addressing structure of the underlying machine.

This means that it must be possible to determine a total order. But keep in mind that it'll be non-portable.

Christophe