Sorting an array of pointers

Question

Am I right in thinking it is fine to treat a pointer as an int for the purposes of sorting an array of pointers, e.g.

qsort(ptrs, n, sizeof(void*), int_cmp);

I want to sort the ptrs to ascertain whether there are any duplicates, irrespective of the type of thing the pointer is pointing to, so the qsort is a precursor to doing that.

my int_cmp() is pretty standard, e.g.

int int_cmp(const void *a, const void *b)
{
    const int *ia = (const int *)a; // casting pointer types
    const int *ib = (const int *)b;

    /* integer comparison: returns negative if b > a
    and positive if a > b */
    return *ia  - *ib;
}

it seems to work in my unit-tests but is there some reason why considering a ptr as an int may cause problems for such a scenario that i may have overlooked?

No, you can't treat a pointer as an `int`. The size of a pointer may not be the same as the size of an `int`. Also remember that there are many types of data that can't be directly compared, or used in arithmetic expressions (like for example structures). — Some programmer dude, Apr 04 '14 at 12:55
ok so i would just have to write a ptr_cmp() function that cast to a ptr, couldn't be a void ptr as i have to dereference it, so maybe use a char ptr? I want this to be generic so i can sort any array of ptrs irrespective of what they are pointing to.. — bph, Apr 04 '14 at 12:58
The only integer variables that are guaranteed to be able to hold a pointer are `intptr_t` and `uintptr_t`. — Sergey L., Apr 04 '14 at 12:58
That "pretty standard" way of generating a comparison return value by subtraction is also "pretty broken"; it fails to take overflow into account. Don't do it, even for integers. — unwind, Apr 04 '14 at 13:02
@unwind is there a better one somewhere i could take a look at? — bph, Apr 04 '14 at 13:03
shold be `const int *ia = *(const int **)a; const int *ib = *(const int **)b;` — BLUEPIXY, Apr 04 '14 at 13:05
@BLUEPIXY not sure I follow that - is it possible to provide an explanation? i think all i want to do is cast a void ptr to a const int ptr in this instance which i think (const int *)a achieves.. — bph, Apr 04 '14 at 13:15
So, What do you do when an array of `int`, rather than an array of `int *`? — BLUEPIXY, Apr 04 '14 at 13:25
@Hiett That's pretty neat! I think `?:` as in my answer is somewhat easier to understand, but that's probably just because I'm used to it. — unwind, Apr 04 '14 at 13:46
@BLUEPIXY my example cmp function is for an array of ints and note that the args are dereferenced in the comparison statement — bph, Apr 04 '14 at 15:48
_my example cmp function is for an array of ints_ : _array of pointers, e.g. qsort(ptrs, n, sizeof(void*), int_cmp);_ used `int_cmp` for array of pointers. — BLUEPIXY, Apr 04 '14 at 15:52
@BLUEPIXY thanks for the clarification - i think i have it now.. — bph, Apr 04 '14 at 15:56
would it be possible to use a preprocessor macro for the cmp function, say #define CMP(a, b) ((a > b) - (a < b)) — bph, Apr 04 '14 at 16:16

unwind · Accepted Answer · 2014-04-04T14:00:54.813

No, you're not right at all, unless you want to sort the pointers by address. The actual address rarely has any meaning though, so that's very unlikely.

For detecting duplicate pointers, you should just compare the pointers as such, that's well-defined.

I would probably go with a solution using uintptr_t:

static int order_pointers(const void *pa, const void *pb)
{
  const uintptr_t a = *(void **) pa, b = *(void **) pb;

  return a < b ? -1 : a > b;
}

Haven't tested this, but something like that should work.

The conversion to uintptr_t is necessary since you cannot validly compare random pointers. I quoth the C99 draft standard, §6.5.8.5:

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.

I bolded the final sentence since that's what applies here.

yes thats exactly what i want to do - sort by *address* in order to then assess whether i have duplicate ptrs in the array, e.g. 2 addresses that are the same — bph, Apr 04 '14 at 13:01
Why the conversion to an integer? See *TripeHound*'s answer. — alk, Apr 04 '14 at 13:55
i'm glad i asked the original question - it isn't that simple after all.. i should maybe have posted the larger question, 'how to remove duplicate ptrs, i.e. ptrs that point to the same underlying objects, from an array of ptrs in a generic way independent of the underlying objects.. — bph, Apr 04 '14 at 14:14
@unwind as an aside, is your default approach to make functions static? i would have thought you would want to use a cmp function like that potentially across many files? — bph, Apr 04 '14 at 15:52
@Hiett I guess it's a bit of a reflex, yes. And it's not always super-easy to re-use `qsort()` callbacks, since they are always pointers to the bare data in the array you're sorting. I would probably encapsulate it and keep it `static`. — unwind, Apr 05 '14 at 15:08

score 2 · Answer 2 · answered Apr 04 '14 at 13:22

2

Providing you use comparison, not subtraction, you can stick with void pointers: the following worked for a simple test:

int int_cmp( const void *pa, const void *pb )
{
    const void* a = *(void**)pa ;
    const void* b = *(void**)pb ;

    if( a < b ) return -1 ;
    if( a > b ) return  1 ;
    return 0 ;
}

answered Apr 04 '14 at 13:22

TripeHound

2,721
23
37

-1, this is undefined behavior. Not to toot my own horn too much, but ... see my answer. – unwind Apr 04 '14 at 14:04
1

@alk Yes, unless you know the pointers are pointing at (or just beyond) the same "object", see the standard quote in my answer. – unwind Apr 04 '14 at 14:22
@unwind: I don't study (or read) the standards, so am happy to bow to deeper knowledge, but if comparing two pointers that might point at anything is undefined, how can be converting them to integers be more well defined? – TripeHound Apr 04 '14 at 14:31
@TripeHound C doesn't guarantee that a pointer is encoded as just a plain integer. There might be segmented memory at play for instance. For such architectures, the compiler has to insert cleverness when doing the conversion to integer to "flatten" the space so that distinct pointers become distinct integers. And there are no limits on which integers can be compared, as far as I know. – unwind Apr 04 '14 at 14:33
And in [this_answer]{http://stackoverflow.com/questions/1845482/what-is-uintptr-t-data-type} they quote: **In C99, it is defined as "an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".** so does that mean comparing intptr_t is also undefined (all you can do is convert them back to void pointers?) – TripeHound Apr 04 '14 at 14:34
@unwind so the"flattening" could, in theory (if someone was being perverse), switch the segment and offset parts (providing of course it did the reverse when casting back to void*): while you can compare the _numbers_ the result _could_ be meaningless. For detecting equality (as OP wanted) you might be safe, but even then I could _imagine_ that the ability for two different `seg:ofs` pairs to point to the same address could mean that the `intptr_t` castings of each wouldn't match. – TripeHound Apr 04 '14 at 14:45
@TripeHound No, `uintptr_t` is defined as an integer type, as you say, and thus it's an arithmetic type. There are no constraints on using relational operators with arithmetic types, that I can see. – unwind Apr 04 '14 at 14:55
@unwind What I'm saying is that the "flattening" process from a pointer to the `uintptr_t` _could_ mean that the comparison of arithmetic types you are always allowed to make could be meaningless in terms of whether either of the original pointers was "larger" than the other. If two `uintptr_t`s compare equal, you can be certain the pointers they came from were equal, but you can't guarantee to say anything about whether they came from a "higher" or "lower" memory address. – TripeHound Apr 04 '14 at 15:06
@TripeHound Ah, yes. That's probably true. Still, since the mapping must be 1:1, I don't think it makes the answer any less good. Distinct pointers will be distinct, and they will be ordered. The order of the `uintptr_t`:s perhaps doesn't match the order of the addresses, but that's fine AFAIK. – unwind Apr 04 '14 at 15:10

Sorting an array of pointers

2 Answers2

Linked