Qsort comparison

Question

I'm converting C++ code to Go, but I have difficulties in understanding this comparison function:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>

using namespace std;

typedef struct SensorIndex
{   double value;
    int    index;
} SensorIndex;

int comp(const void *a, const void* b)
{   SensorIndex* x = (SensorIndex*)a;
    SensorIndex* y = (SensorIndex*)b;

    return abs(y->value) - abs(x->value);
}

int main(int argc , char *argv[])
{

    SensorIndex *s_tmp;

    s_tmp = (SensorIndex *)malloc(sizeof(SensorIndex)*200);

    double q[200] = {8.48359,8.41851,-2.53585,1.69949,0.00358129,-3.19341,3.29215,2.68201,-0.443549,-0.140532,1.64661,-1.84908,0.643066,1.53472,2.63785,-0.754417,0.431077,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256,-0.123256};

    for( int i=0; i < 200; ++i ) {
        s_tmp[i].value = q[i];
        s_tmp[i].index = i;
    }

    qsort(s_tmp, 200, sizeof(SensorIndex), comp);

    for( int i=0; i<200; i++)
    {   
        cout << s_tmp[i].index << " " << s_tmp[i].value << endl;
    }

}

I expected that the "comp" function would allow the sorting from the highest (absolute) value to the minor, but in my environment (gcc 32 bit) the result is:

1 8.41851
0 8.48359
2 -2.53585
3 1.69949
11 -1.84908
5 -3.19341
6 3.29215
7 2.68201
10 1.64661
14 2.63785
12 0.643066
13 1.53472
4 0.00358129
9 -0.140532
8 -0.443549
15 -0.754417
16 0.431077
17 -0.123256
18 -0.123256
19 -0.123256
20 -0.123256
...

Moreover one thing that seems strange to me is that by executing the same code with online services I get different values (cpp.sh, C++98):

0 8.48359
1 8.41851
5 -3.19341
6 3.29215
2 -2.53585
7 2.68201
14 2.63785
3 1.69949
10 1.64661
11 -1.84908
13 1.53472
4 0.00358129
8 -0.443549
9 -0.140532
12 0.643066
15 -0.754417
16 0.431077
17 -0.123256
18 -0.123256
19 -0.123256
20 -0.123256
...

Any help?

@Slava Yes it is. It compiles perfectly fine on my `C++` compiler. — Galik, Aug 18 '18 at 00:44
@Galik not eveything that can be compiled by c++ compiler is c++. For example most of C code would be compiled fine — Slava, Aug 18 '18 at 00:46
What he means is that everything you're doing (besides `cout`) is C, not C++. — Major, Aug 18 '18 at 00:46
@Slava TBF there's at least `using namespace std;` and `cout` which qualifies the code at least as C++ and not plain C. — πάντα ῥεῖ, Aug 18 '18 at 00:46
@πάνταῥεῖ yes and there is `cout` but this is still not c++, `malloc`, `qsort`, `typedef struct` — Slava, Aug 18 '18 at 00:47
@Slava *"not eveything that can be compiled by c++ compiler is c++"* - of course it is. — Galik, Aug 18 '18 at 00:47
Sorry, I'm not an expert in C nor in C ++ (my skills in this regard are stopped at the university attended twenty years ago:)) ... Let's say I have this soup to transpile ... and the qsort function sounds strange to me :) — Usin, Aug 18 '18 at 00:50
@Major Language *extensions* notwithstanding, obviously. But the code conforms to strict `C++17` standards. — Galik, Aug 18 '18 at 00:51
@Slava ok, take just the comp func, qsort and the array... I need to understand the way the array is sorted. And, by the way, if I could show the whole code, we would spend the evening crying together. — Usin, Aug 18 '18 at 00:55
Back to the question at hand, you have an off by one error at least from the `<=` operator. Change that to `<`. — Major, Aug 18 '18 at 00:56
So are you just looking for a fix for your comparison function? — Galik, Aug 18 '18 at 00:57
@Slava that C++ code very ugly but still C++, every function use it's in C++ standard. — Stargateur, Aug 18 '18 at 00:58
@Major Already corrected... I also fix it in the question, thank you. — Usin, Aug 18 '18 at 01:02
@Galik No, I need to understand what the "comp" function does and why the different results. — Usin, Aug 18 '18 at 01:04
@Usin *I'm converting C++ code to Go* -- What do you do if the C++ code is fundamentally wrong? What would you be "converting"? So maybe this would be the best time to learn `std::sort` and forget the casting, while fixing the bugs? — PaulMcKenzie, Aug 18 '18 at 01:11
@PaulMcKenzie I know, and it's for this reason that I am here to understand what "comp" actually does, and if it does in a wrong way, how to correct it (in order to translate it correctly). — Usin, Aug 18 '18 at 01:18
@Usin ok, and as soon as it is figured out what it does: `std::sort(s_tmp, s_tmp + 200, [](SensorIndex& x, SensorIndex& y) {return std::abs(y.value) < std::abs(x.value);});` -- This is basically the entire sort, comparison and all. — PaulMcKenzie, Aug 18 '18 at 01:23

Galik · Answer 1 · 2020-04-01T06:32:04.370

3

There is a bug in your comparison function. You return an int which means you lose the distinction between element values whose absolute difference is less then 1!

int comp(const void* a, const void* b)
{
    SensorIndex* x = (SensorIndex*)a;
    SensorIndex* y = (SensorIndex*)b;

    // what about differences between 0.0 and 1.0?
    return abs(y->value) - abs(x->value); 
}

You can fix it like this:

int comp(const void* a, const void* b)
{   SensorIndex* x = (SensorIndex*)a;
    SensorIndex* y = (SensorIndex*)b;

    if(std::abs(y->value) < std::abs(x->value))
        return -1;

    return 1;
}

A more modern (and safer) way to do this would be to use std::vector and std::sort:

// use a vector for dynamic arrays
std::vector<SensorIndex> s_tmp;

for(int i = 0; i < 200; ++i) {
    s_tmp.push_back({q[i], i});
}

// use std::sort
std::sort(std::begin(s_tmp), std::end(s_tmp), [](SensorIndex const& a, SensorIndex const& b){

    return std::abs(b.value) < std::abs(a.value);
});

edited Apr 01 '20 at 06:32

answered Aug 18 '18 at 01:03

Galik

47,303
4
80
117

`abs` destroys that distinction before they even get around to returning anything; [`abs` accepts and returns `int`s, not `double`s](https://linux.die.net/man/3/abs), so the data is being truncated before they even get around to comparing it. Your code doesn't fix that, because `8.1` and `8.9` would both be converted to `8` by `abs`, so they'd still appear to be equal. – ShadowRanger Aug 18 '18 at 01:12
1

@ShadowRanger Okay, it works for me because in `C++17` `std::abs` is overloaded for doubles...! – Galik Aug 18 '18 at 01:18
@ShadowRanger Actually the float overloads were added in `C++11`. – Galik Aug 18 '18 at 01:21
The overloads were added in C++11, but only as part of `cmath`; they weren't added as part of `cstdlib` (where the integer based versions of the function come from) until C++17, so if you weren't careful to include it, the only `abs` which would exist is integer based. Implementers weren't always careful to follow the spec in this case, [so the whole thing is a giant mess](https://stackoverflow.com/q/21392627/364696), and the OP is testing with C++98, so they're definitely in the "royal mess" era. – ShadowRanger Aug 18 '18 at 01:39
Thanks for the answer, it was useful and complementary to the @ShadowRanger one, but given the details explained I accept his answer. – Usin Aug 18 '18 at 10:41

ShadowRanger · Accepted Answer · 2018-08-18T01:50:07.160

This behavior is caused by using abs, a function that works with int, and passing it double arguments. The doubles are being implicitly cast to int, truncating the decimal component before comparing them. Essentially, this means you take the original number, strip off the sign, and then strip off everything to the right of the decimal and compare those values. So 8.123 and -8.9 are both converted to 8, and compare equal. Since the inputs are reversed for the subtraction, the ordering is in descending order by magnitude.

Your cpp.sh output reflects this; all the values with a magnitude between 8 and 9 appear first, then 3-4s, then 2-3s, 1-2s and less than 1 values.

If you wanted to fix this to actually sort in descending order in general, you'd need a comparison function that properly used the double-friendly fabs function, e.g.

int comp(const void *a, const void* b)
{   SensorIndex* x = (SensorIndex*)a;
    SensorIndex* y = (SensorIndex*)b;
    double diff = fabs(y->value) - fabs(x->value);
    if (diff < 0.0) return -1;
    return diff > 0;
}

Update: On further reading, it looks like std::abs from <cmath> has worked with doubles for a long time, but std::abs for doubles was only added to <cstdlib> (where the integer abs functions dwell) in C++17. And the implementers got this stuff wrong all the time, so different compilers would behave differently at random. In any event, both the answers given here are right; if you haven't included <cmath> and you're on pre-C++17 compilers, you should only have access to integer based versions of std::abs (or ::abs from math.h), which would truncate each value before the comparison. And even if you were using the correct std::abs, returning the result of double subtraction as an int would drop fractional components of the difference, making any values with a magnitude difference of less than 1.0 appear equal. Worse, depending on specific comparisons performed and their ordering (since not all values are compared to each other), the consequences of this effect could chain, as comparison ordering changes could make 1.0 appear equal to 1.6 which would in turn appear equal to 2.5, even though 1.0 would be correctly identified as less than 2.5 if they were compared to each other; in theory, as long as each number is within 1.0 of every other number, the comparisons might evaluate as if they're all equal to each other (pathological case yes, but smaller runs of such errors would definitely happen).

Point is, the only way to figure out the real intent of this code is to figure out the exact compiler version and C++ standard it was originally compiled under and test it there.

So in a nutshell, the original function was fundamentally wrong, but still tried to order the elements from the major to the minor as I thought? So... can I consider it a terrible logical mistake? — Usin, Aug 18 '18 at 01:34
@Usin: I just updated the answer. It's unclear *what* the original function was trying to do. The implementation I originally described (seen under C++98 using integer based `abs`) would be internally consistent, but strange (grouping consistently by the integer component of the absolute value). If it was compiled with a `double` friendly `std::abs`, it would be very, very wrong, in terrible and unpredictable ways, thanks to the problem Galik pointed out. In neither case would there be a reliable absolute ordering from highest to lowest magnitude. — ShadowRanger, Aug 18 '18 at 01:54
@Galik's suggested replacement using `std::sort` and a lambda function would get a reliable ordering from highest to lowest magnitude on C++17 if you `#include`-ed ``, and on C++11 and higher if you include ``; in all cases, to avoid ambiguity, you'd want to explicitly namespace qualify `std::abs`, to avoid any confusion with the C `abs` function (that works solely with `int`). — ShadowRanger, Aug 18 '18 at 01:57
Thanks for the answer, I'm on C++17 compiler but surely a pre-C++17 compiler was originally used (the code is from 2011-2012), so, unless the intent it was not truncating the decimals (and I do not believe), I'm led to think it's a logic error and since this function is used within a much more complex system, the error could also have escaped without visible repercussions in the output. — Usin, Aug 18 '18 at 09:33

Qsort comparison

2 Answers2