6

Hi I struggled with simple program that is doing lex sorting on the array of the integers

(full program here: https://pastebin.com/UMqEP62n )

Running sort on the big enough vector of ints causes random numbers appear in the array and then depends on the data causing segfault.

sort(nums.begin(), nums.end(), lex_compare);

Function to compare:

bool lex_compare(const int &a, const int &b) {
    int ap = a, bp = b;
    vector<int> da, db;

    if (ap == 0) da.insert(da.begin(), ap%10);
    while (ap > 0) {
        da.insert(da.begin(), ap%10);
        ap /= 10;
    }
    if (bp == 0) db.insert(db.begin(), bp%10);
    while (bp > 0) {
        db.insert(db.begin(), bp%10);
        bp /= 10;
    }
    size_t size;
    if (da.size() < db.size()) {
        size = da.size();
    } else {
        size = db.size();
    }

    for (size_t i = 0; i < size; ++i)
        if (da[i] > db[i])
            return true;
        else if (da[i] < db[i])
            return false;
        else
            continue;

    if (da.size() > db.size())
        return false;
    else
        return true;
}

Basically when array is too long memory corruption shows up as there is heap buffer overflow. When compiled with address Sanitizer it show that:

==4097==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6140000001d0 at pc 0x00010aa396fe bp 0x7ffee51c5c50 sp 0x7ffee51c5c48
READ of size 4 at 0x6140000001d0 thread T0
#0 0x10aa396fd in lex_compare(int const&, int const&) main.cpp:8  

Which is essentially first line of lex_compare function:

int ap = a, bp = b;

I cannot figure out kind of error here which causing this behavior? Is it due to size of compare function? or do I have some obvious memory read error? I work with clang 4.2.1 but other platforms are also causing same behavior so I presume is something fundamental wrong with sort function.

πάντα ῥεῖ
  • 1
  • 13
  • 116
  • 190
Mazeryt
  • 855
  • 1
  • 18
  • 37
  • 1
    With a rep of 375, you should know at this point that "pastebin" links shouldn't be used. – PaulMcKenzie Oct 10 '18 at 16:43
  • + input to reprouce this "heap-buffer-overflow" is missing. – Swordfish Oct 10 '18 at 16:47
  • 1
    Why so much of convoluted code instead of `return std::to_string(a) < std::to_string(b);` ? And passing `int` by const reference just to make a local copy on the first line... – Slava Oct 10 '18 at 16:55
  • this might also prove to be useful. https://stackoverflow.com/questions/18291620/why-will-stdsort-crash-if-the-comparison-function-is-not-as-operator – Yucel_K Oct 10 '18 at 17:04
  • from my prior experience the likely issue in a 'comparator heap overflow' error is almost always a mistake in the comparator code. incorrect or missed conditions in the comparator results in the comparator being called repeatedly in effort to sort the array, which never completes, thus resulting in a memory leak. – mouserat Jun 01 '20 at 18:00

1 Answers1

20

Your issue is you violate the compare requirement that the comparison function passed to std::sort is required to follow.

The compare requirement requires that If comp(a,b)==true then comp(b,a)==false but your comparison function does not do this. for example if a = 0 and b = 0 then comp(a, b) brings you to the

if (da.size() > db.size())
    return false;
else
    return true;

part of you comparison function. Since the sizes are equal it returns true. when we flip it and evaluate comp(b, a) we get to the same block and will return true again.

You need to return false when a == b and you don't do that.

NathanOliver
  • 171,901
  • 28
  • 288
  • 402
  • thanks for the help that was something that I wasn't aware of! – Mazeryt Oct 10 '18 at 17:35
  • 1
    @Mazeryt To confirm the answer given, if you run the code you had at the pastebin link in Visual Studio, you get an assertion error that the comparison function is invalid. The test that NathanOliver mentions about flipping the values and doing the test a second time is exactly what Visual C++ debug runtime does to verify you do not have a bogus comparison function. – PaulMcKenzie Oct 10 '18 at 17:49
  • @PaulMcKenzie does this feature has some name? I am not running inside VS but only using clang, so curious if clang can catch this as well? – Mazeryt Oct 10 '18 at 17:55
  • @Mazeryt It is part of MSVC's debug version of their standard library implementation. They've added a lot of "utilities" like this to help coders find problems faster in debug builds. I'm not sure if gcc or clang do that. – NathanOliver Oct 10 '18 at 17:57