1

I'm studying a recursive mergesort algorithm, and one iterator goes out of bounds. I'm positive the root of my problem is that my algorithm is flawed, but I've spent days pouring over it, and I just don't see my misstep. I don't know what direction to take. Can someone more experienced/smarter than I take a look? (Full program with driver is available on Github here.)

Output is:

before: 50 5 40 10 30 15 20 20 10 25 
after : -1808873259 5 10 10 15 20 20 25 30 40 50 
/*      ^  
 *      Extra recursive call, and out-of-bounds.
 */

To be clear, I am constrained to returning a vector of type T, in this case int, but I'm aware from this post that using a void function is better.

template <typename T>
vector<T> mergesort(typename vector<T>::iterator begin, typename vector<T>::iterator end){
    vector<T> newVector;
    if (begin!=end){
        vector<T> tmp1;
        vector<T> tmp2;
        typename vector<T>::iterator mid1 = begin;
        typename vector<T>::iterator mid2 = begin;

        long origDistance = distance(begin,end);
        long endOfRange1 = origDistance/2;
        long begOfRange2 = endOfRange1+1;

        advance(mid1,endOfRange1);
        advance(mid2,begOfRange2);

        tmp1 = mergesort<T>(begin,mid1);
        tmp2 = mergesort<T>(mid2,end);

        //"merge()" is from the STL, link in comments. 
        merge(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end(), back_inserter(newVector));

    } else {
        newVector.push_back(*begin);
    }
    return newVector;
}
Community
  • 1
  • 1
NonCreature0714
  • 5,744
  • 10
  • 30
  • 52
  • Updating soon with output for example. – NonCreature0714 May 10 '16 at 04:34
  • 1
    Are you aware that `some_vector.end()` is not the last element, but one past the last? It's illegal to dereference that. I don't know what the `merge` function does, but that could be the reason. – Stjepan Bakrac May 10 '16 at 04:36
  • The best way to understand why your program behaves the way it does is to run it under a debugger. In the end, you'll learn much more by doing this than you will by asking us to help you figure things out. Good luck! – jdigital May 10 '16 at 04:37
  • Also show how you call this function – M.M May 10 '16 at 04:38
  • Merge is from the STL here: http://www.cplusplus.com/reference/algorithm/merge/?kw=merge – NonCreature0714 May 10 '16 at 04:39
  • I'll mark correct answer tomorrow - it's midnight were I am and I've been looking at this for 8 hours already. (In code's previous iterations.) – NonCreature0714 May 10 '16 at 05:09

4 Answers4

2

You dereference begin when begin == end. That's undefined behavior. Probably you want if (origDistance == 1) then push_back the single element and return.

John Zwinck
  • 239,568
  • 38
  • 324
  • 436
1

Your function looks like it could work, if end points to the last element of the vector. However in your sample program you call it like this:

newVector = mergesort<int>(vec.begin(), vec.end());

The vec.end() points past the end of the vector, it doesn't point to the last element. So your function messes up because it ultimately tries to access the element pointed to by the second iterator you pass in.

You could call your function like: mergesort<int>(vec.begin(), vec.end() - 1);.

However this will surprise anyone else reading your code. It would be much better to rewrite your mergesort function to follow normal C++ range conventions, that is, the parameter named end should be past-the-end. mid1 should equal mid2.

M.M
  • 138,810
  • 21
  • 208
  • 365
0

Okay - couldn't go to sleep without figuring this out, and huge credit goes to John Zwinck and M.M for getting me in the right direction - here's the code which got the correct output:

template <typename T>
vector<T> mergesort(typename vector<T>::iterator begin, typename vector<T>::iterator end){
    vector<T> newVector;
    long origDistance = distance(begin,end); /*Get distance first.*/

    if (origDistance==1){ /*Added better anchor case checking for distance.*/
        newVector.push_back(*begin);
        return newVector;
    }

    vector<T> tmp1;
    vector<T> tmp2;
    typename vector<T>::iterator mid1 = begin;
    typename vector<T>::iterator mid2 = begin;

    long endOfRange1 = origDistance/2;
    long begOfRange2 = endOfRange1;/*Edited from: endOfRange+1*/

    advance(mid1,endOfRange1);
    advance(mid2,begOfRange2);

    tmp1 = mergesort<T>(begin,mid1);
    tmp2 = mergesort<T>(mid2,end);

    merge(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end(), back_inserter(newVector));
        return newVector;
}
Community
  • 1
  • 1
NonCreature0714
  • 5,744
  • 10
  • 30
  • 52
  • 1
    It might be a good idea to also allow for `begin == end` (i.e. the empty vector as input) – M.M May 10 '16 at 05:58
-3

Here I will show you how to do it.

template <typename T>
void mergesort(typename vector<T>::iterator, typename vector<T>::iterator);

// ...

    mergesort<int>(vec.begin(), vec.end());
    newVector = vec;

// ...

template <typename T>
void mergesort(typename vector<T>::iterator begin, typename vector<T>::iterator end){
    auto const N = std::distance(begin, end);
    if (N <= 1) return;                   
    auto const middle = std::next(begin, N / 2);
    mergesort<T>(begin, middle);
    mergesort<T>(middle, end);
    std::inplace_merge(begin, middle, end); 
}

Correct output:

before: 50 5 40 10 30 15 20 20 10 25 
after : 5 10 10 15 20 20 25 30 40 50 

STL already has inplace_merge so why reimplement it? With this approach you don't have to think has hard with boundaries.