1

I was trying to practice C++ iterators by defining the very common mergesort algorithm when I encountered inconsistent program behavior. My question is NOT how to implement mergesort (which I both know and also realize has many example answers), but rather when creating recursive vectors using iterators that I receive inconsistent final results.

I'm aware that I could solve the issue by copying the values into L and R arrays through a for loop, but I would like to understand why using iterators is not working. This is running on CLION with C++ 14. This post Best way to extract a subvector from a vector? did not answer my question, as I'm creating vector similar to methods prescribed.

void merge2(vector<int> &arr, int l, int m, int r)
{
    vector<int> L{arr.begin()+l, arr.begin()+m+1};
    vector<int> R{arr.begin()+m+1,arr.begin()+r+1};
    int i = 0, j = 0, k = l;
    int n1 = m - l + 1;
    int n2 =  r - m;
    while (i < (n1) && j < (n2)){
        if (L[i]<=R[i]){ //R[i] is replaced by R[j]
            arr[k] = L[i++];
        }
        else {
            arr[k] = R[j++];
        }
        k++;
    }
    while  (i < n1)
    {
        arr[k] = L[i];
        i++;
        k++;
    }
    while(j < n2){
        arr[k] = R[j];
        j++;
        k++;
    }
}

/* l is for left index and r is right index of the
   sub-array of arr to be sorted */
void merge2Sort(vector<int> &arr, int l, int r)
{
    if (l < r)
    {
        // Same as (l+r)/2, but avoids overflow for
        // large l and h
//        int m = (l+r-l)/2;
//        int m = l+(r-l)/2;
            int m = (l+r)/2;
        // Sort first and second halves
        merge2Sort(arr, l, m);
        merge2Sort(arr, m+1, r);
        merge2(arr, l, m, r);
    }
}

int main(int argc, char **argv){
    vector<int> arr = {12, 11, 13, 5, 6, 7};
    merge2Sort(arr, 0, arr.size()-1);
    for(auto i = arr.begin(); i != arr.end(); i++){
        cout << *i << " ";
    }
    return 0;
}

Sometimes I receive the correct answer 5 6 7 11 12 13 and other times the incorrect answer 5 6 7 11 13 12. The answer should not vary by attempt.

Correct Answer to reflect answer and comments. Answer corrects indexing error and relies upon iterators. Also noting from comments that vectors when initialized from iterators should use () and not {}.

template <class It>
void merge(It left, It middle, It right)
{
    // Beeing generic, you need to retrieve the type.
    using value_type = typename std::iterator_traits<It>::value_type;
    // You can copy only the first half 
    std::vector<value_type> left_side_copy(left, middle);
    It L = left_side_copy.begin();
    It R = middle;
    while (L != left_side_copy.end()  &&  R != right)
    {
        if ( *L <= *R )
        { 
            *left = *L;
            ++L;
        }
        else {
            *left = *R;
            ++R;
        }
        ++left;
    }
    // Copy only the leftovers, if there are any
    std::copy(L, left_side_copy.end(), left);
}

template <class It> 
void merge_sort(It left, It right)
{
    if (auto dist = std::distance(left, right); dist > 1)
    {
        It middle = left + dist / 2;
        merge_sort(left, middle);
        merge_sort(middle, right);
        merge(left, middle, right);
    }
}

int main(void)
{
    std::vector<int> arr {
        5, 12, 11, 13, 5, 4, 7, 13, 6
    };

    merge_sort(arr.begin(), arr.end());

    for(auto const &i : arr) {
        std::cout << ' ' << i;
    }
    std::cout << '\n';
}
  • This gives an out-of-range error when debugging, which probably causes UB in release mode – Aykhan Hagverdili Jun 24 '19 at 06:05
  • 1
    `vector L{arr.begin()+l, arr.begin()+m+1};` should be `vector L(arr.begin()+l, arr.begin()+m+1);` . The curly braces mean to prefer selecting a constructor based on the contents of the list being the contents of the vector (not an iterator pair). If nothing can be found the first way it falls back to considering iterator pair but you always run the risk of triggering the first way by accident – M.M Jun 24 '19 at 06:25
  • The bug should be very easy to find if you put some dump function for sub array at each mergesort call. – Chen OT Jun 24 '19 at 06:36
  • Thanks M.M, I've changed constructors to use suggested () method instead of {} for vector initialization. – user4242176 Jun 24 '19 at 06:42
  • Thanks, @Ayxan How did you determine that the original uncorrected program was giving out-of-range when debugging? I'm using CLION and there did not appear to be a warning. I would except index of out bounds to throw error. – user4242176 Jun 24 '19 at 06:45
  • @user4242176 VisualStudio gives out-of-range error for `operator[]` in the debug mode – Aykhan Hagverdili Jun 24 '19 at 06:47
  • Okay, looks like that's a VisualStudio feature. I normally have to use .at() to see index out of bounds error in CLION. Interesting. – user4242176 Jun 24 '19 at 06:49
  • 2
    @user4242176 Why did you edit the question to reflect the answer? It makes the answer completely redundant to someone who will see it later. – Gaurav Sehgal Jun 24 '19 at 06:54
  • @Ayxan, I've tried and it does not appear to or at least I am unaware. For example `vector arr = {12,11,13,5,6,7} ; int i2 = arr[99]; ` produces i2 value of 0. In C accessing an array element out of bounds decays to random value from memory and I would expect that behavior, not from C++ though. CLION compiles, runs, and prints out the value 0 for i2. – user4242176 Jun 24 '19 at 07:00
  • BTW, if you are practicing iterators, you should [*use*](https://wandbox.org/permlink/JSTGKOEVTsagLvnf) iterators ;). – Bob__ Jun 24 '19 at 10:04
  • @Bob__ nice and thanks, I've updated answer to reflect link. It's very C++ iterator idiomatic, I'm just guessing you use C++ a lot. ;) – user4242176 Jun 24 '19 at 16:46

2 Answers2

5

You're using index i instead of j in vector R. Replace R[i] with R[j] see below

  void merge2(vector<int> &arr, int l, int m, int r)
    {
        //..
        while (i < (n1) && j < (n2)){
            if (L[i]<=R[j]){ //R[i] is replaced by R[j]
                arr[k] = L[i++];
            }
            else {
                arr[k] = R[j++];
            }
            k++;
        }

   //...
}
shb
  • 5,957
  • 2
  • 15
  • 32
  • Thanks, while this does correct my overlooked error in code, the inconsistent behavior remains. So I don't think it completely solves question. – user4242176 Jun 24 '19 at 06:25
  • @user4242176 are you sure? I'm receiving the correct answer every time. change the index and rebuild the program – shb Jun 24 '19 at 06:28
0

The line int m = l+(r-l)/2; is incorrect. You probably meant int m = (l+r-l)/2; although I think it would be better to keep int m = (l+r)/2;.

Marko Mahnič
  • 715
  • 6
  • 11