2

I am trying to create a merge sort algorithm using c++ iterators. I only want to call my sort function on a vecotor of ints without passing in any indexes.

void MergeSort(std::vector<int> &vector)
{
    std::vector<int> numbers = vector;

    merge(numbers, vector, vector.begin(), vector.end());
}

Is my sort initiator and my merge function is:

void merge(std::vector<int> &vector, std::vector<int> &result, std::vector<int>::iterator start, std::vector<int>::iterator end)
{
    if ((end - start) < 2)
    {
        return;
    }

    if ((end - start) == 2)
    {
        if (*start > *(start + 1))
        {
            std::iter_swap(start, start + 1);
            return;
        }
    }

    std::vector<int>::iterator mid = result.begin() + result.size() / 2;
    merge(result, vector, vector.begin(), mid);
    merge(result, vector, mid, vector.end());

    std::vector<int>::iterator i = std::next(vector.begin(), start - result.begin());
    std::vector<int>::iterator j = std::next(vector.begin(), mid - result.begin());
    std::vector<int>::iterator idx = start;

    while (idx < end)
    {
        if (j >= end || (i < mid && *i < *j))
        {
            *idx = *i;
            i++;
        }
        else
        {
            *idx = *j;
            j++;
        }
        idx++;
    }
}

I have based my algorithm on the "Algorithms in a nutshell" book. However, when running the code I get segmentation fault meaning I am accessing values out side of my memory. When I run the debugger I noticed that my i value sometimes is negatives or very large.

I believe my usage of iterators is wrong here, however I do not know in what way.

Djboboch
  • 45
  • 1
  • 6
  • `j >= end` This is also wrong, you are comparing iterators to both vectors, as well as in `i < mid`. I would suggest to use pen and paper and write down what happens in the code for some simple input vector. – Daniel Langr May 20 '20 at 06:42
  • @DanielLangr what change would you suggest ? I am trying to compare the iterator positionsm whats another way to do it ? – Djboboch May 20 '20 at 07:10
  • You can legally compare only iterators that point to the same container. In your case, `i` and `j` point to `vector`, but `mid` and `end` point to `result`. (Basically, you can imagine that those iterators are just pointers to arrays. How could you compare two pointers pointing to elements in two different arrays?) – Daniel Langr May 20 '20 at 07:19
  • In the case of Visual Studio, you can use something like`std::vector *vptr = (std::vector *)i._Getcont(); `, to get a pointer to an iterators container. Then something like `std::vector::iterator bgn = (*vptr).begin().` I don't know about other compilers. – rcgldr May 20 '20 at 16:53

4 Answers4

0

Problem is here

merge(result, vector, vector.begin(), mid);
merge(result, vector, mid, vector.end());

That should be

merge(result, vector, start, mid);
merge(result, vector, mid, end);
john
  • 85,011
  • 4
  • 57
  • 81
  • Not only, `mid` need to be calculated with respect to `start` and `end` as well. Such as `start + (end - start) / 2`. – Daniel Langr May 20 '20 at 06:12
  • @DanielLangr Yes, there seems to be lots of confusion between iterators to the source and result vectors. – john May 20 '20 at 06:12
0
std::vector<int>::iterator mid = result.begin() + result.size() / 2;

That line appears to be wrong. For recursion level >1, this is constantly choosing the same middle point.

You shouldn't be querying end(), begin() or size() at all in merge, only operate within the iterator range you have been passed as parameter.

Ext3h
  • 5,713
  • 17
  • 43
0

As Daniel Langr noted in comments, you can't compare iterators that originate from different vectors. I suggest you remove std::vector& parameters completely to avoid possible confusion. If you want to use iterators, why do you need them in the first place?

template<class It>
void merge_sort_impl(It first, It last, It buff) {
    const auto n = last - first;
    if (n <= 2) {
        if (n == 2 && *(first + 1) < *first)
            std::iter_swap(first, first + 1);
        return;
    }

    const auto mid       = first + n / 2;
    const auto buff_mid  = buff + n / 2;
    const auto buff_last = buff + n;

    merge_sort_impl(buff, buff_mid, first);
    merge_sort_impl(buff_mid, buff_last, mid);

    auto it = buff_mid;
    while (first != last) {
        auto& src = (it == buff_last || 
            (buff != buff_mid && !(*it < *buff))) ? buff : it;
        *first++ = *src++;
    }
}

template<class It>
void merge_sort(It first, It last) {
    std::vector<typename std::iterator_traits<It>::value_type>
        buff(first, last);
    merge_sort_impl(first, last, buff.begin());
}

Usage example:

std::vector<int> vec{/* init */};
merge_sort(vec.begin(), vec.end());

Demo


Note. One advantage of merge sort (if implemented correctly) is its stability. In your merge step you have a comparison, (i < mid && *i < *j). If *i == *j, you advance j, but to ensure stability you should advance i. For ints it doesn't matter, but it does matter in the general case.

Evg
  • 25,259
  • 5
  • 41
  • 83
0

Example of iterator based optimized top down merge sort that avoids copying of data by alternating the direction of merge with the level of recursion by using two mutually recursive functions (...AtoA, ...AtoB). I left out the function prototypes. Math can be performed on iterators as long as they both point to the same vector. For the entry function MergeSort, ab is iterator to data begin, ae is iterator to data end.

void MergeSort(     typename std::vector<uint64_t>::iterator &ab,
                    typename std::vector<uint64_t>::iterator &ae)
{
    size_t sz = ae - ab;
    if (sz < 2)
        return;
    std::vector<uint64_t> vb(sz);           // temp vector
    std::vector<uint64_t>::iterator bb = vb.begin();
    std::vector<uint64_t>::iterator be = vb.end();
    MergeSortAtoA(ab, ae, bb, be);
}

void MergeSortAtoA( typename std::vector<int>::iterator &ab,
                    typename std::vector<int>::iterator &ae,
                    typename std::vector<int>::iterator &bb,
                    typename std::vector<int>::iterator &be)
{
    size_t sz = ae - ab;
    if(sz < 2)                              // if 1 element return
        return;
    std::vector<int>::iterator am = ab+(sz/2);
    std::vector<int>::iterator bm = bb+(sz/2);
    MergeSortAtoB(ab, am, bb, bm);
    MergeSortAtoB(am, ae, bm, be);
    Merge(bb, bm, be, ab);
}

void MergeSortAtoB( typename std::vector<int>::iterator &ab,
                    typename std::vector<int>::iterator &ae,
                    typename std::vector<int>::iterator &bb,
                    typename std::vector<int>::iterator &be)
{
    size_t sz = ae - ab;
    if(sz < 2){                             // if 1 element, copy it
        *bb = *ab;
        return;
    }
    std::vector<int>::iterator am = ab+(sz/2);
    std::vector<int>::iterator bm = bb+(sz/2);
    MergeSortAtoA(ab, am, bb, bm);
    MergeSortAtoA(am, ae, bm, be);
    Merge(ab, am, ae, bb);
}

void Merge(         typename std::vector<int>::iterator &ab,
                    typename std::vector<int>::iterator &am,
                    typename std::vector<int>::iterator &ae,
                    typename std::vector<int>::iterator &bb)
{
std::vector<int>::iterator mb = ab;         // left  run iterator
std::vector<int>::iterator mm = am;         // right run iterator
std::vector<int>::iterator bi = bb;         // merge run iterator

    while(1){                               // merge data
        if(*mb <= *mm){                     // if mb < am
            *bi++ = *mb++;                  //   copy mb
            if(mb < am)                     //   if not end left run
                continue;                   //     continue (back to while)
            while(mm < ae)                  //   else copy rest of right run
                *bi++ = *mm++;
            break;                          //     and return
        } else {                            // else mb > mm
            *bi++ = *mm++;                  //   copy mm
            if(mm < ae)                     //   if not end of right run
                continue;                   //     continue (back to while)
            while(mb < am)                  //   else copy rest of left run
                *bi++ = *mb++;
            break;                          //     and return
        }
    }
}
rcgldr
  • 27,407
  • 3
  • 36
  • 61