28

As the title says, I'm looking for a way to sort a vector without modifying the original one. My first idea is of course to create a copy of the vector before the sort, e.g.:

std::vector<int> not_in_place_sort(const std::vector<int>& original)
{
   auto copy = original;
   std::sort(copy.begin(), copy.end());
   return copy;
}

However, maybe there is a more efficient way to perform the sort using C++ standard algorithm (maybe a combination of sort and transform?)

Emiliano
  • 22,232
  • 11
  • 45
  • 59
  • 7
    Pass the vector by value, so it's local to the function, sort and return. That's the best you can get. – DeiDei Nov 28 '17 at 17:02
  • @JakeFreeman no it's not, it's a fast 'on average' sort, but I think you should revisit sorting algorithms before making such blanket statements – UKMonkey Nov 28 '17 at 17:06
  • 2
    Explain what you mean by "sort". If you want a sorted vector, you can't possibly get away without a copy and sort. – Passer By Nov 28 '17 at 17:07
  • I'm just wondering if there is an algorithm `f(v)` whose time cost is less than `create_copy + std::sort`. If I need to copy the vector, I can do it, just wondering if I'm missing something – Emiliano Nov 28 '17 at 17:10
  • 1
    @Emiliano -- One option -- don't sort the vector itself -- you can sort an index array and use that to index into the vector. – PaulMcKenzie Nov 28 '17 at 17:15
  • @PaulMcKenzie smart :) – Emiliano Nov 28 '17 at 17:17
  • @DeiDei That is not the necessarily "the best". OP's version is more likely to be better for lvalues (only one copy). – juanchopanza Nov 28 '17 at 17:18
  • @Emiliano -- I guess someone already beat me with an answer. However you can get a look at the [answer here](https://stackoverflow.com/questions/46382252/sort-array-by-first-item-in-subarray-c/46382976#46382976) for an explanation of why you would want to use an array of indices. – PaulMcKenzie Nov 28 '17 at 17:21

6 Answers6

20

Use partial_sort_copy. Here is an example:

vector<int> v{9,8,6,7,4,5,2,0,3,1};
vector<int> v_sorted(v.size());
partial_sort_copy(begin(v), end(v), begin(v_sorted), end(v_sorted));

Now, v remains untouched but v_sorted contains {0,1,2,3,4,5,6,7,8,9}.

Erel Segal-Halevi
  • 33,955
  • 36
  • 114
  • 183
  • 1
    This solution is underrated. – Thomas Jul 01 '19 at 12:21
  • 9
    This solution is *overrated*: it provides no explanation as to why `partial_sort_copy` is better than initially copying the vector and calling `std::sort` on the result. It's not a "`partial`" whenever two range lengths are the same. Is it faster? Is it more readable? I would argue the answer is "no" for both. – Seth Johnson Oct 16 '19 at 18:59
  • 2
    @SethJohnson It should be faster by at least the copy pass of the vector. That's what the asker is likely trying to achieve, not copy and then sort but directly write sorted values into destination. `partial` should really say `general` – Jan Hošek Sep 03 '20 at 09:41
13

Here is my favorite. Sort an index and not the original array/vector itself.

#include <algorithm>

int main() {

    int intarray[4] = { 2, 7, 3, 4 };//Array of values
    //or you can have vector of values as below
    //std::vector<int> intvec = { 2, 7, 3, 4 };//Vector of values
    int indexofarray[4] = { 0, 1, 2, 3 };//Array indices

    std::sort(indexofarray, indexofarray + 4, [intarray](int index_left, int index_right) { return intarray[index_left] < intarray[index_right]; });//Ascending order.
    //have intvec in place of intarray for vector.


}

After this, indexofarray[] elements would be 0, 2, 3, 1, while intarray[] is unchanged.

Tryer
  • 3,580
  • 1
  • 26
  • 49
4

As suggested in the comments pass the function argument by value std::vector<int> original:

#include <iostream>
#include <vector>
#include <algorithm>

std::vector<int> not_in_place_sort(std::vector<int> original) {
    std::sort(original.begin(), original.end());
    return original;
}

int main() {
    std::vector<int> v = { 8, 6, 7, 2, 3, 4, 1, 5, 9 };
    std::vector<int> v2 = not_in_place_sort(v); // pass the vector by value
    std::cout << "v1: " << '\n';
    for (auto el : v) {
        std::cout << el << ' ';
    }
    std::cout << "\nv2: " << '\n';
    for (auto el : v2) {
        std::cout << el << ' ';
    }
}

That will sort a copy of your original vector leaving the original intact. As pointed out below this might restrict some optimizations such as RVO but will call vector's move constructor in the return statement instead.

Ron
  • 14,674
  • 4
  • 34
  • 47
  • That inhibits RVO. It isn't a big deal because the return will be moved, but depending on the application, it could matter. – juanchopanza Nov 28 '17 at 17:15
  • How does this prohibit RVO? I think it should work as described. Better yet, `v = not_in_place_sort(std::move(v))` should be very nearly as cheap as `std::sort(v.begin(), v.end())`. – Seth Johnson Oct 16 '19 at 23:43
  • OK interesting, it does prevent RVO. Still, as long as the containers have move constructors, passing by value and returning by value give the most flexibility *and* improve the clarity of the calling code: `sorted(std::move(v))` means assuming `v` will no longer be valid after the funciton, `sorted(v)` means it will still be valid. – Seth Johnson Oct 25 '19 at 14:53
3

For the case where you are interested in proxy sorting (sorting an index list), you may want to implement a more flexible algorithm that allows you to deal with containers which do not support random access (such as std::list). For example:

#include <algorithm>
#include <iostream>
#include <list>
#include <numeric>
#include <vector>

template <typename Container>
auto sorted_indices(const Container& c) {
  std::vector<typename Container::size_type> indices(c.size());
  std::iota(indices.begin(), indices.end(), 0);
  std::sort(indices.begin(), indices.end(), [&c](auto lhs, auto rhs) {
    return (*(std::next(c.begin(), lhs)) < *(std::next(c.begin(), rhs)));
  });
  return indices;
}

template <typename Container, typename Indices>
auto display_sorted(const Container& c, const Indices& indices) {
  std::cout << "sorted: ";
  for (auto&& index : indices) {
    std::cout << *(std::next(c.begin(), index)) << " ";
  }
  std::cout << std::endl;
}

template <typename Container>
auto display_sorted(const Container& c) {
  return display_sorted(c, sorted_indices(c));
}

template <typename Container>
auto display(const Container& c) {
  std::cout << "as provided: ";
  for (auto&& ci : c) std::cout << ci << " ";
  std::cout << std::endl;
}

int main() {
  // random access
  const std::vector<int> a{9, 5, 2, 3, 1, 6, 4};
  display(a);
  display_sorted(a);
  display(a);

  std::cout << "---\n";

  // no random access
  const std::list<int> b{9, 5, 2, 3, 1, 6, 4};
  display(b);
  display_sorted(b);
  display(b);
}

Sample run:

$ clang++ example.cpp -std=c++17 -Wall -Wextra
$ ./a.out
as provided: 9 5 2 3 1 6 4 
sorted: 1 2 3 4 5 6 9 
as provided: 9 5 2 3 1 6 4 
---
as provided: 9 5 2 3 1 6 4 
sorted: 1 2 3 4 5 6 9 
as provided: 9 5 2 3 1 6 4 

As you would expect, relying on proxy sorting could have important performance implications. For example: every time you want to traverse in order, you will possibly incur cache misses. In addition, the traversal will have the same complexity as the underlying container for random access: In the case of std::vector, std::next(v.begin(), n) is O(1), but in the case of std::list, std::next(l.begin(), n) is O(n).

Escualo
  • 40,844
  • 23
  • 87
  • 135
1

For int's it doesn't make much difference if you're sorting an index or making a copy & sorting the copy; the data still needs to be initialized, and in the case of the indexes, this will involve a loop assigning values rather than faster memcpy routines; so may end up slower; in addition you're going to be jumping around memory lots more; so now the cache can't do its job nicely.

For larger objects I'd not sort the index, but use a vector of pointers. The copy of the pointers is cheap compared to copying the objects themselves; the containers are still obvious because they're containing pointers of your object; and the sort isn't attempting to reference another vector.

UKMonkey
  • 6,941
  • 3
  • 21
  • 30
1

You can create another vector to store the indices. Here is the code:

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

int main()
{
    vector<int> numbers = {50,30,20,10,40};
    vector<int> indexOfNumbers;
    
    for(int i = 0; i < numbers.size(); i++)
    {
        indexOfNumbers.push_back(i); 
    }
    // Now, indexOfNumbers = [0,1,2,3,4]

    std::sort(
        indexOfNumbers.begin(), indexOfNumbers.end(), 
        [numbers](int leftIndex, int rightIndex) 
        { 
            return numbers[leftIndex] < numbers[rightIndex]; // sort in ascending order
        }
    );
    // After sorting, indexOfNumbers = [3, 2, 1, 4, 0]

    // Access the sorted elements
    cout << "Accessing the sorted elements : ";
    for(int i = 0; i < numbers.size(); i++)
    {
        cout << numbers[indexOfNumbers[i]] << " ";
    }
    // prints numbers in sorted order i.e. [10,20,30,40,50]
   return 0;
}

Source: Made slight modification according to Tyrer's answer (https://stackoverflow.com/a/47537314)

Puneeth G R
  • 111
  • 1
  • 8