0

I have a function that takes two vectors of the same size as parameters :

void mysort(std::vector<double>& data, std::vector<unsigned int>& index)
{
   // For example :
   // The data vector contains : 9.8 1.2 10.5 -4.3
   // The index vector contains : 0 1 2 3
   // The goal is to obtain for the data : -4.3 1.2 9.8 10.5
   // The goal is to obtain for the index : 3 1 0 2
   // Using std::sort and minimizing copies
}

How to solve that problem minimizing the number of required copies ?

An obvious way would be to make a single vector of std::pair<double, unsigned int> and specify the comparator by [](std::pair<double, unsigned int> x, std::pair<double, unsigned int> y){return x.first < y.first;} and then to copy the results in the two original vectors but it would not be efficient.

Note : the signature of the function is fixed, and I cannot pass a single vector of std::pair.

Vincent
  • 57,703
  • 61
  • 205
  • 388
  • Wouldn't a copy be simpler to write and have better time complexity too. – Flexo Nov 28 '12 at 16:33
  • I'm not sure why you would want to have an index that no longer contains valid offsets to a now-sorted array. Is it just to know which slots the elements were in *prior* to the sort? – WhozCraig Nov 28 '12 at 16:37
  • 2
    See: http://stackoverflow.com/questions/1577475/c-sorting-and-keeping-track-of-indexes – Sean Cline Nov 28 '12 at 16:42
  • @WhozCraig: This is called "from-permutation". The problem in this case is to sort the values and also generate a "from-permutation" for the sorted array, i.e. an array of *original* indices. This is a pretty common problem. – AnT stands with Russia Nov 28 '12 at 16:43
  • But why do you need to sort both of them? doesn't `index` contain indecies of the elements in `data`? If that's the case than you only need to rearrange elements in 'index' and keep 'data' as is. –  Nov 28 '12 at 16:43
  • I'd say there's a better answer than those below found in http://stackoverflow.com/questions/13248149/sort-and-keep-track-of-elements/ – Zane Nov 28 '12 at 16:49
  • 2
    is `index` guaranteed to be `[0,1,2,...]`, or was that just an example? – AShelly Nov 28 '12 at 16:49
  • @Ashelly: good point - that would make this much easier, right? – Zane Nov 28 '12 at 16:50
  • @AndreyT yeah I kinda figured, was just making sure. I answered a similar question in C by loading an array with element addresses, sorting that, then pointer-differencing to generate the permutation vector. As a bonus (if you want to call it that) the original array was left untouched. Looks like he wants the permutation vector *and* the results sorted in this case. – WhozCraig Nov 28 '12 at 16:52
  • index is not guaranteed to be [0, 1, 2...]. In general it's the case but not all the time. Do you know where I can find the code of std::sort in the gcc standard library ? – Vincent Nov 28 '12 at 17:37

5 Answers5

6

Inside the function, make a vector positions = [0,1,2,3...] Sort positions with the comparator (int x, int y){return data[x]<data[y];}.

Then iterate over positions , doing result.push_back(index[*it]);

This assumes the values in index can be arbitrary. If it is guaranteed to already be [0,1,2..] as in your example, then you don't to make the positions array, just use index in it's place and skip the last copy.

AShelly
  • 34,686
  • 15
  • 91
  • 152
2

http://www.boost.org/doc/libs/1_52_0/libs/iterator/doc/index.html#iterator-facade-and-adaptor

Write a iterator over std::pair<double&, signed int&> that actually wraps a pair of iterators into each vector. The only tricky part is making sure that std::sort realizes that the result is a random access iterator.

If you can't use boost, just write the equivalent yourself.

Before doing this, determine if it is worth your bother. A zip, sort and unzip is easier to write, and programmer time can be exchanged for performance in lots of spots: until you konw where it is optimally spent, maybe you should just do a good-enough job and then benchmark where you need to speed things up.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
1

You can use a functor class to hold a reference to the value array and use it as the comparator to sort the index array. Then copy the values to a new value array and swap the contents.

struct Comparator
{
    Comparator(const std::vector<double> & data) : m_data(data) {}
    bool operator()(int left, int right) const { return data[left] < data[right]; }
    const std::vector<double> & m_data;
};

void mysort(std::vector<double>& data, std::vector<unsigned int>& index)
{
    std::sort(index.begin(), index.end(), Comparator(data));
    std::vector<double> result;
    result.reserve(data.size());
    for (std::vector<int>::iterator it = index.begin(), e = index.end();  it != e;  ++it)
        result.push_back(data[*it]);
    data.swap(result);
}
Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
1

You can use a custom iterator class, which iterates over both vectors in parallel. Its internal members would consist of

  1. Two references (or pointers), one for each vector
  2. An index indicating the current position

The value type of the iterator should be a pair<double, unsigned>. This is because std::sort will not only swap items, but in some cases also temporarily store single values. I wrote more details about this in section 3 of this question.

The reference type has to be some class which again holds references to both vectors and a current index. So you might make the reference type the same as the iterator type, if you are careful. The operator= of the reference type must allow assignment from the value type. And the swap function should be specialized for this reference, to allow swapping such list items in place, by swapping for both lists separately.

Community
  • 1
  • 1
MvG
  • 57,380
  • 22
  • 148
  • 276
-1

This should do it:

std::sort(index.begin(), index.end(), [&data](unsigned i1, unsigned i2)->bool
{ return data[i1]<data[i2]; });

std::sort(data.begin(), data.end());
Andrei Tita
  • 1,226
  • 10
  • 13