0

I'm trying to obtain the sorted position of a string in a list that may contain duplicates.

I don't care about an undefined order for duplicates but I would like a global sort.

Here is my best attempt so far:

#include <iostream>
#include <vector>
#include <numeric>
#include <algorithm>

void display(const std::vector<int> &array)
{
  for (const auto & value : array)
    std::cout << value << " ";
  std::cout << std::endl;
}

std::vector<int> sortIndexes(const std::vector<std::string> &values)
{
  std::vector<int> indexes(values.size());
  std::iota(indexes.begin(), indexes.end(), 0);

  std::stable_sort(indexes.begin(), indexes.end(), [&values](const size_t first, const size_t second)
  {
    return values.at(first) <= values.at(second);
  });

  return indexes;
}

int main (void)
{
  display(sortIndexes({"b", "a", "c"})); // Expecting: 1 0 2           Result: 1 0 2 
  display(sortIndexes({"c", "c", "a"})); // Expecting: 1 2 0 or 2 1 0  Result: 2 1 0
  display(sortIndexes({"c", "a", "c"})); // Expecting: 1 0 2 or 2 0 1  Result: 1 2 0
  return 0;
}

Is there another way to get the expected output ?

EDIT:

I was missing the strict comparison + 'inverseIndexes' part to solve my problem. Here is the updated code:

#include <iostream>
#include <vector>
#include <numeric>
#include <algorithm>

void display(const std::vector<int> & array)
{
  for (const auto & value : array)
    std::cout << value << " ";
  std::cout << std::endl;
}

std::vector<int> inverseIndexes(const std::vector<int> & indexes)
{
  std::vector<int> inverse(indexes.size());
  for (size_t i = 0; i < indexes.size(); ++i)
    inverse[indexes[i]] = i;
  return inverse;
}

std::vector<int> sortIndexes(const std::vector<std::string> & values)
{
  std::vector<int> indexes(values.size());
  std::iota(indexes.begin(), indexes.end(), 0);

  std::stable_sort(indexes.begin(), indexes.end(), [&values](const size_t first, const size_t second)
  {
    return values.at(first) < values.at(second);
  });

  return indexes;
}

int main (void)
{
  display(inverseIndexes(sortIndexes({"b", "a", "c"}))); 
  display(inverseIndexes(sortIndexes({"c", "c", "a"})));
  display(inverseIndexes(sortIndexes({"c", "a", "c"})));
  return 0;
}
crep4ever
  • 93
  • 1
  • 5
  • Why do you expect `1 2 0` to be one of the possibilities for the second call? Why do you expect `2 0 1` to be one of the possibilities for the third call? – Caleth Oct 08 '18 at 18:30
  • `std::stable_sort` is doing the correct thing with `<`, you just have wrong expectations. The behaviour when you use `<=` is undefined. It is a shame that you get a result *at all*, instead of a loud error. – Caleth Oct 08 '18 at 18:39
  • [See for example](https://coliru.stacked-crooked.com/a/f66bd12b0948e50d) – Caleth Oct 08 '18 at 18:41
  • I expect 0 to correspond to the 'smaller' character and I expect an undefined behavior for duplicates. In your example, that would result in: an undefined sequence between [8; 20] for all 'c', 8 for 'b' and an undefined sequence between [0; 7] for 'a' – crep4ever Oct 09 '18 at 07:28
  • 1
    `values[0]` is not `a`, so why should it be included in the `a` range? `values[indexes[0]]` **is** `a` – Caleth Oct 09 '18 at 07:36
  • Sorry, in the example you provided, the result is fine and what I am looking for. However, I don't understand the result of '{"c", "c", "a"}' that is '2 0 1' => why don't we have '1 2 0' ? – crep4ever Oct 09 '18 at 07:43
  • Again, you have to look at `values[indexes[n]]` not `values[n]`. The `a` is the final element, at position 2, then you have the increasing sequence for the positions of the `c`s. What you expect dereferences to `c a c` – Caleth Oct 09 '18 at 08:15

1 Answers1

0

The comparison function for std::stable_sort should be a strict "less than", not "less or equal than". So, you only need to fix this line:

    return values.at(first) < values.at(second);
Max Langhof
  • 23,383
  • 5
  • 39
  • 72
  • but in this case, the output for {"c", "c", "a"} is 2 0 1 ... I think that I basically need a sort that does not require a strict weak ordering. – crep4ever Oct 08 '18 at 16:14
  • 1
    Those are the indices of the correctly (and stably) sorted array: 2 (`a`), 0 (first `c`), 1 (second `c`). If you want the inverse of that (it seems you do), just compute it: `inverseIndices[indices[i]] = i;` – Max Langhof Oct 08 '18 at 16:22
  • I don't understand why 'a' is considered 'bigger' (2) than the two 'c' (0 and 1) – crep4ever Oct 09 '18 at 07:31
  • 1
    You misunderstand what the sort does. The current output says "to get a sorted array, take the string at index 2 ('a'), then the string at index 0 ('c'), then the string at index 1 ('c')". You seem to want something else: You want an output that says "The first string ('c') would be in position 1 of the sorted array, the second string ('c') would be in position 2 and the third string ('a') would be in position 0". Do you see the difference? As mentioned, you only have to invert the indices: `inverseIndices[indices[i]] = i;` – Max Langhof Oct 09 '18 at 07:43
  • That was it ! Thank you ! – crep4ever Oct 09 '18 at 07:56