9

The function

template <typename Container, typename Comparator, typename Predicate>
void sortButKeepSomeFixed (Container& c, const Comparator& comp, const Predicate& pred)

is to sort the container c according to the ordering criterion comp, but those elements that satisfy pred shall remain fixed in their original positions after the sort (i.e. unaffected by the sort).

I tried to adapt quick sort to fit this, but could not think of it. In the end, I decided to adapt the crude selection sort to get the job done:

#include <iostream>
#include <vector>

std::vector<int> numbers = {5,7,1,8,9,3,20,2,11};

template <typename Container, typename Comparator, typename Predicate>
void sortButKeepSomeFixed (Container& c, const Comparator& comp, const Predicate& pred) {  // O(n^2), but want O(nlogn) on average (like quick sort or merge sort)
    const std::size_t N = c.size();
    std::size_t i, j, minIndex;
    for (i = 0; i < N-1; i++) {
        if (pred(c[i]))
            continue;  // c[i] shall not swap with any element.
        minIndex = i;
        for (j = i + 1; j < N; j++) {
            if (pred(c[j]))
                continue;  // c[j] shall not swap with any element.
            if (comp(c[j], c[minIndex]))
                minIndex = j;
        }
        if (minIndex != i)
            std::swap(c[i], c[minIndex]);
    }
}

int main() {
    sortButKeepSomeFixed (numbers,
        std::greater<int>(),  // Ordering condition.
        [](int x) {return x % 2 == 0;});  // Those that shall remain fixed.
    for (int x : numbers) std::cout << x << ' ';  // 11 9 7 8 5 3 20 2 1
}

But the time complexity is O(N^2) (I think). Can someone improve on the time complexity here, to perhaps O(NlogN) on average? In other words, find an overall better algorithm, using recursion or something like that?

Or perhaps a better idea is to take out the elements that satisfy pred, sort what left with std::sort and then put the extracted elements back in their original positions? Would that be any more efficient, or would that just make it worse?

Update: This is based on Beta's suggestion (sorting the iterators that don't pass pred). But though the elements that pass pred do indeed remain fixed, the sorting at the end is not correct.

template <typename Container, typename Comparator, typename Predicate>
void sortButKeepSomeFixed (Container& c, const Comparator& comp, const Predicate& pred) {
    std::vector<typename Container::iterator> iterators;
    for (typename Container::iterator it = c.begin();  it != c.end();  ++it) {
        if (!pred(*it))
            iterators.emplace_back(it);
    }
    std::vector<typename Container::iterator> originalIterators = iterators;
    std::sort(iterators.begin(), iterators.end(),
        [comp](const typename Container::iterator& x, const typename Container::iterator& y)
        {return comp(*x, *y);});
    for (int i = 0; i < originalIterators.size(); i++)
        *originalIterators[i] = *iterators[i];
}

The incorrect output is 11 9 9 8 11 3 20 2 9 when it should be 11 9 7 8 5 3 20 2 1.

prestokeys
  • 4,817
  • 3
  • 20
  • 43
  • 4
    Your "better idea" will work quite well, O(N logN). – Beta Aug 15 '15 at 02:06
  • O(N logN) for `std::sort` itself, but then the time spent on removing the elements and then placing them back in their original positions adds quite a bit, won't it? – prestokeys Aug 15 '15 at 02:08
  • If you use something like a `vector` but not handle inserting/removing elements yourself, insertions/removals have amortised complexity of `O(N)` so it's fine. – mostruash Aug 15 '15 at 02:09
  • Actually no, sorry. You need to copy things around if you want to do it in place, e.g. if you don't create a copy of the original array. If you do removals one by one, each insertion/removal would be `O(N)` so total complexity would be O(N^2). I suggest you to create a new array for those who don't pass the predicate if running performance is your priority, memory complexity is `O(N)` instead of `O(1)`. – mostruash Aug 15 '15 at 02:12
  • Good point, but coming up with an algorithm that is better than O(N^2) for *every possible* container type is too much to ask. If we assume a `vector`, how about creating a vector of iterators to those elements that do not satisfy pred (O(N)), then sorting that with an appropriate comparison function (O(NlogN)), then swapping elements in the original vector to match (O(N)) for a combined complexity of O(NlogN)? – Beta Aug 15 '15 at 02:49
  • @ Beta I posted an attempt at your implementation idea, but the output is not correct. Or did you mean something else? – prestokeys Aug 15 '15 at 02:59
  • If you use `std::list` instead of `std::vector`, I think it is fine to remove element. – DMaster Aug 15 '15 at 07:41
  • As a side note: I'd pass comparator and predicate by value, as the STL does itself. When you need reference semantics, you can use `std::ref`. As for the incorrect output, could you show it? – Daniel Jour Aug 15 '15 at 07:47
  • If asympthotic complexity is really your most important goal, I'd say removing and inserting the fixed elements via a separate array is the way to go. You could also write a filter iterator that masks out all elements that match pred. But doing this in a way such that the iterator remains a random access one is a bit tricky (but possible) – MikeMB Aug 15 '15 at 15:32
  • 1
    Sometimes it should be asked *"Why?"* – Luca Aug 15 '15 at 17:25
  • @ Luca I ran into a rare case in my program where I needed to sort a container of containers according to their sizes, but the empty containers had to remain fixed for certain reasons. – prestokeys Aug 15 '15 at 18:20

2 Answers2

1

That's a fun one. I first tried to code the IMO correct approach, using a custom iterator that just skips elements that satisfy the predicate. This turned out to be quite challenging, at least writing that on a mobile phone as I'm doing it.

Basically, this should lead to code similar to what you can find in Eric Niebler's ranges v3.

But there's also the simpler, direct approach that you're trying to use above. The problem of your non working solution is, that it's changing the values the (rest of the sorted) iterators point to when assigning in that last for loop. This issue can be avoided by having a copy, like in my code:

int main(int, char **) {
 vector<int> input {1,2,3,4,5,6,7,8,9};
 vector<reference_wrapper<int>> filtered{begin(input), end(input)};
 filtered.erase(remove_if(begin(filtered), end(filtered),
         [](auto e) {return e%2==0;}), end(filtered));
 vector<int> sorted{begin(filtered), end(filtered)};
 // change that to contain reference wrappers to see the issue
 sort(begin(sorted), end(sorted),
      greater<int>{});
 transform(begin(filtered), end(filtered),
    begin(sorted),
    begin(filtered),
    [](auto to, auto from) {
      to.get() = from; return to;});
 copy(begin(input), end(input),
      ostream_iterator<int>{cout, ", "});
 return 0;
}

Live example here. Forgot to fork before modifying, sorry.

(Instead of using copies at last for types that are using heap allocated data move should probably be used. Though I'm not sure whether you can assign to a moved from object.)

Using a ... rather weird ... wrapper class instead of the std::reference_wrapper makes it possible to achieve the filtered sorting without having to use a vector with (copied or moved) elements of the value type:

template <class T>
class copyable_ref {
public:
  copyable_ref(T& ref) noexcept
  : _ptr(std::addressof(ref)), _copied(false) {}
  copyable_ref(T&&) = delete;
  copyable_ref(const copyable_ref& x) noexcept
  : _ptr (new int(*x._ptr)), _copied (true) {
  }
  ~copyable_ref() {
    if (_copied) {
      delete _ptr;
    }
  }
  copyable_ref& operator=(const copyable_ref& x) noexcept {
    *_ptr = *x._ptr;
  }
  operator T& () const noexcept { return *_ptr; }
  T& get() const noexcept { return *_ptr; }
private:
  T* _ptr;
  bool _copied;
};

Upon construction this class stores a pointer to it's argument, which is also modified when the copy assignment operator is used. But when an instance is copy constructed, then a heap allocated copy of the referenced (by the other) value is made. This way, it's possible to swap two referenced values with code similar to

Value a, b;
copyable_ref<Value> ref_a{a}, ref_b{b};
copyable_ref<Value> temp{ref_a};
ref_a = ref_b;
ref_b = temp;
// a and b are swapped

This was necessary because std::sort doesn't seem to use swap (found through ADL or std::swap) but code equivalent to the one above.

Now it's possible to sort a filtered "view" by filling a vector with (not copy constructed) instances of the weird wrapper class and sorting that vector. As the output in the example is showing, there's at most one heap allocated copy of a value type. Not counting the needed size for the pointers inside of the wrapper, this class enables filtered sorting with constant space overhead:

 vector<int> input {1,2,3,4,5,6,7,8,9};

 vector<copyable_ref<int>> sorted;
 sorted.reserve(input.size());
 for (auto & e : input) {
    if (e % 2 != 0) {
      sorted.emplace_back(e);
    }
 }
 sort(begin(sorted), end(sorted),
      greater<int>{});
 copy(begin(input), end(input),
      ostream_iterator<int>{cout, ", "});
 cout << endl;
 // 9 2 7 4 5 6 3 8 1

Finally, while this works quite well, I probably wouldn't use this in production code. I was especially surprised that std::sort wasn't using my own swap implementation, which led to this adventurous copy constructor.


You cannot generalise your code to work for sets and maps: Those are sorted by design, and they need that fixed order to function properly. And the unordered variants are, well, unordered and thus cannot maintain an order. But you can always (as long as you don't modify the container) use std::reference_wrappers inside of a vector to provide a sorted "view" of your data.

Daniel Jour
  • 15,896
  • 2
  • 36
  • 63
  • @ Daniel Jour Thanks for fixing my original iterator attempt. As for your custom iterator attempt, do you think it can be fixed? It looks very intriguing and I will try to fix it myself if you think it's fixable (it would be the "coolest" solution). – prestokeys Aug 15 '15 at 16:02
  • As far as I can see, you already did that in your own answer. This is basically what I did. What bothers me I'd that there's a copy necessary. This shouldn't be like that, I'm trying to fix that. – Daniel Jour Aug 15 '15 at 18:00
  • I know, I'm not yet ready. Unfortunately, I didn't fork, so the previous one is lost for the moment. – Daniel Jour Aug 15 '15 at 18:45
  • @ Daniel Jour You might be interested in this further generalization of the problem I came up with. Basically, those that satisfy pred will not only remain in their positions, but are then sorted by some other comparator in turn. And then further generalized to any number of preds and comparators. I solved it myself, but perhaps all the copying of elements will irk you enough to come up with a more elegant solution: http://ideone.com/ve91PU – prestokeys Aug 15 '15 at 21:20
0

Based on Beta's idea to sort using iterators, though I'm not sure what the time-complexity is. It also does not work with all containers, e.g. std::set, std::map.

template <typename Container, typename Comparator, typename Predicate>
void sortButKeepSomeFixed (Container& c, const Comparator& comp, const Predicate& pred) {
    std::vector<typename Container::value_type> toSort;
    std::vector<typename Container::iterator> iterators;
    for (typename Container::iterator it = c.begin();  it != c.end();  ++it) {
        if (!pred(*it)) {
            toSort.emplace_back(*it);
            iterators.emplace_back(it);
        }
    }
    std::sort(toSort.begin(), toSort.end(), comp);
    for (std::size_t i = 0; i < toSort.size(); i++)
        *iterators[i] = toSort[i];
}

std::vector<int> vector   = {5,7,1,8,9,3,20,2,11};
std::array<int, 9> array = {5,7,1,8,9,3,20,2,11};
std::list<int> list       = {5,7,1,8,9,3,20,2,11};
std::set<int> set         = {5,7,1,8,9,3,20,2,11};
std::map<double, int> map = { {1.5,5}, {1.2,7}, {3.5,1}, {0.5,8}, {5.2,9}, {7.5,3}, {0.1,20}, {1.8,2}, {2.4,11} };

template <typename Container>
void test (Container& container) {
    sortButKeepSomeFixed (container,
        std::greater<int>(),  // Ordering condition.
        [](int x) {return x % 2 == 0;});  // Those that shall remain fixed.
    for (int x : container) std::cout << x << ' ';
    std::cout << '\n';
}

int main() {
    test(vector);  // 11 9 7 8 5 3 20 2 1
    test(array);  // 11 9 7 8 5 3 20 2 1
    test(list);  // 11 9 7 8 5 3 20 2 1
    test(set);  // Does not compile.
    sortButKeepSomeFixed (map,
        [](const std::pair<double, int>& x, const std::pair<double, int>& y) {return x.second > y.second;},
        [](const std::pair<double, int>& x) {return x.second % 2 == 0;});
    for (const std::pair<double, int>& x : map)
        std::cout << "(" << x.first << "," << x.second << ") ";  // Does not compile.
}

Error for set and map is "Assignment of read-only location". Anyone know how to generalize this to work with sets and maps?

Update: So I suggest for set, maps, etc..., simply remove those elements that satisfy pred and the create a new set/map/... with Compare as their key_compare type. Like below. But it is only for set. How to generalize it to other containers that have key_compare types?

template <typename Container, typename Comparator, typename Predicate>
std::set<typename Container::value_type, Comparator, typename Container::allocator_type>
        sortButRemoveSomeElements (Container& c, const Comparator&, const Predicate& pred) {
    std::set<typename Container::value_type, Comparator, typename Container::allocator_type> set;
    std::vector<typename Container::value_type> keep;
    for (typename Container::iterator it = c.begin();  it != c.end();  ++it) {
        if (!pred(*it))
            keep.emplace_back(*it);
    }
    for (typename Container::value_type x : keep)
        set.emplace(x);  // Sorted by Comparator automatically due to std::set's insertion property.
    return set;
}

The test:

struct GreaterThan { bool operator()(int x, int y) const {return x > y;} };
std::set<int, GreaterThan> newSet = sortButRemoveSomeElements (set,
    GreaterThan{},  // Ordering condition.
    [](int x) {return x % 2 == 0;});  // Those that shall be removed.
for (int x : newSet) std::cout << x << ' ';  // 11 9 7 5 3 1
prestokeys
  • 4,817
  • 3
  • 20
  • 43
  • Aside: this implementation favors the case where most elements should be unaffected. Other alternatives are to copy back, re-evaluating `Pred` to find which elements need to be skipped over (which my gut says is the best general-purpose version), or keeping iterators to the places that need to be skipped (which favors the case that `Pred` is expensive). –  Aug 15 '15 at 15:06
  • 1
    It really doesn't make sense to sort maps and sets. Because they are already sorted by element itself in the case of sets and by key in the case of maps. That's why we use maps and sets, always keep them sorted so anytime we need to find/insert/remove an element, it's always `O(logN)`. – mostruash Aug 15 '15 at 15:22
  • Ok, then I suggest for set, maps, etc..., simply remove those elements that satisfy pred and the create a new set/map/... with Compare as their key_compare type. I've written a function that works to sets in this regard, but generalizing? – prestokeys Aug 15 '15 at 17:21