10

I often find myself wanting to create a comparator objects for a struct or class which simply extracts one member of the class and does the usual < comparison on that.

For example:

struct student {
   int id;
   std::string name;
};

// sort by ID
std::sort(students.begin(), students.end(), [](const student& l, const student& r){ return l.id < r.id; });

There's a lot of boilerplate there, in particular because we have to repeat the declaration for l and r. Is there a way in the standard library to create a comparator based on an "extractor" function which returns an object to compare on?

Something like:

std::sort(students.begin(), students.end(), compare_on([](const student& s){ return s.id; });

I'm using C++11, but also interested if there are solutions in later standards that don't apply in C++11 (so I can add something to my "reasons to upgrade" list).

I'm asking here about using a single member as the comprand, and also the default comparison "less than", but bonus points for techniques that compose easily, e.g. that allow you use two fields in lexicographic order, or to change the comparison operator.

BeeOnRope
  • 60,350
  • 16
  • 207
  • 386
  • 1
    A term I've heard for this case is a *projection* (would look like `sort(students.begin(), students.end(), &student::id)`). The standard library doesn't support it – Justin Sep 05 '18 at 18:30
  • With C++14 you can reduce the pain *a bit* by changing `[](const student& l, const student& r)` to `[](const auto& l, const auto& r)`. – Jesper Juhl Sep 05 '18 at 18:35
  • @JesperJuhl - yup, good point, it reduces the declaration boilerplate, but it doesn't get rid of the main problems that I see: you still need to declare both left and right, and implement the comparison logic yourself. Since comparison functions are generally symmetric in their arguments, the first seems redundant. Implementing the "logic" yourself isn't too bad when it's just `l.f < r.f`, but consider if you wanted to compare on two fields, then you are left writing some abomination like `l.f1 < r.f1 || (l.f1 == r.f1 && l.f2 < r.f2)` etc. – BeeOnRope Sep 05 '18 at 18:42
  • Related: https://codereview.stackexchange.com/q/169887 – Justin Sep 05 '18 at 18:42
  • 1
    @BeeOnRope I wasn't trying to solve the entire problem. Just the easy bit ;-) – Jesper Juhl Sep 05 '18 at 19:15
  • @Jesper - thanks makes sense! As it turns out it's actually the hard bit for me as I'm stuck on c++11 at the moment. – BeeOnRope Sep 05 '18 at 19:21
  • 1
    @BeeOnRope: `std::tie` might help with several members: `return std::tie(l.f1, l.f2) < std::tie(r.f1, r.f2);` and you might even write lambda/function `as_tuple` to avoid to repeat `.f1, .f2`. – Jarod42 Sep 07 '18 at 17:47

4 Answers4

7

What you're effectively looking for is to allow projections to be passed into algorithms. N4128 proposed this for the standard library, and C++20 will have these for many of the algorithms.

But until then, we can do this ourselves. Write a new overload of sort:

struct identity {
    template <typename T>
    T&& operator()(T&& t) const noexcept { return std::forward<T>(t); }
};

// because no std::less<> in C++11 yet
struct less {
    template <typename T, typename U>
    constexpr bool operator()(T const& lhs, U const& rhs) const {
        return lhs < rhs;
    }
};

template <typename Range, typename Comp=less, typename Proj=identity>
void sort_proj(Range& range, Comp comp={}, Proj proj={}) {
    using std::begin;
    using std::end;
    auto first = begin(range), last = end(range);
    using reference = typename std::iterator_traits<decltype(first)>::reference;

    std::sort(first, last,
        [&](reference lhs, reference rhs) {
            return comp(std::ref(proj)(lhs), std::ref(proj)(rhs));
        });
}

std::ref(f)(x) is a trick to get INVOKE functionality in C++11. It basically allows you to pass pointers to members as the projection. This implementation would let you write:

sort_proj(students, less{}, &student::id);

Note that the projection is independent of the ordering. So I could do all kinds of things pretty easily:

sort_proj(students);                            // sort on students, if they're ordered
sort_proj(students, greater{}, &student::name); // decreasing, by name
sort_proj(students, less{},                     // by name, then id
    [](student const& s) {
        return std::tie(s.name, s.id);
    });

This kind of approach is super useful at eliminating a lot of boilerplate from algorithms in general. I have a header that's just full of projection-based overloads of many of the commonly used standard algorithms.

Barry
  • 286,269
  • 29
  • 621
  • 977
4

You can define a utility class

template <class Fct> class compare_on {
  public:
    compare_on(Fct&& get) : get(std::forward<Fct>(get)) {}

    template <class T> bool operator()(const T& lhs, const T& rhs)
    {
       return get(lhs) < get(rhs);
    }

  private:
    Fct get;
};

and then pass it to std::sort just like you depicted it (using C++17 class template argument deduction)

std::sort(students.begin(), students.end(),
    compare_on([](const student& s){ return s.id; }));

As of C++17, @Justin pointed out int the comments that the actual comparison can be improved (with #include <functional>) such that

return std::invoke(get, lhs) < std::invoke(get, rhs);

which allows for an instantiation with data member references:

std::sort(students.begin(), students.end(), compare_on(&student::id));

When bound to C++11, forget about std::invoke and go with an explicit instantiation of compare_on. The latter isn't suitable for lambdas, hence the usual argument-deduction make_* helper:

template <class Fct> auto make_compare_on(Fct&& get)
    -> decltype(compare_on<Fct>(std::forward<Fct>(get)))
{
    return compare_on<Fct>(std::forward<Fct>(get));
}

Note that you can remove the trailing return type in C++14.

As a final note, the naming here should be improved: compare_on is misleading, as it hides what the function object really does - comparing by operator <. Maybe compare_less_then or something similar would be better, or adding another template parameter that can be specified as one of the standard predicates (std::less etc.).

lubgr
  • 37,368
  • 3
  • 66
  • 117
  • 2
    Consider using `std::invoke` to call `get`, which would make this possible: `std::sort(students.begin(), students.end(), compare_on(&student::id))`. In C++11, there are [alternatives](https://stackoverflow.com/q/32918679/1896169) – Justin Sep 05 '18 at 18:35
  • Thanks. It seems like this would be easy to extend, e.g., by passing N rather than 1 extractor method and internally implementing the "lexicographic" comparison `get1(lhs) < get1(rhs) || (get1(lhs) == get1(rhs) && get2(rhs) < get2(rhs) ...)` too. Yes, I agree that in principle the comparison operator itself shouldn't be hardcoded to `<` but also be a template parameter, perhaps defaulting to `std::less`. – BeeOnRope Sep 05 '18 at 18:37
  • @BeeOnRope The class itself does't need to be extended to achive that, instead pass a lamdba that returns a tuple: `compare_on([](const student& s){ return std::make_tuple(s.id, s.name); })`. – lubgr Sep 05 '18 at 18:49
3

With Your suggestion I propose such compare_on:

template<class T>
auto compare_on(T &&t){
    return [t](const auto &l, const auto &r){
        return l.*t < r.*t; 
    };
}
... 
std::sort(students.begin(), students.end(), compare_on(&student::id));

It requires C++14. It uses pointer to member to get exactly the behaviour You asked for.

In C++11, using similar idea it would look like this:

template<class U>
class comparison{
public:
    comparison(const U &ptr) : memberPtr(ptr){}

    template<class T>
    int operator()(const T &l, const T &r){
        return l.*memberPtr < r.*memberPtr;
    }

private:
    U memberPtr;
};

template<class T>
comparison<T> compare_on(T &&t){
    return comparison<T>(std::forward<T>(t));
}

As @Barry suggests, You should replace all the l.*t with std::invoke(t, l) to make it more generic in C++17.

bartop
  • 9,971
  • 1
  • 23
  • 54
  • 2
    Similar to Justin's comment on the other answer (and, in fact, for the inverse reason), use `invoke()` instead of explicitly mandating a pointer to member data (this solution wouldn't even allow a pointer to member function, much less an arbitrary operation). – Barry Sep 05 '18 at 18:45
3

range-v3 implements projection:

// sort by ID
ranges::sort(students, std::less<>{}, &student::id);
Jarod42
  • 203,559
  • 14
  • 181
  • 302