24

I want to implement some generic algorithms and I have a number of ideas how specialized algorithms could be implemented depending on certain traits of entities the algorithm is used with. However, it seems likely that I didn't come up with all special traits and I'd like to implement the generic version such that they could work with another specialized version.

For example, consider distance(begin, end) (yes, I know it is in the standad library; however, it is nice and simple and can be used to demonstrate my problem). A general version could look like this (I'm using std::ptrdiff_t instead of std::iterator_traits<It>::difference_type as another simplification):

template <typename It>
auto distance(It it, It end) -> std::ptrdiff_t {
    std::ptrdiff_t size{};
    while (it != end) {
        ++it;
        ++size;
    }
    return size;
}

Of course, if the iterator type is a random access iterator, it is much better to implement the algorithm using the difference between the two iterators. Naively just adding

template <typename It>
auto distance(It begin, It end)
     -> typename std::enable_if<is_random_access_v<It>, std::ptrdiff_t>::type {
    return end - begin;
}

doesn't quite work: both implementation are equally good matches for random access iterators, i.e., the compiler considers them to be ambiguous. The easy approach to deal with the situation is to change the general implementation to be applicable only for non random access iterators. That is, the SFINAE choices are made such that they are mutually exclusive while also covering the entire space.

Unfortunately, the set of implementations is still closed: without changing the signature of, at least, one of the implementations I can't add another implementation in case I have another idea for a generic implementation taking advantage of special properties. For example, if I want to add special processing for segmented ranges (idea: when the underlying sequence consists of segments as is, e.g., the case for std::deque<...> or std::istreambuf_iterator<cT>, process segments individually) it would be necessary to change the general implementation to be applicable only when the sequences isn't a random access and it isn't a segmented sequence. Of course, if I control the implementation that can be done. A user wouldn't be able to extend the set of generic implementations.

I am aware that the functions can be overloaded for special iterator types. However, that would require that every time an iterator with special capabilities is added it would need to implement the respective functions. The goal is to enable adding generic implementations which areimprovements in case the entities they are used with expose additional facilities. It is similar to the different iterator categories, although the properties are orthogonal to the iterator categories.

Thus, my question is:

  • Can generic algorithms be implemented such that new improvement idea can be added without changing the existing implementations and, if so, how?
  • Optional follow-up (I'm primarily interested in the question above but this one could be interesting, too): If that isn't possible, would this ability be added with concepts?
Dietmar Kühl
  • 150,225
  • 13
  • 225
  • 380
  • I suppose one example is the compare function you pass in to a generic sort function. Without changing the sort if you pass in a faster compare (eg using hashes instead of string compare) then the overall speed of your program will improve. – Yetti99 Dec 16 '14 at 16:29
  • With Concepts the more constrained version is preferred by overload resolution, so you can overload on a constrained version and an unconstrained base template; but even then the ordering of constraints are only partial, so if you have an iterator that can be both segmented and random access, you are out of luck. – T.C. Dec 16 '14 at 18:03
  • @T.C.: Put differently, Concepts would allow me to do what can be done with category tags already and have exactly the same constraint: if I want to have an orthogonal set of category tags (e.g., every iterator category can have a segmented variation) I'd need to modify the hierarchy (e.g. by templating it with an argument indicating whether it supports segmentation). – Dietmar Kühl Dec 16 '14 at 18:08
  • 1
    @DietmarKühl Well, with Concepts you can write a "random and segmented" version and have that picked by overload resolution without modifying the "random" version. What you can't do is having a "random" and a "segemented" version (without having a "random and segmented" one). – T.C. Dec 16 '14 at 18:42
  • nitpick: `std::distance` returns a signed difference type, so the simplification would be to use `ptrdfiff_t`. – TemplateRex Dec 16 '14 at 18:58

5 Answers5

18

One approach is a ranking-based overload mechanism. Assign each overload a rank and let overload resolution do the rest.
These are the helper traits:

template <unsigned i> struct rank : rank<i - 1> {};

template <> struct rank<0> {};

using call_ranked = rank<256>;

And this is an example usage:

template <typename It>
auto distance_ranked(rank<0>, It it, It end) -> std::size_t {
    std::size_t size{};
    while (it != end) {
        ++it;
        ++size;
    }
    return size;
}

template <typename It>
auto distance_ranked(rank<1>, It begin, It end)
     -> typename std::enable_if<is_random_access_v<It>, std::size_t>::type {
    return end - begin;
}

// Delegating function template:
template <typename... Args>
auto distance(Args&&... args)
    -> decltype(distance_ranked(call_ranked(), std::forward<Args>(args)...)) {
    return      distance_ranked(call_ranked(), std::forward<Args>(args)...);
}

Demo.
A rank with a higher number is more prioritized than one with a lower number. I.e. rank<1> causes the second overload to be selected over the first (rank<0>) if the matches would otherwise be identical.

If you wanted to add a segment-based implementation, use that as the condition for enable_if. Presumably segmented ranges and random-access ranges would be mutually exclusive, but if they aren't, assign the random-access one a higher priority. The general guideline could be: The more efficient an implementation is, the higher its rank.
Using this method, other implementations shouldn't be affected when introducing a new one. One has to make sure though that any two categories with non-empty intersections (that aren't covered by a category with higher rank) have a different rank - which constitutes a noticeable disadvantage.

Columbo
  • 60,038
  • 8
  • 155
  • 203
  • Interesting. Given that I wouldn't know the traits which need to be added at a given rank, clearly the known uses would be separated by gaps (reminds me a bit of ZX Spectrum Basic where you'd number lines 10, 20, etc. so you can slip statements in later without renumbering lines). Segmentation is actually orthogonal to the iterator category: `std::deque<...>` is a segmented random-access sequence. Obviously, for `distance(...)` the segmentation isn't helpful but for other algorithms (`find(...)`, `copy(...)`) processing individual segments matters. – Dietmar Kühl Dec 16 '14 at 18:16
  • @DietmarKühl Yes, once the "hierarchies" get more complicated it's feasible to leave some gaps in the order of ten or so, to allow a repeated and delayed deepening of the rankings. Also, your association with the ZX Spectrum Basic is interesting. (Btw: Out of curiosity, are you German? Your name sounds very much like it.) – Columbo Dec 16 '14 at 19:01
  • Yes, I'm originally from Berlin. As far as I can tell, my first name is only common in Germany and the u-umlaut in my second probably implies Germany or Turkey :-) I have updated my profile correspondingly. – Dietmar Kühl Dec 16 '14 at 19:09
9

Concepts prefers a more constrained overload over a less constrained overload, so you don't need to exclude the domain of a constrained implementation from the domain of an unconstrained implementation as you would with SFINAE. Your basic implementation could be written as:

template <typename It>
std::size_t distance(It it, It end) {
    std::size_t size{};
    while (it != end) {
        ++it;
        ++size;
    }
    return size;
}

template <typename It>
requires is_random_access_v<It>
std::size_t distance(It begin, It end) {
    return end - begin;
}

it's not necessary to exclude random access iterators (the domain of the constrained overload) from the domain of the unconstrained overload.

If all segmented iterators are random or all random iterators are segmented then again Concepts will prefer the more constrained overload and everything is fine. You just add the new constrained overload:

template <typename It>
requires SegmentedIterator<It>
std::size_t distance(It begin, It end) {
    // ...
}

If you have constrained overloads with overlapping ranges but neither subsumes the constraints of the other, overload resolution is ambiguous just as with SFINAE. Breaking the ambiguity is however a bit simpler, since it's only necessary to add a new overload to specify the behavior in the region of overlap:

template <typename It>
requires SegmentedIterator<It> && is_random_access_v<It>
std::size_t distance(It begin, It end) {
    // ...
}

SFINAE would require you to additionally exclude the overlap from the domain of the other overloads, but Concepts will prefer this more-constrained overload without requiring changes to the overloads for SegmentedIterator and is_random_access_v.

Concepts allows a user to easily extend your generic implementation with orthogonal overloads. Non-orthogonal overloads require more effort to specify behavior in the "overlap", but don't require changes to the original code as SFINAE would.

Casey
  • 41,449
  • 7
  • 95
  • 125
  • 3
    It seems the choice Concepts make can be implemented using a suitable category-tag hierarchy, although possibly using a simpler notation: with Concepts conjunctions are used to impose a weak order on concepts while category-tags use inheritance. Thinking in terms of a weak order on the concepts and an ambiguity being a choice of incomparable options actually implies that these need to be resolved - something I hadn't realized prior to your answer. – Dietmar Kühl Dec 16 '14 at 18:59
3

Note that you can "emulate" concepts in C++11 using Walter Brown's void_t trick (see void_t "can implement concepts"?).

Then you can provide a base implementation as a class template

template <typename It, class=void>
struct dist_impl {
auto operator()(It it, It end) -> std::size_t {
    std::size_t size{};
    while (it != end) {
        ++it;
        ++size;
    }
    cout << "base distance\n";
    return size;
}
};

and do partial specialization with void_t to let compiler pick the most specialized match

template <typename It>
struct dist_impl<It, void_t<typename std::enable_if<is_random_access<It>::value>::type>> {
auto operator()(It begin, It end) -> std::size_t {
    cout << "random distance\n";
    return end - begin;
}
};

The same "orthogonality" considerations apply.

Here is a complete example: http://coliru.stacked-crooked.com/a/e4fd8d6860119d42

Community
  • 1
  • 1
Roman L
  • 3,006
  • 25
  • 37
3

Overloading

First, let's take a look at just handling the distance function. Using the Tick library, you can implement concept traits for iterator traversals in C++ like this:

TICK_TRAIT(is_incrementable)
{
    template<class T>
    auto requires_(T&& x) -> tick::valid<
        decltype(x++),
        decltype(++x)
    >;
};

TICK_TRAIT(is_decrementable, is_incrementable<_>)
{
    template<class T>
    auto requires_(T&& x) -> tick::valid<
        decltype(x--),
        decltype(--x)
    >;
};

TICK_TRAIT(is_advanceable)
{
    template<class T, class Number>
    auto requires_(T&& x, Number n) -> tick::valid<
        decltype(x += n)
    >;
};

Now, if you write the two overloads it could be ambigous. So there are a couple of ways to resolve the ambiguity. First, you could use tag dispatching:

template <typename It>
auto distance(It it, It end, tick::tag<is_incrementable>) -> std::ptrdiff_t 
{
    std::ptrdiff_t size{};
    while (it != end) {
        ++it;
        ++size;
    }
    return size;
}

template <typename It>
auto distance(It begin, It end, tick::tag<is_advanceable>())
{
    return end - begin;
}

template<typename It, TICK_REQUIRES(is_incrementable<It>())>
auto distance(It begin, It end)
{
    return distance(begin, end, tick::most_refined<is_advanceable<It>());
}

Another way is to use conditional overloading provided by the Fit library. This lets your order the functions by importance in order to avoid ambiguity. You can use function objects or lambdas. Here is how to do it using generic lambdas:

FIT_STATIC_FUNCTION(distance) = fit::conditional(
    [](auto begin, auto end, TICK_PARAM_REQUIRES(
        tick::trait<is_incrementable>(begin) and 
        tick::trait<is_incrementable>(end)))
    {
        std::ptrdiff_t size{};
        while (it != end) {
            ++it;
            ++size;
        }
        return size;
    },
    [](auto begin, auto end, TICK_PARAM_REQUIRES(
        tick::trait<is_advanceable>(begin) and 
        tick::trait<is_advanceable>(end)))
    {
        return end - begin;
    }
);

Of course, this makes it a function object, which you would have to wrap in an actual function if you want to rely on ADL lookup.

Customization points

Can generic algorithms be implemented such that new improvement idea can be added without changing the existing implementations and, if so, how?

Yes they can, but you need to define customization points.

ADL Lookup

One way is through ADL lookup. The std::begin and std::end functions work in this way. So you could define the distance function in its own private namespace:

namespace detail {
    template<typename It, TICK_REQUIRES(is_incrementable<It>())>
    auto distance(It begin, It end)
    {
        // Implementation of distance
    }
}

Then you can define another function for the user to use in another namespace like this:

namespace my_lib {
template<typename It, TICK_REQUIRES(is_incrementable<It>())>
auto distance(It begin, It end)
{
    using detail::distance;
    distance(begin, end);
}
}

So now you can customize the distance function for certain types.

Template specialization

However, ADL could be inadvertently hijacked, and it would cause this to fail sometimes. So another way to provide customization points is to use template specialization. So you could define template that could be use to override the behaviour of distance, like this:

template<class It, class=void>
struct distance_op;

So then the distance function could be defined to prefer the distance_op first:

 FIT_STATIC_FUNCTION(distance) = fit::conditional(
    [](auto begin, auto end) FIT_RETURNS
    (distance_op<decltype(begin)>::call(begin, end)),
    [](auto begin, auto end, TICK_PARAM_REQUIRES(
        tick::trait<is_incrementable>(begin) and 
        tick::trait<is_incrementable>(end)))
    {
        std::ptrdiff_t size{};
        while (it != end) {
            ++it;
            ++size;
        }
        return size;
    },
    [](auto begin, auto end, TICK_PARAM_REQUIRES(
        tick::trait<is_advanceable>(begin) and 
        tick::trait<is_advanceable>(end)))
    {
        return end - begin;
    }
);

The FIT_RETURNS will constrain the lambda to when distance_op<decltype(begin)>::call(begin, end) is valid. So if you wanted to customize distance for std::queue, you could write:

template<>
struct distance_op<queue<int>::iterator>
{
    static void call(queue<int>::iterator begin, queue<int>::iterator end)
    {
        // Do queue-based distance
    }
};

Also, the second paramter is there so you can specialize it based on types that match certain constraints, so we could implement it for every iteratar where is_queue_iterator is true, like this:

template<Iterator>
struct distance_op<Iterator, TICK_CLASS_REQUIRES(is_queue_iterator<Iterator>())>
{
    static void call(queue<int>::iterator begin, queue<int>::iterator end)
    {
        // Do queue-based distance
    }
};

Concept Maps

Optional follow-up (I'm primarily interested in the question above but this one could be interesting, too): If that isn't possible, would this ability be added with concepts?

Yes, using concept maps you could easily extend these operations. So you could create a distance concept like this:

template<class Iterator>
concept Distance
{
    ptrdiff_t distance(Iterator begin, Iterator end);
}

Then we make a concept_map for when it is a Incrementable and when it is Advanceable:

template<Incrementable Iterator>
concept_map Distance<Iterator>
{
    ptrdiff_t distance(Iterator begin, Iterator end)
    {
        std::ptrdiff_t size{};
        while (it != end) {
            ++it;
            ++size;
        }
        return size;
    }
};

template<Advanceable Iterator>
concept_map Distance<Iterator>
{
    ptrdiff_t distance(Iterator begin, Iterator end)
    {
        return end - begin;
    }
};

And then later on the user could specialize the concept_map for new types as well:

template<class T>
concept_map Distance<queue<T>::iterator>
{
    ptrdiff_t distance(Iterator begin, Iterator end)
    {
        return end - begin;
    }
};
Paul Fultz II
  • 17,682
  • 13
  • 62
  • 59
  • 2
    Well, `concept_maps` are dead: the are from the original concepts proposal which was pulled. The current Concepts are [Concepts Light](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3889.pdf) which don't have concept maps. Otherwise I think the approaches described in this answer are just a refinement of how to implement a closed set of generic solutions and expand it for specific types. I'm seeking an approach allowing the addition of _generic_ implementations taking advantage of specialized operations. – Dietmar Kühl Dec 17 '14 at 02:29
0

I would use an utility class for that, because in that case it is easy to give it a default algorythm (for the generic case) keeping the possibilite to override it for specific uses. More or less what classes from the STL do with Allocator :

template < class T, class Alloc = allocator<T> > class list;

By default, you get an allocator<T> but can supply you own implementation.

template <class T, class Dist = dist<T> >
class dist_measurer {
public:
    static auto distance(T begin, T end) {
        return Dist.distance(begin, end);
    }
}

Then you create the generic dist<T>, and optionaly other specific implementations all with one single static method distance.

When you want to use the generic method on class X :

dist_measurer<X>.distance(x, y); // x and y objects of class X

If you have implemented another algorithm in dist2, you use it with :

dist_measurer<X, dist2<X> >.distance(x, y);
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
  • The objective is to have the algorithm implementation somehow figure out the ideal version automatically. `distance()` (or the algorithms I'm actually interested in) would be used in the body of other algorithms and effectively having the user specify the version to use isn't viable. – Dietmar Kühl Dec 16 '14 at 19:57