10

What are the benefits of using boost::any_range? Here is an example:

typedef boost::any_range<
    int
  , boost::forward_traversal_tag
  , int
  , std::ptrdiff_t
> integer_range;

void display_integers(const integer_range& rng)
{
    boost::copy(rng,
                std::ostream_iterator<int>(std::cout, ","));

    std::cout << std::endl;
}

int main(){
    std::vector<int> input{ ... };
    std::list<int> input2{ ... };
    display_integers(input);
    display_integers(input2);
}

But the same functionality with more efficiency can be achieved with a template parameter, which satisfies the ForwardRange concept:

template <class ForwardRange>
void display_integers(const ForwardRange& rng)
{
    boost::copy(rng,
                std::ostream_iterator<int>(std::cout, ","));

    std::cout << std::endl;
}

So I am searching for scenarios when it is worth to use any_range. Maybe I am missing something.

Evgeny Panasyuk
  • 9,076
  • 1
  • 33
  • 54
Gabor Marton
  • 2,039
  • 2
  • 22
  • 33

2 Answers2

17

This technique is called Type Erasure. There is a full article describing the pros and cons on the example of any_iterator: On the Tension Between Object-Oriented and Generic Programming in C++.

It is possible to hide (in a separate file/library) the implementation/definition of

void display_integers(const integer_range& rng)

But in the case of

template <class ForwardRange>
void display_integers(const ForwardRange& rng)

you have to provide source code to users (or at least make explicit instantiations somewhere).

Moreover, in the first case, display_integers will be compiled only once, but in the second it will be compiled for every type of the passed range.

Also, you may have somewhere

integer_range rng;

and during lifetime of rng you may assign ranges of different types to it:

vector<int> v;
list<int> l;
integer_range rng;
rng = v;
rng = l;

The biggest disadvantage of type erasure is its runtime cost; all operations are virtual, and cannot be inlined (easily).


P.S. another famous example of type erasure is std::function

compor
  • 2,239
  • 1
  • 19
  • 29
Evgeny Panasyuk
  • 9,076
  • 1
  • 33
  • 54
  • Besides hiding the implementation, the compiler will generate a single definition of the `display_integers` function, rather than generating a version for each iterator type that is ever used. This is called *code bloat*. – David Rodríguez - dribeas Mar 15 '13 at 21:47
  • @DavidRodríguez-dribeas, I already said this: "in first case display_integers will be compiled only once...". But note, in type erasure case - different "Implementation" classes will be generated when you will pass values of different types, it is not "free". – Evgeny Panasyuk Mar 15 '13 at 22:00
  • Yes, I just wanted to emphasize that the direct implication is reducing the *template code bloat* – David Rodríguez - dribeas Mar 15 '13 at 22:06
  • 1
    Yes, instead of generation code for {Ranges}x{Algorithms}, type erasure (any_range) would involve some code generation for {Ranges}x{any_range specializations}. (of course that is not strict - depends on which combinations are actually used). Anyway, as for me, I see main advantage not in reduced binary code size, but in reduced dependencies and in posibility to fully isolate all implementation details in stand-alone library, without need to expose code to users. – Evgeny Panasyuk Mar 15 '13 at 22:25
  • This is not necessarily true; if the template is coded in such a way as to isolate any non-type related code, the "bloat" caused by templated code can be drastically reduced. This is seen in most implementations of `std::map`, which have a red-black tree implementation in terms of `void*`, which is conserved between any instantiation of `std::map`. In effect, the template is merely an adaptor onto a more generic underlying datatype. – Alice Feb 21 '14 at 08:44
  • @Alice Technique you refer is useful indeed (it was described at [Technical Report on C++ Performance](http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf)). But it cannot be universally applied to every case. For instance, for `sort` function you either know element's `sizeof` at compile time (resulting in performance) or element's `sizeof` is passed at runtime via parameter (resulting in small code size). And from my practice, such trade-off exists in many cases where templates are used (i.e. you cannot get both speed and space, via techniques like `void*` type-erasure). – Evgeny Panasyuk May 07 '14 at 10:14
  • @EvgenyPanasyuk That's a severe case of premature optimization in your example, and one of the leading causes of increased compilation time in C++. The difference between the value being encoded directly into an instruction vs being put into a register is so small that in almost all functions, there is a third concern that trumps both: caching. It is true in theory that direct space/speed trade offs cannot be solved, but caches allow us to subvert that trade off curve in practice. In my experience, playing to the cache has always resulted in better results than theoretical considerations. – Alice May 14 '14 at 11:01
  • @Alice Difference is more significant than you describe: when size in register,we have to do runtime loop in order to copy value(even for size=1),while in case when we know size at compiletime -copying can be just several cheap instructions. Regarding caching and memory issues in general - yes, it is one of main sources of performance loss or gain. But first, there are many algorithms which are already memory-friendly(for instance, starting from some recursion level quicksort works within the lowest cache level);second, templates help to reduce memory indirection count(which type-erasure has). – Evgeny Panasyuk May 15 '14 at 20:35
  • @Alice Your allegations regarding "theoretical considerations" are groundless - this is not "theoretical" thing, but this is one main tools which helps C++ to keep cost of abstraction at very low level, refer http://www.stroustrup.com/new_learning.pdf. Appealing to "premature optimization" is pointless - we are not talking about some specific application, where reasonings about "premature" actions are appropriate. We are talking about techniques in general, and there is nothing wrong to know advantages of each one. – Evgeny Panasyuk May 15 '14 at 20:46
  • @EvgenyPanasyuk It's also one of the things that keeps C++'s compile times enormous, as noted, and this is not for "keeping cost of abstraction low"; D has shown a much better way in pushing such things to link time. There is *everything* wrong with talking about two choices as if they are comparable; the advantages of one so overweight the advantages of the other that it's worthless to talk about the other as if it was viable in all but the most trivial of cases. What you are talking about is comparable to goto vs structured; the advantages are so minor as to warrant no serious discussion. – Alice May 19 '14 at 06:24
10

boost::any_range can used for returning ranges from functions. Imagine the following example:

auto make_range(std::vector<int> v) -> decltype(???)
{   
    return v | filter([](int x){ return x % 2 == 0;})
        | transform([](int x){ return x * 2;});
}

*: gcc does not compile the above without wrapping it in std::function, hower clang 3.2 works by directly passing the lambda

It is very difficult to know what is being returned from this function. Also, lambda and decltype don't work together so we cannot deduce the type using decltype when passing only a lambda. One solution is to use boost::any_range like the one in your example (another workaround is to use std::function as pointed out by Evgeny Panasyuk in the comments):

integer_range make_range(std::vector<int> v)
{
     return v | filter([](int x){ return x % 2 == 0;})
        | transform([](int x){ return x * 2;});
}

Working example with gcc using std::function.

Working example with clang passing lambdas directly.

Community
  • 1
  • 1
Jesse Good
  • 50,901
  • 14
  • 124
  • 166
  • 2
    I think the problem can be solved in an other way as well: Defining a global lambda variable in a detail or hidden namespace, then use the decltype for that lambda variable. Here is a full working example forked from yours: http://liveworkspace.org/code/3hdxki$6 – Gabor Marton Mar 15 '13 at 22:21
  • 1
    @GaborMarton: Right, just keep in mind there probably is [no need for the free function anymore](http://liveworkspace.org/code/3hdxki$8). – Jesse Good Mar 15 '13 at 22:27
  • In that particular case(in answer) - there is no problem with lambda, because you do use std::function. decltype(v | boost::adaptors::filtered(std::function{})) . http://liveworkspace.org/code/czlAX$0 – Evgeny Panasyuk Mar 15 '13 at 22:43
  • @EvgenyPanasyuk: Thanks for pointing that out! I made the example more complicated so `decltype` cannot deduce it. – Jesse Good Mar 15 '13 at 22:56
  • @JesseGood, It is still possible to use decltype - http://liveworkspace.org/code/18kcVF$0. You could just remove std::function, and pass lambda directly into boost::adaptors::filtered - http://liveworkspace.org/code/ErdWe$0 – Evgeny Panasyuk Mar 15 '13 at 22:59
  • @EvgenyPanasyuk: Yes, you could use `std::function` rather than `boost::any_range` as a workaround. I will mention that in my answer. Also, thanks for pointing out that you can pass a lambda directly (I was using gcc which fails to compile the code, so perhaps it is a gcc bug). +1 to your answer for the comments. – Jesse Good Mar 15 '13 at 23:15