19

I'm wondering why the STL doesn't overload their algorithm functions such that I can call them by simply providing a container and not taking the more verbose way to pass begin + end iterators. I of course understand why we also want to use an iterator pair for processing subsequences of a container / array, however, almost all calls to these methods are using a whole container:

std::for_each(myVector.begin(), myVector.end(), doSomething);

I'd find it more convenient, readable and maintainable to just write

std::for_each(myVector, doSomething);

Is there a reason STL doesn't provide these overloads? [EDIT: I don't mean to replace the interface with this restricted one but to also provide a container-based iterface!] Do they introduce ambiguity? I'm thinking about something like this:

template<typename _Container, typename _Funct>
inline _Funct for_each(_Container c, _Funct f) {
    return for_each(begin(c), end(c), f);
}

Am I missing something?

leemes
  • 44,967
  • 21
  • 135
  • 183
  • 1
    Why bother? Many algorithms are very useful on iterators that are not immediately obtained as `x.begin()`, and there's no benefit in providing a more more restrictive interface... And when it gets to *output*, it would make even less sense to provide anything but an iterator. – Kerrek SB Dec 22 '12 at 14:28
  • 5
    @KerrekSB Did you misunderstand something? I'm talking about an overloaded alternative interface, not resticting the interface. I know and I also emphasized in my question that we sometimes want to provide iterators. I'm talking about the more readable alternative when we want to use the begin and end iterators. Also, I answered your question "Why bother?" in my question: Better readability, maintainability and convenience in the cases where we want this *special case*. – leemes Dec 22 '12 at 14:30
  • 4
    Use Boost.Range? http://www.boost.org/doc/libs/1_52_0/libs/range/doc/html/range/reference/algorithms.html – Pubby Dec 22 '12 at 14:36
  • 3
    I understand that, but my point remains. You can already write your example as `for (auto const & x : myVector) { doSomething(x); }`, which is plenty readable, and you're free to write your own convenience wrappers... but once you start, there's no end to asking why isn't XYZ also in the standard library. You have the building blocks, but somebody had to draw a line somewhere. – Kerrek SB Dec 22 '12 at 14:36
  • @KerrekSB Good point about the "line". The `for_each` just was a simple example to get the point of what I want to say. I could also have written `min_element` in which case we don't want to write a range-based loop. – leemes Dec 22 '12 at 14:40

7 Answers7

18

They do introduce ambiguity for many algorithms. A lot of <algorithm> looks like

template<class iterator>
void do_something(iterator, iterator);

template<class iterator, class funct>
void do_something(iterator, iterator, funct);

If you add additional overloads

template<class container, class funct>
void do_something(container, funct);

the compiler will have some trouble figuring out what do_something(x, y) means. If x and y are of the same type, it will match both iterator = type and container = type, funct = type.*)

C++11 tried to solve this with "concepts" that could recognize the difference between a container and an iterator. However, these "concepts" turned out to be too complicated to make it into the standard, so neither did these overloads.

*) compilers disagree here, the Comeau compiler claims that it is ambiguous, g++ 4.5 and MSVC 10 calls the first function.


After an extremely long discussion in the comments, here is one example where it doesn't work as expected - using a container adapter that can also double as a predicate.

#include <iostream>
#include <vector>

template<class iterator>
void test(iterator, iterator)
{
   std::cout << "test iterator\n";
}

template<class iterator, class predicate>
void test(iterator, iterator, predicate)
{
   std::cout << "test iterator, predicate\n";
}

template<class container, class predicate>
void test(const container& cont, predicate compare)
{
   std::cout << "test container, predicate\n";

   test(cont.begin(), cont.end(), compare);
}

template<class container>
class adapter
{
public:
   typedef typename container::iterator   iterator;

   adapter(container* cont) : cont(cont)
   { }

   iterator begin() const
   { return cont->begin(); }

   iterator end() const
   { return cont->end(); }

   bool operator()(const iterator& one, const iterator& two)
   { return *one < *two; }

private:
   container* cont;
};

int main()
{
   std::vector<int>   v;

   adapter<std::vector<int>>   a(&v);

   test(a, a);

}

Output:

test iterator

http://ideone.com/wps2tZ

Community
  • 1
  • 1
Bo Persson
  • 90,663
  • 31
  • 146
  • 203
  • Finally, this is the answer I was looking for. I guessed that problem... Thanks! (I know concepts and I'd love to see them soon, however, Stroustrup said it will still take a couple of years as he/they want to make sure that it is perfectly specified.) – leemes Dec 22 '12 at 15:22
  • 1
    One last thing: Can't be almost everything you want to specify with a concept also be specified with `enable_if`? I don't know enough to answer my question by myself, but to me it looks like a possible solution. Still, the point is: Maybe they didn't want to add such an interface because they decided to "draw the line" at this point... – leemes Dec 22 '12 at 15:27
  • 1
    @leemes: I wonder why you didn't ask this question: *"the cases where ambiguity arises, can be solved by explicitly passing the template arguments, and fortunately we don't have **real** such cases in the standard algorithm (which is why Bo Persson didn't come up with one :P), so why we don't have the overloads?"* – Nawaz Dec 22 '12 at 15:31
  • @Nawaz Example: `min_element(first, last)` vs. `min_element(first, last, compareFunction)` (and everywhere we can provide a compare method which defaults to `operator<`) -- The thing is, I only thought that *maybe* this overload introduces ambiguity, but with thinking more I could have answered my question by myself. Shame on me. :) The thing with the concepts / enable_if still counts, however. – leemes Dec 22 '12 at 15:36
  • @leemes: How is that a valid example? (it seems you didn't understand this answer also). – Nawaz Dec 22 '12 at 15:38
  • @Nawaz If we'd introduce `min_element(container, function)` and call `min_element(myVector.begin(), myVector.end())`, which one is called? Since it's templated without type restriction, it's ambiguous. That's the whole point of my question, and yes, I understood the answer perfectly fine. – leemes Dec 22 '12 at 15:39
  • @leemes: Try doing that. It will call the iterator version. ;-) – Nawaz Dec 22 '12 at 15:40
  • But not if the iterator has .begin() and .end() defined, both returning new iterators + is callable, in which case also the shorter container version fits. Would be strange, I know, however, it's ambiguous *in general*, since STL isn't restricted to STL containers (which don't define such strange methods...) Also, this is the reason why we don't encounter problems when defining and using such overloads in practice. – leemes Dec 22 '12 at 15:49
  • @leemes: actually, this has to do with the `Iterator` version being more specialized (as it requires both types to be equal), see [this](http://liveworkspace.org/code/2KxC8X$0). Weird things would happen though if you had an `iterator` and a `const_iterator`, see [this](http://liveworkspace.org/code/8uij7$1). Here the ambiguity requires `enable_if` magic... and this can get hairy quickly. – Matthieu M. Dec 22 '12 at 15:54
  • @leemes: *"But not if the iterator has .begin() and .end() defined, both returning new iterators + is callable, in which case also the shorter container version fits"*. Is this a *real* example? Why don't you question the design of such iterator in the first place, which has `begin` and `end()` members? It smells a bad design, no? – Nawaz Dec 22 '12 at 15:54
  • @MatthieuM.: Your second example will not work anyway (i.e even if you remove the container version). It is in fact undefined behavior. `enable_if` will not help you there. – Nawaz Dec 22 '12 at 16:09
  • @Nawaz: I don't see off hand why it would be undefined behavior. – Matthieu M. Dec 22 '12 at 16:17
  • @MatthieuM.: I mean if you make it work by implementing a function that takes `iterator` and `const_iterator` pair. I think `x.begin() != x.cend()` invokes UB. As for the existing algorithm, the code wouldn't even compile. Anyway, the point was that the problem in your second example has nothing to do with overloads that take container instead of iterator-pairs, and it is not ambiguity problem either. – Nawaz Dec 22 '12 at 16:34
  • I tried to create a quick and dirty example where the overload fails (is ambiguous). I failed to do so. I got it so far that it is callable in both versions (first, last) or (container, functor) when calling it with `m.begin(), m.end()`. You can test it by making a separate function out of the overloaded function (call it for example `mymin_element`) and it will find the minimum element from `m.begin()`, being `'f'`. Maybe someone can try to provoke the ambiguity. http://ideone.com/Ecljue – leemes Dec 22 '12 at 16:38
  • @Nawaz: actually, [it works fine](http://liveworkspace.org/code/8uij7$4) and I cannot think of any reason for going "undefined" there; conversion from `iterator` to equivalent `const_iterator` is always well defined and comparisons are allowed. However now it's definitely ambiguous with a container/func overload. – Matthieu M. Dec 22 '12 at 16:57
  • @MatthieuM.: Alright, it seems `iterator` is convertible to `const_iterator`, which means comparing them is well-defined. But then, as I said, the problem in the second version has nothing to do with overloads that take container. – Nawaz Dec 22 '12 at 17:03
  • 1
    @Nawaz: Yes, realized that. – Matthieu M. Dec 22 '12 at 17:23
  • @Bo Persson: This answer turns out to be an incorrect answer (read the comments above). If you believe it is correct, then you must demonstrate with an example. – Nawaz Dec 22 '12 at 17:35
  • I added an example where it doesn't work (though the compilers seem to disagree on *why* it doesn't work). – Bo Persson Dec 22 '12 at 20:32
10

Unfortunately, this is a far more generic problem; namely, that iterators were designed to beat those crappy C APIs and Java-style "Put the algorithms as methods of each individual container" solutions. They are the first-generation generic solution and there's no surprise that, on reflection, they were not as good as other possible generic solutions obtainable after we spend twenty years thinking about it.

Adding these container overloads would be just band-aiding over a tiny part of the problem space; and it might even make things worse in the future. The solution is ranges, which C++ is looking to introduce ASAP.

Bejmax
  • 945
  • 7
  • 16
Puppy
  • 144,682
  • 38
  • 256
  • 465
  • "They are the first-generation generic solution..." Hmm, this is C++11 and STL algorithms already existed before. To me, it doesn't sound like the first attempt to get it "right". (With "right" I only mean this tiny part which, in my opinion, is an easy step which STL could have done easily.) For the range thing, this pretty much sounds like the perfect solution which I am missing here. – leemes Dec 22 '12 at 14:43
  • 2
    @leemes: the STL *is* a first-generation generic solution. C++11 is still backward compatible (mostly) with C++03, and cannot ditch the older forms in favor of the newer ones: they would have to coexist. Unfortunately, there would be ambiguities between them as it stands now (look at Bo Persson answer as to the explanation). However nothing is definitive, and if we ever get concepts in, we might finally get ranges *in the STL*. For now, Boost.Range should alleviate your woes. – Matthieu M. Dec 22 '12 at 15:25
  • @MatthieuM. Yes, I realized that they have to co-exist, this is why I said "overloaded interfaces" in my question. I already guessed that it will be ambiguous. I hope it's OK that I accepted Bo Persson's answer because this is exactly the point I wanted to know. – leemes Dec 22 '12 at 15:29
3

To understand that I think one must have to understand the philosophy of C++ algorithms. Lets ask this question first:

Why C++ algorithms are implemented as free functions instead of member functions?

Well the answer is pretty much simple : to avoid implementation explosions. Suppose you have M containers and N algorithms, and if you implement them as members of the containers, then there will be M*N implementations. There are two (related) problems in this approach:

  • First, it doesn't make use of code reuse. Most of the implementations will be repeated.
  • Second, implementation explosions, which is a direct consequence of the above.

C++ solves these issues by implementing them as free functions, so that you have only N implementations. Each of the algorithm that operates on a container takes a pair of iterators, that defines the range. If you want overloads that take container, instead of pair of iterators, then the Standard have to provide such overloads for each of the algorithm, and there will be 2*N implementations which pretty much defeat the very purpose why C++ has separated the algorithms from the containers in the first place, and half of these functions don't do anything which cannot be done by the other half.

So I don't think it is that much an issue. Just to avoid one single argument, why implement N more functions (which also impose some restriction on its usage such as you cannot pass pointers to it)? However, if programmers want such functions in their utility, they can implement them anytime along with many others based on the Standard algorithm!


You commented:

Well, the 2*N implementations are in fact only N implementations. The other N ones are inlined overloads which directly call the "real" version of the algorithm, so they are a header-only thing. Providing container overloads doesn't defeat the purpose to separate algorithms from containers, as (as you can see in my example) they can use templates to handle all types of containers.

Based on this logic, one can very well argue for M*N algorithms. So make them member functions too (and call the free functions internally)? I'm sure many OOP guys would prefer

auto result = container.accumulate(val);

over

auto result = std::accumulate(container.begin(), container.end(), val);
Nawaz
  • 353,942
  • 115
  • 666
  • 851
  • 2
    Well, the `2*N` implementations are in fact only `N` implementations. The other `N` ones are inlined overloads which directly call the "real" version of the algorithm, so they are a header-only thing. Providing container overloads doesn't defeat the purpose to separate algorithms from containers, as (as you can see in my example) they can use templates to handle all types of containers. – leemes Dec 22 '12 at 14:52
  • 1
    Based on this logic, one can very well argue for `M*N` algorithms. So make them member functions too (and call the free functions internally)? I'm sure many OOP guys would prefer to write `container.accumulate(val)` over `std::accumulate(begin, end, val)`. – Nawaz Dec 22 '12 at 15:07
  • 1
    Totally not, since adding a new container requires adding code in the `M*N` case, but not in the `2*N` case. I'm talking about two interfaces: "call algorithm on a range as specified by two iterators" and "call algorithm on a container as specified by its iterators as returned by `std::begin(...)` and `std::end(...)`". I'm **not** talking about iterfaces as "call algorithm on a vector", "... on a linked list", etc. – leemes Dec 22 '12 at 15:10
  • 1
    @leemes: But it also seems to be a good idea to include them as members, no? It is definitely a good idea for the OOP guys, no? – Nawaz Dec 22 '12 at 15:14
  • It is, but that's not part of my question. However, it can easily be done with preprocessor hacks so we never have to write code for them. And no, I don't like this idea, but it solves the `N*M` problem by writing `O(N + M)` instead of `O(N*M)` code. – leemes Dec 22 '12 at 15:17
  • 1
    @leemes: But one can ask that question which seems to be as good question as the one asked by you. – Nawaz Dec 22 '12 at 15:18
  • Of course. But still, it's not part of my question. But I see what you want to tell me: We have to draw a line where we say: STL implements this, but no more. However, in my personal opinion, container overloads for the algorithms would have been very useful. And yes, I implemented them by myself. The reason that they aren't available per default made me think that it maybe is for a good reason (like ambiguity)... – leemes Dec 22 '12 at 15:20
  • 1
    @leemes: One thing probably you skipped in my answer is that you can implement them anytime, so it is not an issue. It depends on personal taste. – Nawaz Dec 22 '12 at 15:20
3

Here is a relevant answer from Herb Sutter's blog: Why no container-based algorithms. It shows counterexamples just like Bo Persson did in his answer above.

Paul Jurczak
  • 7,008
  • 3
  • 47
  • 72
  • Normally, we;'d ask you to at least summarize the relevant points from a linked page, but considering those points were already addressed by Bo Persson this answer adds no value to the existing answers. It would be more appropriate to add the link as a comment to Bo's answer. – MSalters Oct 01 '14 at 12:52
  • @MSalters _this answer adds no value to the existing answers_: this is subjective. Even if the contents was exactly the same, which is not, a different style of presentation may be valuable for some readers. Take as an example the multitude of books about C++: great number of them _add no value_, but they still get good reviews from happy readers. – Paul Jurczak Oct 01 '14 at 17:17
2

There is a Range Operators library with intention to fix that. Verbosity was cut several times over.

Your example would look something like this:

auto newVector = myVector * doSomething;

Yes, doSomething - is without parenthesis.

Familiar idiom from shell (with std algorithm):

auto t = vector<int>{3,2,1,4} | sort | unique; 
Leonid Volnitsky
  • 8,854
  • 5
  • 38
  • 53
  • 1
    Well, I like the idea having ranges (= iterator pairs), but I don't find your version of algorithm call readable. It's unverbose and easy to write, but you have to know exactly what `*` does. A reader thinks that you are trying to take an inner product or multiply a vector with a scalar or something like that. This is why most people don't like operator overloading (I like it in general). – leemes Dec 22 '12 at 14:54
  • There are people that like IBM's JCL - it is explicit and precise but not terse and simple. And there are people that like Unix command line filter with `|`, where you can replace with one line - pages of JCL. – Leonid Volnitsky Dec 22 '12 at 14:57
  • I totally agree that it is a matter of taste. If you use this library / syntax very frequently in a project, people get used to it, so it should be fine, as it is the case with stream operators `>>` / `<<`, which used to be bit shift operators only in C... – leemes Dec 22 '12 at 15:02
  • 1
    `std::string("Hello World") * std::toupper` reduces the verbosity, true, but it kills the readability severely. – Nawaz Dec 22 '12 at 15:03
  • @Nawaz - I bet there were people that were saying `cout << x` - is confusing! I am not forcing this on everyone. This started as personal codejam library. – Leonid Volnitsky Dec 22 '12 at 15:05
0

It should be pointed out that it's very easy to define your own trivial wrappers to add containerized versions.

For example:

template<typename Container, typename Func>
Func for_each(Container& c, Func f) {
    return std::for_each(c.begin(), c.end(), f);
}

Now you can make the simple call you want. There's no ambiguity because your wrappers aren't in the std namespace. You can define overloads that take const Container&. If you want versions that call the C++-11 const iterator methods (e.g. cbegin()), I think you'll need to name the wrapper differently. I use for_each_const.

divegeek
  • 4,795
  • 2
  • 23
  • 28
0

Obviously, as other users mentioned, it's a tricky problem, so unfortunately it's been a long time and there's still no solution in the standard library. There are, however, already range libraries available such as Boost::Range and the one in the Adobe Source Libraries that provide not only the simplicity of the interface you describe in your question, but some fancier features as well.

Your example works perfectly with Boost (we are using namespace boost::range::adaptors below):

boost::for_each(myVector, doSomething);

We can also slice myVector quickly and easily:

boost::for_each(myVector | sliced(10, 20), doSomething)

We can even zip myVector with another, filter by a predicate, and sample every other element of the resulting pairs in a single, simple statement (this requires that you unpack in doSomethingElse the tuples produced by boost::combined):

boost::for_each( boost::combined(myVector, myOtherVector) | strided(2), doSomethingElse)