Incorporating boilerplate vector<> code in C++

Question

There are certain functions I use in manipulating vector<T> that come up a lot but for which the standard interface is clunky.

For example, suppose v is if of type vector<T> for a typename T. Ideally I would like to make calls like:

 v.probe(x) //returns true if x is in v
 v.sort() // sort v
 v.unique() // unique elements of x
 v.locate(x) // pointer to the element in v equal to x if it exists, otherwise NULL
 v.cat(w) // concatenate vector w to x
 v.erase(x) // erase all x’s from v

and so on.

These can all be done in stl, but the interface is clunky and wordy. For example, v.probe(x) would be something like

 std::find(v.begin(),v.end(),x)!=v.end()

and v.sort is

std::sort(v.begin(),v.end())

which makes std::sort very awkward to use in the case of complex lvalue expressions, requiring a temporary. (I.e., I cannot easily sort foo->bar.names[3] without a temporary.

Getting unique values of v in STL is even more ridiculously clunky, requiring, I believe:

std::erase(std::unique(std::sort(v.begin(),v.end()).end(),v.end())

I assume that virtually every C++ programmer has run into this issue or issues like them.

What is the best way around this?

I have considered 3 options:

Write special purpose code for each type of vector<> I use in the code.

Write a template header for common vector functions

Have a personal vector<T> class K<T> that subclasses both vector<T> and a mixin class algorithm_vector<T> with the algorithms I need.

Option 1 seems simple but gets very wordy after a while. Option 2 is not as simple as it seems. Consider writing a special function probe, like

    template<typename T> probe(const vector<T> & v, const T &x)....

Well, the thing is that we actually only want to pass in x by reference if the size of T is large, otherwise we want to use value. I don’t even know how to write a template function that intelligently decides whether to pass its argument by value or reference, and, even if I did, it sounds hard to do.

Option 3 is probably the cleanest, but has semantic issues that make it unclear.

In conclusion, my question this: what is the best way to add common, simple generic functions on vectors to a program?

(Also, as an optional point which might shed some insight into this, I don’t understand why STL makes it so wordy and awkward to do common things like search a vector for an element, or sort a vector. Is there some reason that STL make the most common usages so wordy, and doesn’t overload to default on the whole container?)

Pipable functions! http://pfultz2.com/blog/2013/02/14/extensions/ — ta.speot.is, Oct 19 '14 at 13:50

Vlad from Moscow · Answer 1 · 2014-10-19T13:50:21.917

5

I would use neither approach and select using standard algorithms. They are well-known and any programmer who will read your code will understand what you are trying to do.:)

For example function

template<typename T> probe(const vector<T> & v, const T &x)....

will only confuse readers. When I see standard algorithm std::find I need not to scroll your code that to find the definition of the function. When I will see function probe I need to scroll your code that to find the function definition and to understand what the function does. :)

edited Oct 19 '14 at 13:50

answered Oct 19 '14 at 07:52

Vlad from Moscow

301,070
26
186
335

I agree that the OP's `probe` function is not an improvement. However, consider `contains( 5, v )` versus `std::find( v.begin(), v.end(), 5 ) != v.end()`. Melikes the former. ;-) – Cheers and hth. - Alf Oct 19 '14 at 15:55
@Cheers and hth. - Alf In this case you can use std::any_of with standard predicate std::rqual_to. – Vlad from Moscow Oct 19 '14 at 16:17

score 3 · Answer 2 · answered Oct 19 '14 at 08:18

The important idea behind the STL is the three aspects containers, iterators and algorithms. The important observation was that most algorithms only differ in the kind of iterator they require, i.e. that they are ignorant to the underlying container (if any). This brings with it a lot of flexibility, but indeed, it creates a slightly clunky interface.

A slightly more modern approach comes from the observation that iterators usually come in pairs. Combining two iterators, you get a range. Check out Boost's Range library, which is based on this observation. Using this library or maybe just the idea behind it should give you the means to keep the flexibility while providing a less verbose syntax.

I'll be glad when/if ranges ever make it into the standard library. — Tim Seguine, Oct 19 '14 at 09:28

score 1 · Accepted Answer · answered Oct 19 '14 at 17:11

I would define some as-general-as-possible simple helper functions, e.g.

#include <algorithm>
#include <vector>
#include <iterator>
#include <stddef.h>         // ptrdiff_t

#define CPPX_ITEMS_OF( c )  std::begin( c ), std::end( c )

namespace cppx {
    using std::begin;  using std::end;
    using std::ostream;

    using Size = ptrdiff_t;

    template< class Container >
    auto n_items( Container const& c )
        -> Size
    { return end( c ) - begin( c ); }

    template< class Value, class Container >
    auto contains( Value&& value, Container&& container )
        -> bool
    { return (find( CPPX_ITEMS_OF( container ), value ) != container.end()); }

    template< class Container >
    void sort( Container&& c ) { sort( CPPX_ITEMS_OF( c ) ); }

    template< class Container, class EnableIf_ = typename Container::value_type >
    auto operator<<( ostream& stream, Container const& c )
        -> ostream&
    {
        stream << "{";
        bool first = true;
        for( auto const& value : c )
        {
            if( !first ) { stream << ", "; }
            stream << value;
            first = false;
        }
        stream << "}";
        return stream;
    }

    template< class Container >
    auto uniqued( Container&& c )
        -> decltype( begin( c ) )
    { return unique( CPPX_ITEMS_OF( c ) ); }

    template< class It, class Container >
    void erase_from( It const it, Container&& c ) { c.erase( it, c.end() ); }

    template< class Container >
    void shorten_to_unique( Container&& c ) { erase_from( uniqued( c ), c ); }

    template< class Value, class Container >
    auto find( Value const& v, Container const& c )
        -> decltype( begin( c ) )
    { return find( CPPX_ITEMS_OF( c ), v ); }

    template< class Dest_container, class Source_container >
    void append_to( Dest_container& dest, Source_container const& src )
    {
        dest.reserve( dest.size() + n_items( src ) );
        for( auto const& v : src ) { dest.push_back( v ); }
    }

    template< class Value, class Container >
    void remove_all( Value const& v, Container&& c )
    {
        c.erase( remove( CPPX_ITEMS_OF( c ), v ), c.end() );
    }
}    // namespace cppx

Direct use of standard library, compared to use of these functions:

#include <iostream>
using namespace std;
using cppx::operator<<;

#define T( e ) (cout << #e << "  >>>  ", e)

void use_std()
{
    vector<int> v = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};

    cout << T( find( v.begin(), v.end(), 3 ) != v.end() ) << endl;
    cout << T( find( v.begin(), v.end(), 7 ) != v.end() ) << endl;

    T( sort( v.begin(), v.end() ) );  cout << v << endl;
    T( v.erase( unique( v.begin(), v.end() ), v.end() ) );  cout << v << endl;
    cout << T( *find( v.begin(), v.end(), 9 ) ) << endl;
    T( ([&](){ vector<int> const x{ 5, 5, 5  }; copy( x.begin(), x.end(), back_inserter( v ) ); }()) );  cout << v << endl;
    T( v.erase( remove( v.begin(), v.end(), 5 ), v.end() ) );  cout << v << endl;
}

void use_cppx()
{
    using namespace cppx;
    vector<int> v = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};

    cout << T( contains( 3, v ) ) << endl;
    cout << T( contains( 7, v ) ) << endl;

    T( sort( v ) );  cout << v << endl;
    T( shorten_to_unique( v ) );  cout << v << endl;
    cout << T( *find( 9, v ) ) << endl;
    T( append_to( v, vector<int>{ 5, 5, 5  } ) );  cout << v << endl;
    T( remove_all( 5, v ) );  cout << v << endl;
}

auto main() -> int
{
    cout << boolalpha;

    cout << "    Using standard library:" << endl;
    use_std();
    cout << "\n* * *\n" << endl;
    cout << "    Using wrappers:" << endl;
    use_cppx();
}

Output:

    Using standard library:
find( v.begin(), v.end(), 3 ) != v.end()  >>>  true
find( v.begin(), v.end(), 7 ) != v.end()  >>>  false
sort( v.begin(), v.end() )  >>>  {1, 1, 2, 3, 4, 4, 5, 5, 6, 9}
v.erase( unique( v.begin(), v.end() ), v.end() )  >>>  {1, 2, 3, 4, 5, 6, 9}
*find( v.begin(), v.end(), 9 )  >>>  9
([&](){ vector<int> const x{ 5, 5, 5 }; copy( x.begin(), x.end(), back_inserter( v ) ); }())  >>>  {1, 2, 3, 4, 5, 6, 9, 5, 5, 5}
v.erase( remove( v.begin(), v.end(), 5 ), v.end() )  >>>  {1, 2, 3, 4, 6, 9}

* * *

    Using wrappers:
contains( 3, v )  >>>  true
contains( 7, v )  >>>  false
sort( v )  >>>  {1, 1, 2, 3, 4, 4, 5, 5, 6, 9}
shorten_to_unique( v )  >>>  {1, 2, 3, 4, 5, 6, 9}
*find( 9, v )  >>>  9
append_to( v, vector<int>{ 5, 5, 5 } )  >>>  {1, 2, 3, 4, 5, 6, 9, 5, 5, 5}
remove_all( 5, v )  >>>  {1, 2, 3, 4, 6, 9}

As I see it the complexity and redundancy of direct standard library use is a strong argument in favor of well-designed wrappers.

One should not have to actively decipher the source code, as with the direct use of the standard library.

@kdog: Good observation! This is a common problem with macros (it's easy to use them incorrectly), part of why macros are generally regarded as Evil, and you can find more information about it in any good C++ introductory text book. Even though the Stack Overflow [C++ book list](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) is now nothing like its former self, I do recommend taking a look there. — Cheers and hth. - Alf, Oct 19 '14 at 19:41
@kdog: But, what's generally Evil can be Good, or at least the *least Evil*, in certain contexts. E.g., depending on the coding guidelines you may not be able to use `#pragma once`, and then you have to use macro symbols as header guards. And e.g., in the above code you really would not want to express it without the macro `T`, because that would introduce redundancy where it would be too easy to slip up. The `CPPX_ITEMS_OF` macro is more borderline: it provides great convenience, and so it's good to have. If you don't like it you can therefore just remove it (do expand each use of it first). — Cheers and hth. - Alf, Oct 19 '14 at 19:49
n.b. i deleted my original comment it's not so much of a problem here as you point out. btw what is the advantage of the auto f() -> int or whatever syntax instead of int f() in function declaration? — kdog, Oct 19 '14 at 19:59
@kdog: Well as I see it, it makes it easy to scan for function names (always at roughly the same place), it makes it possible to use unqualified type for the result type of an out-of-class member function definition, and it's a single uniform declaration syntax. And one is technically forced to use it here and there. Thus I would need some really good argument to drag the old declaration syntax into that: it has no advantages that I know of. See e.g. (http://stackoverflow.com/questions/11215227/should-the-trailing-return-type-syntax-style-become-the-default-for-new-c11-pr/11215414#11215414). — Cheers and hth. - Alf, Oct 19 '14 at 20:18

Vladimir Fedosov · Answer 4 · 2014-10-19T13:42:52.940

0

STL iterator don't knows when all objects are passed. This is the advantage and disadvantage at the same time. The advantage because the sintaksis for (vector::iterator i = v.begin(); i != v.end(); ++i) ... is very close to pointers sintaksis and it is no problem for compiler to make very effective machine code and you can use pointers as iterators in templates. The disadvantage you describe in your post - you always need to take 2 iterators - for any work with collection.

edited Oct 19 '14 at 13:42

answered Oct 19 '14 at 12:52

Vladimir Fedosov

92
3

sort(v) , obviously all objects are passed. – kdog Oct 19 '14 at 19:40

score 0 · Answer 5 · answered Oct 19 '14 at 15:30

Define your own Vector class, in a separate namespace. Don't derive from std::vector, since stl containers don't have virtual destructors and you may end up wit some funky side effects. Instead own a std::vector inside your class, and declare all methods you may find useful:

a quick example:

template <typename T>
struct Vector
{
 //returns true if x is in v
 bool probe(x) { std::find(v.begin(),v.end(),x)!=v.end() }
private:
 std::vector<T> v_;
}

The chore of defining ctor/dtor/copy constructors etc is left to the reader Be aware that this approach is against the "good practices" of C++, where usually we keep data separated from algorithms. This is the foundation of the stl. I'd suggest you this interesting read, by Stepanov, the father of the stl. http://www.stepanovpapers.com/notes.pdf

Mainly it's a matter of code reuse, in the stl you can have algorithms that can be used on any stl container, or any other container for that matter. All it needs two interators and a comparison function. In other languages they solved this problem by having a base class from which every other container inherits from (Object in Java), and having methods on the object itself.

Makes sense. Some small typos in your example: probe(x) should be probe(const T&x), and v should be v_ . I actually do this a bit but there is a *lot* of boiler plate code one needs even just to write Vector. You need const and non_const versions of all the iterators, the subscript operator, and to explicitly overload all the methods of vector<> - it's a lot more work than it looks like. Question: isnt there some simple way to inherit from std::vector if the derived class has no new members, so no new destructors needed? — kdog, Oct 19 '14 at 19:46

Incorporating boilerplate vector<> code in C++

5 Answers5