Say I need a new type in my application that consists of a std::vector<int> extended by a single function. The straightforward way would be composition (due to limitations in inheritance of STL containers):

class A {
    public:
        A(std::vector<int> & vec) : vec_(vec) {}  // copies vec into vec_
        int hash();
    private:
        std::vector<int> vec_;
};

This requires the user to first construct a vector<int> and then copy it in the constructor, which is bad when we are going to handle a sizeable number of large vectors. One could, of course, write a pass-through to push_back(), but this introduces mutable state, which I would like to avoid.

So it seems to me that we can either avoid copies or keep A immutable, but not both. Is this correct?

If so, the simplest (and efficiency-wise equivalent) way would be to use a typedef and free functions at namespace scope:

namespace N {
typedef std::vector<int> A;
int a_hash(const A & a);
}

This just feels wrong somehow, since future extensions will "pollute" the namespace. Also, calling a_hash(...) on any vector<int> is possible, which might lead to unexpected results (assuming that we impose constraints on A that the user has to follow, or that would otherwise be enforced in the first example).

My two questions are:

  • how can one keep both immutability and efficiency when using the above class code?
  • when does it make sense to use free functions as opposed to encapsulation in classes/structs?

Thank you!

bbtrb
  • What "limitations in inheritance of STL containers" are you talking about? Sure, they can't be derived from in a polymorphic way, but when you are OK with a wrapper, you don't need that anyway. That said, inheritance is not the right tool if you want the resulting object to be immutable (when the base is not). – Jan Hudec Apr 12 '11 at 05:53
  • Related question: http://stackoverflow.com/questions/679520/advice-on-a-better-way-to-extend-c-stl-container-with-user-defined-methods – Kirill V. Lyadvinsky Apr 12 '11 at 06:01

3 Answers

Hashing is an algorithm, not a type, and probably shouldn't be restricted to data in any particular container type either. If you want to provide hashing, it probably makes the most sense to create a functor that computes a hash one element (int, as you've written things above) at a time, then use std::accumulate or std::for_each to apply that to a collection:

namespace whatever {
struct hasher {
    int current_hash;

    hasher() : current_hash(0x1234) {}

    // incredibly simplistic hash: just XOR the values together.
    void operator()(int new_val) { current_hash ^= new_val; }
    // lets the functor returned by std::for_each convert to the final hash
    operator int() const { return current_hash; }
};
}

int hash = std::for_each(coll.begin(), coll.end(), whatever::hasher());

Note that this allows coll to be a vector or a deque, or you can use a pair of istream_iterators to hash data in a file.

Jerry Coffin

Ad immutable: You could use the range constructor of vector and create an input iterator to provide the content for the vector. The range constructor is just:

template <typename I>
A::A(I const &begin, I const &end) : vec_(begin, end) {}

The generator is a bit more tricky. If you currently have a loop that constructs a vector using push_back, it takes quite a bit of rewriting to convert it to an object that returns one item at a time from a method. Then you need to wrap a reference to it in a valid input iterator.
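A minimal sketch of that idea, under the assumption that the values can be computed from a position: square_iterator below is a hypothetical input iterator that produces 0, 1, 4, 9, ... on the fly, and A fills its private vector once through the range constructor. The size() and hash() members are only there to make the example checkable.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical generator: yields the squares 0, 1, 4, 9, ... one at a time.
class square_iterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef int value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const int* pointer;
    typedef int reference;  // values are computed on demand, so returned by value

    explicit square_iterator(int pos) : pos_(pos) {}
    int operator*() const { return pos_ * pos_; }
    square_iterator& operator++() { ++pos_; return *this; }
    square_iterator operator++(int) { square_iterator old(*this); ++pos_; return old; }
    bool operator==(const square_iterator& other) const { return pos_ == other.pos_; }
    bool operator!=(const square_iterator& other) const { return pos_ != other.pos_; }
private:
    int pos_;
};

class A {
public:
    // Range constructor: the vector is filled exactly once, then never modified.
    template <typename I>
    A(I const &begin, I const &end) : vec_(begin, end) {}

    std::size_t size() const { return vec_.size(); }

    // Simplistic XOR hash, only for illustration.
    int hash() const {
        int h = 0x1234;
        for (std::size_t i = 0; i < vec_.size(); ++i) h ^= vec_[i];
        return h;
    }
private:
    std::vector<int> vec_;
};
```

Constructing A a(square_iterator(0), square_iterator(4)) stores the four values 0, 1, 4, 9 without the caller ever building an intermediate vector.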

Ad free functions: Due to overloading, polluting the namespace is usually not a problem, because the symbol will only be considered for a call with the specific argument type.

Free functions are also found through argument-dependent lookup (ADL). That means the function should be placed in the same namespace as the class it operates on. Like:

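#include <vector>
// Note: the standard does not permit adding new functions to namespace std;
// this only illustrates how argument-dependent lookup finds the function.
namespace std {
    int hash(vector<int> const &vec) { /*...*/ }
}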
//...
std::vector<int> v;
//...
hash(v);

Now you can still call hash unqualified, but it is not visible for any other purpose unless you do using namespace std (I personally almost never do that and either just use the std:: prefix or do using std::vector to get just the symbol I want). Unfortunately I am not sure how argument-dependent lookup works with a typedef in another namespace.
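On the typedef point: typedef names do not contribute to the set of associated namespaces, so with typedef std::vector<int> A in namespace N, an unqualified call would still only search std and not N. A sketch of one way to get ADL working in your own namespace is to declare a real type there. The wrapper struct and the hash_of caller below are hypothetical; a_hash is the name used in the question.

```cpp
#include <cstddef>
#include <vector>

namespace N {
// Hypothetical wrapper: A is a distinct type declared in N, so N becomes an
// associated namespace for arguments of type A.
struct A {
    std::vector<int> values;
};

// Free function in the same namespace as A; unqualified calls find it via ADL.
inline int a_hash(const A& a) {
    int h = 0x1234;  // simplistic XOR hash, only for illustration
    for (std::size_t i = 0; i < a.values.size(); ++i) h ^= a.values[i];
    return h;
}
}

// Hypothetical caller outside N: no N:: prefix and no using-directive needed.
inline int hash_of(const N::A& a) {
    return a_hash(a);  // resolved through argument-dependent lookup
}
```

Because a_hash only accepts N::A, it can no longer be called on an arbitrary std::vector<int>, which also addresses the concern about unconstrained vectors.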

In many template algorithms, free functions (often with fairly generic names) are used instead of methods, because they can be added to existing classes, defined for primitive types, or both.

Jan Hudec
  • You aren't allowed to put anything new in the std namespace. – Dennis Zickefoose Apr 12 '11 at 06:47
  • @Dennis: Yes, you are. This isn't one of the things you can add though. The main thing you're allowed to add is a new specialization of an existing template, specialized over a user-defined type. – Jerry Coffin Apr 12 '11 at 13:24
One simple solution is to declare the private member variable as a reference and initialize it in the constructor. This approach introduces some limitations, but it's a good alternative in most cases.

class A {
    public:
        A(std::vector<int> & vec) : vec_(vec) {}
        int hash();
    private:
        std::vector<int> &vec_; // 'vec_' now a reference, so will be same scoped as 'vec'
};
iammilind
  • That is a reference to an object whose lifetime you have no control over. Bad bad idea. – Dennis Zickefoose Apr 12 '11 at 06:13
  • That's why I mentioned that it will introduce a limitation: the wrapper has the same scope as the original object. – iammilind Apr 12 '11 at 06:28
  • That's a *huge* limitation. It is quite unexpected behavior, and somebody will miss whatever giant warnings you provide about the subject and store `A` without storing `vec` and run up against undefined behavior. Simply don't do it except under specific circumstances, which I wouldn't classify this as one of. – Dennis Zickefoose Apr 12 '11 at 06:45
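For comparison, here is a sketch of an owning alternative that avoids the lifetime problem discussed in these comments. It is not taken from any of the answers and assumes a compiler with C++11 move semantics: the constructor takes the vector by value and moves it into the member, so a caller who passes a temporary or std::move(v) pays for a move rather than an element-wise copy. The make_values producer and the size() member are hypothetical and exist only to make the example checkable.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class A {
public:
    // By-value parameter plus std::move: the vector's buffer is transferred
    // into vec_ instead of being copied when the caller passes an rvalue.
    explicit A(std::vector<int> vec) : vec_(std::move(vec)) {}

    std::size_t size() const { return vec_.size(); }

    // Simplistic XOR hash, only for illustration.
    int hash() const {
        int h = 0x1234;
        for (std::size_t i = 0; i < vec_.size(); ++i) h ^= vec_[i];
        return h;
    }
private:
    // Owned by A; only const member functions are exposed, so the contents
    // cannot change after construction.
    std::vector<int> vec_;
};

// Hypothetical producer returning the values 0, 1, ..., n-1.
inline std::vector<int> make_values(int n) {
    std::vector<int> v;
    for (int i = 0; i < n; ++i) v.push_back(i);
    return v;
}
```

With this shape, A a(make_values(4)) and A b(std::move(v)) both construct the wrapper without copying the elements, while A c(v) still makes an explicit, visible copy when the caller wants to keep v.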