I very often run into the problem that I want to implement a data structure and would like to allow users to extend it with new functionality; that is, to add behavior to the data structure without adding bytes to it. An example could be extending std::vector with a sum method:

#include <iostream>

#include <vector>

// for the header file
template<>
int std::vector<int>::sum();

//for the object file
template<>
int std::vector<int>::sum() {
    int s=0;
    for(auto v = this->begin(); v!=this->end(); ++v) s+=*v;
    return s;
}


int main() {
    std::vector<int> numbers;
    numbers.push_back(5);
    numbers.push_back(2);
    numbers.push_back(6);
    numbers.push_back(9);

    std::cout << numbers.sum() << std::endl;

    return 0;
}

See: http://ideone.com/YyWs5r

So this is illegal, since one may not add member functions to a class like this. Obviously this is a design decision of C++(11). It can be circumvented in two ways. The first is defining a non-member function:

int sum(std::vector<int> &v) { ... }

This is how std::sort works, so I guess it is the way C++(11) is intended to be used, and to the best of my knowledge it is the best way to do it in C++. However, it does not allow me to access private members of std::vector. Maybe I am evil for assuming that access to private members from a (sort-of) method is fair. However, I often want users of my classes to be unable to access certain things, while still allowing extenders of my classes to access them. For example, I can imagine that std::sort could be optimized using knowledge of, and access to, a specific container implementation.
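A minimal sketch of this free-function approach, assuming a read-only sum over std::vector<int>. The function body is an illustration and not part of the original post; it takes the vector by const reference, since summing does not modify it, and touches only the public interface:

```cpp
#include <vector>

// Non-member "extension": uses only the public interface of std::vector.
int sum(const std::vector<int>& v) {
    int s = 0;
    for (int x : v) s += x;  // range-based for goes through v.begin() / v.end()
    return s;
}
```

The call site then reads sum(numbers) instead of numbers.sum(); for the four values pushed in the example above (5, 2, 6, 9) it returns 22.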

The second way is inheriting from std::vector, but I find that plain unacceptable for these reasons:

  • if two parties have extended the class with methods, which one would like to use, then one would need to convert from one child class to another. This is ludicrous, as one converts data to data without actually changing bytes, as both child classes (can) have exactly the same memory implementation and segmentation. Please also note that data conversion in general is boilerplate code, and boilerplate code should imho be considered evil.
  • one is unnecessarily mixing functionality with data structures, for example, a class name sum_vector or mean_vector is completely.

As a short reminder, I am not looking for answers like "You cannot do that in c++", I already know that (Add a method to existing C++ class in other file). However, I would like to know if there is a good way to do functional class extensions. How should I manage accessing private fields? What would be reasons why it is unreasonable for me to want private field access; why can't I discriminate between extender and user access?

Note: one could say that an extender needs protected access and a user needs public access, however, like I said, that would be for the inheritance way of extending, and I dislike it strongly for the aforementioned reasons.

Herbert
  • Mandatory reading: [GotW #84](http://www.gotw.ca/gotw/084.htm) – D Drmmr Jun 08 '14 at 12:27
  • @DDrmmr plus ["How non-member functions improve encapsulation"](http://www.drdobbs.com/cpp/how-non-member-functions-improve-encapsu/184401197) – TemplateRex Jun 08 '14 at 12:28
  • Please, don't! Methods are quite a bad thing. Those should be just a syntax sugar for passing structure as first argument. As they aren't, you should avoid methods as often as possible. – polkovnikov.ph Jun 08 '14 at 12:38
  • @polkovnikov.ph : I agree, I prefer functions over methods. However, should I just make data access stuff as public as possible, either by just making fields public or by adding public access methods? – Herbert Jun 08 '14 at 12:42
  • you should first think what interface your data structure should provide, and then provide that interface. Simply making all data members public is a plain aggregate, which does not provide any invariant or other abstraction. – TemplateRex Jun 08 '14 at 12:46
  • @TemplateRex with making stuff public, I mean adding an adequate amount of public methods which allow efficient data access while interfacing as little as possible to implementation details. – Herbert Jun 08 '14 at 12:50
  • @Herbert So either you are writing your own class and you can set the interface at your convenience, or you are using existing libraries and then you can extend the interface through non-member functions that access the public library interface. What exactly is your question then? Please clarify. – TemplateRex Jun 08 '14 at 12:55
  • I am figuring it out :) I did not write a question because everything was quite clear to me already, and still have a lot of reading material ;) Thank you for your comments. – Herbert Jun 08 '14 at 13:21
  • Extension methods were a great C# improvement; I will be glad if the C++ committee adds something similar in the future. – Manu343726 Jun 08 '14 at 13:21
  • @Manu343726 Yep. Together with traits, case classes, colored local type inference, path-dependent types and implicits, please. – polkovnikov.ph Jun 08 '14 at 17:29
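The point made in the comments above (design a small public interface first, then extend through non-member functions) can be sketched as follows. Everything here is hypothetical and chosen only for illustration: the namespace lib, the class Samples, and the function mean do not come from the thread.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

namespace lib {  // hypothetical library namespace

// Small class: owns the data and exposes a minimal read-only interface.
class Samples {
public:
    explicit Samples(std::vector<double> values) : values_(std::move(values)) {}

    std::size_t size() const { return values_.size(); }
    std::vector<double>::const_iterator begin() const { return values_.begin(); }
    std::vector<double>::const_iterator end() const { return values_.end(); }

private:
    std::vector<double> values_;  // hidden from users and extenders alike
};

// "Extension" added later, possibly by another party: a non-member function
// in the same namespace, written purely against the public interface.
inline double mean(const Samples& s) {
    if (s.size() == 0) return 0.0;  // assumption: an empty sample set has mean 0
    return std::accumulate(s.begin(), s.end(), 0.0) / static_cast<double>(s.size());
}

}  // namespace lib
```

Because mean lives in the same namespace as Samples, an unqualified call such as mean(s) is found through argument-dependent lookup. Two parties can each add such functions without any conversion between derived classes, which addresses the objection raised against inheritance in the question.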

2 Answers

You should never want to access private members of Standard Containers because they are not part of their interfaces.

However, you already can extend the functionality of the Standard Containers like std::vector: namely through the judicious use of iterators and Standard Algorithms.

E.g. the sum functionality is given by a non-member function that uses the begin() and end() functionality of std::vector:

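#include <iterator>     // std::begin, std::end
#include <numeric>      // std::accumulate lives here, not in <algorithm>
#include <type_traits>  // std::decay
#include <utility>      // std::declval
#include <vector>

// Ret defaults to the container's element type with reference and const removed.
// std::declval is needed because the parameter c is not in scope in the template header.
template<class Container,
         class Ret = typename std::decay<decltype(*std::begin(std::declval<Container const&>()))>::type>
Ret sum(Container const& c)
{
    return std::accumulate(std::begin(c), std::end(c), Ret{});
}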
TemplateRex
  • Here you rely on the fact that the data structure of a std::vector can efficiently be accessed for calculating a sum using iterators; this is not easily possible for every data structure. Consider a graph data structure where the actual lists of edges and nodes are hidden, allowing the user only to create a graph object. On the other hand, an extender knows about graphs and should have access to these lists, in order to give the user Dijkstra or other functionality the original class designer did not (want to) think about. – Herbert Jun 08 '14 at 12:34
  • @Herbert A generic graph that hides its edges and nodes is a pretty bad data structure, *unless it is used as an implementation detail*, such as it is in `std::map`. Take a look at [Boost.Graph](http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/index.html) e.g. for a state-of-the-art C++ Graph interface that is as extensible as the Standard Library. – TemplateRex Jun 08 '14 at 12:38
  • @Herbert The general philosophy of the C++ Standard Library is that the separation of `M` containers and `N` algorithms (connected through iterators) allows [`M+N` instead of `M*N` code complexity](http://stackoverflow.com/a/11948413/819272). It's a very effective approach that you should strive to mimic. – TemplateRex Jun 08 '14 at 12:41
  • My specific graph structure is a time-dependent graph (and later on a time-expanded graph) for schedule data; to the best of my knowledge and *googleability* such libraries do not exist yet. However, I do think I get your point about function / data structure separation. – Herbert Jun 08 '14 at 12:48
  • I like your answer, in particular after your N+M N*M story. If you agree and wish, please add a sentence stating that good c++ design principles adhere to classes which only contain elementary methods, such that higher order functionality is added to functions (in the same namespace). Since this prevents duplicate code for higher order functionality on classes with similar interfaces but different implementations. – Herbert Jun 08 '14 at 15:13
  • Why limit it to vectors? `template <class C> auto sum(C const& c) -> C::value_type` ... – aschepler Jun 08 '14 at 15:20
  • @aschepler tnx, updated but with `decltype` so that C arrays should also work (C++14 auto return type will be even easier). – TemplateRex Jun 08 '14 at 16:11
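The last comment mentions that a C++14 deduced return type makes this easier. A possible sketch, assuming a C++14 compiler (the alias name Value is made up for illustration):

```cpp
#include <iterator>     // std::begin, std::end
#include <list>
#include <numeric>      // std::accumulate
#include <type_traits>  // std::decay_t
#include <vector>

// C++14: the return type is deduced from the return statement.
template <class Container>
auto sum(const Container& c) {
    using std::begin;
    using std::end;
    // Element type with reference and const stripped, e.g. int for int const&.
    using Value = std::decay_t<decltype(*begin(c))>;
    return std::accumulate(begin(c), end(c), Value{});
}
```

The same function then serves std::vector<int>, std::list<double>, and a plain C array such as int a[] = {1, 2, 3}, which is the M+N effect described in the comments: the algorithm is written once and every container with iterators can use it.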

Consider something like this:

#include <iostream>

class Foo{
    int a;
public:
    Foo(int a){this->a = a;}
    int getA(){return this->a;}
    void * extendedMethod(void *(*func)(int, char **, Foo*), int argc, char **argv){
        return func(argc, argv, this);
    }

};

void * extendFooWith(int argc, char **argv, Foo* self){
    /* You can call methods on self... but still no access to private fields */
    std::cout << self->getA();
    return self;
}

int main(int argc, char const *argv[])
{
    Foo foo(5);
    foo.extendedMethod(extendFooWith, 0 /*argc*/, NULL /*argv*/);
    return 0;
}

That's the best way I thought of extending a class with a method. The only way to access private fields would be from inside extendedMethod() i.e. something like this is possible: return func(this->a, argv, this); but then it is not that generic any more. One way to improve it could be checking inside extendedMethod() what kind of pointer was passed and according to it access the private fields you are interested in and pass those to func(), but this will require adding code to extendedMethod() for every other method you will extend your class with.

Alexandru Barbarosie
  • That would extend the function at runtime, and includes boilerplate code, thank you though :) Moreover, I don't think this is the C++ way to do things. – Herbert Jun 08 '14 at 15:00
  • @Herbert true, but function pointers are not a bad practice ;) – Alexandru Barbarosie Jun 08 '14 at 15:10
  • function pointers + boilerplate code are; your solution is (for my taste) much too close to writing an interpreter in C++ in order to support syntax sugar. – Herbert Jun 08 '14 at 15:16
  • And in general I try to avoid pointers in c++, if (humanly) possible, I would use functors instead. – Herbert Jun 08 '14 at 15:17
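The functor-based alternative mentioned in the last comment could look roughly like this. It is a sketch, not the answerer's code: the member name apply and the functor AddOffset are hypothetical, and Foo is reduced to the parts needed here.

```cpp
#include <utility>

class Foo {
    int a;
public:
    explicit Foo(int a) : a(a) {}
    int getA() const { return a; }

    // Accepts any callable (lambda, functor, function pointer) and invokes it
    // with *this. The callable still sees only Foo's public interface.
    template <class F>
    auto apply(F&& f) const -> decltype(std::forward<F>(f)(*this)) {
        return std::forward<F>(f)(*this);
    }
};

// A stateful functor: carries its own configuration instead of argc/argv.
struct AddOffset {
    int offset;
    int operator()(const Foo& foo) const { return foo.getA() + offset; }
};
```

With Foo foo(5), the call foo.apply(AddOffset{10}) yields 15, and the types are checked at compile time with no void* involved. Note that foo.apply(f) does nothing that f(foo) does not already do, so this ends up equivalent to the plain non-member function call recommended in the first answer.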