5

I'm in the process of adapting some C++03 code to take advantage of the new possibilities of C++11, notably to introduce move semantics in the C++11 way. But I come across a struct where this is causing me a headache, because it contains an anonymous union. It has the global form

struct tree
{ // data of |tree|
  enum {tag0,tag1, tag2, ... } kind;
  union
  { type1 field1;
    type2 field2;
    ...
  };

  // constructors
  tree () : kind(tag0) {} // default, empty state
  tree (const& type1 x) : kind(tag1), field1(x) {} // variant 1
  tree (const& type2 x) : kind(tag2), field2(x) {} // variant 2
  ...
  ~tree(); // recursively clean up any branches

  // other methods of |tree|
  ...
}; // |struct tree|

with of course more meaningful names, but which are not to the point here. The struct represents a kind of parse tree, a multiply recursive structure that may branch out in different ways at each node, whence the use of a union. So many of the member types actually contain pointers to another tree; currently they are raw pointers (basically because C++03 would not allow anything but POD types in a union) that are managed by the containing tree, but now that C++11 no longer has such a restriction, I certainly plan to replace them by more structured values (using smart pointers).

(Please don't start moralising me that I should not be using a union in the first place; I think this is actually a more natural solution for the problem at hand than one based on polymorphism, and in any case there is a sufficiently large body of code using it that I don't want to change the approach radically.)

The code actually already defines a limited form of move semantics (in order to avoid deep copies having to be made all the time while bottom-up constructing a tree), in the form of a set_from method that will copy, into a default-constructed (so empty) tree structure, the top-level from another such structure passed by reference (using a switch to copy the proper active of the union). After this it sets kind=tag0 in the other so that any sharing of branches with the copy is prevented. It is not difficult to transform this method into a move constructor C++11 style.

However it might be nice to also have a move assignment operator. I'd like to use the copy-and-swap idiom which requires me to write a swap operation. Unlike what was suggested in this question, I think using std::swap is out of the question, since that would be using the move assignment operator operator for tree that I'm trying to define; a chicken-and-egg situation. So I had better write something myself.

The difficulty is that since two nodes my be of different variants, there is no question of calling swap on the fields. Copying or swapping as a whole just the union component of two nodes cannot be done since (1) the union component does not itself know which variant is active, and (2) the union is anonymous, so I cannot even declare such operations. It has to be at the level of the tree struct that swapping is to be defined. A memcpy based solution would currently be possible though ugly, but it won't even be possible once fields of the union will no longer be POD types.

So I am a bit out of good ideas. I can think of a solution that does a two-level switch on the active variants of both nodes, but I'm not thrilled at the prospect. One other thing that comes to mind is interchange roles, and define move assignment directly (which is not too different from move construction, though it requires the destination to be set to a default state while destroying any descendants, as if in calling the destructor, before moving in the source), and then implement swap in the traditional way using a temporary, one move construction and two move assignments. Ironically in this solution the move assignments are into nodes that are guaranteed to be in their empty variant, but the move assignment has to deal with the more general situation anyway. I'm note sure whether this solution would have strong exception safety.

Is there any easier solution that I am overlooking?

Community
  • 1
  • 1
Marc van Leeuwen
  • 3,605
  • 23
  • 38
  • *"I certainly plan to replace them by more structured values (using smart pointers)"* Well you have to call/write the destructors manually anyway for a union containing members with non-trivial destructors. – dyp Jun 29 '14 at 15:11
  • 1
    could you please clarify your question a bit more, and show us some code that you would like to work but currently doesn't. Otherwise this could be closed as "unclear what you are asking" – TemplateRex Jun 29 '14 at 15:19
  • 1
    You could look up how `boost::variant` does it. I guess you could implement it easily with a copy ctor, explicit destructor call and then switch + inplace-new. An optimization is to check if the types are the same and swap the members if that's the case. – dyp Jun 29 '14 at 15:21
  • @TemplateRex: For code that you would like to work but currently doesn't: just given `tree a,b`, I would like to be able to say `a=std::move(b);`. I currently cannot because there is no move-assignment operator, which is what I am asking about. – Marc van Leeuwen Jun 29 '14 at 15:26
  • @dyp (first comment): The smart pointers will of course handle their own destruction; the union can define an empty destructor, since the real destructing is being done from the `tree` destructor (which has a switch on `kind` to do the right thing). Thinking of it, that is going to be a real problem: what is the name of the destructor of an anonymous union? – Marc van Leeuwen Jun 29 '14 at 15:30
  • @MarcvanLeeuwen I don't see how the smart pointers could handle anything themselves: You have to call their destructors explicitly (in the switch). OTOH, inserting an element with a non-trivial dtor *forces* you to write a destructor for `tree`. – dyp Jun 29 '14 at 15:32
  • The members of the anonymous union become members of the enclosing class for most (all?) purposes. For all I know, the union itself does not have (de facto?) any special member functions. – dyp Jun 29 '14 at 15:36
  • "empty state" doesn't make sense. You have to pick one union member that is the first active one. – Kerrek SB Jun 29 '14 at 15:50
  • @dyp or even better, refactor this code in terms of `boost::variant` – TemplateRex Jun 29 '14 at 15:52
  • The things in the `union` are POD; what's wrong with, say, a bytewise swap? – tmyklebu Jun 29 '14 at 17:04
  • Looking rapidly at `boost::variant`, this seems not at all a good solution for my problem. It looks more like a polymorphism based solution, where every function wanting to do anything with the union has to decline itself into as many versions as there are variants. This does not match at all the current organisation of my code; I see for instance no easy way to test if two instances of a `boost::variant` currently are of the same kind; this kind of test is frequent in my code. – Marc van Leeuwen Jun 29 '14 at 17:36
  • @MarcvanLeeuwen `v.which()` will give the index of the current type, this allows you to make comparisons – TemplateRex Jun 29 '14 at 19:32
  • btw, don't you currently also have a constructor for every possible type in the union? and similarly for other member functions? – TemplateRex Jun 29 '14 at 19:42
  • @TemplateRex As code above shows, there is indeed one constructor for each variant, plus a default constructor for "no variant at all" (maybe I should add an explicit "empty" variant to represent that, to comply with the C++ rules stating that a union variable must always be in a definite variant, even if there is no way to find out which). But no other member functions or external functions are multiplied like that; instead they do a `switch` on the `kind` field (the destructor for `tree` is one example). I could break every such switch into a bunch of visitor functions, but don't want to. – Marc van Leeuwen Jun 30 '14 at 07:42

1 Answers1

4

So don't use copy and swap for move assignment. set_from by rights ought to be nothrow, since the whole point of it is to avoid allocating resources. I'll assume that it's nothrow or can be made to be.

If you don't have a clear() function already then implement one using code taken from the destructor (which frees resources) or from set_from (which puts the source into a clear or clear-ish state) and offering the strong exception guarantee. Then this move assignment offers the strong exception guarantee:

tree &operator=(tree &&other) {
    clear();
    set_from(other);
    return *this;
}

You need that move construction and assignment both leave other in a state where it's OK to call set_from on it, so either make sure of that or else do some extra work after calling set_from. other.clear() should do it but might be overkill.

Everything else follows nicely. std::swap will work as soon as you have working move construction and assignment, but might be slightly below optimal. You could perhaps improve on it with the knowledge that (as you've already identified) the targets of the moves used by std::swap don't need to be cleared because they've just been moved-from themselves and your move implementation left them already clear.

You might also feel like optimizing the case where kind == other.kind, so all that's needed is a swap of the relevant field.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699