
I have looked at the following — related — questions, and none of them seem to address my exact issue: one, two, three.

I am writing a collection of which the elements (key-value pairs) are stored along with some bookkeeping information:

struct Element {
    Key key;
    Value value;
    int flags;
};

std::vector<Element> elements;

(For simplicity, suppose that both Key and Value are standard-layout types. The collection won't be used with any other types anyway.)

In order to support iterator-based access, I've written iterators that overload operator-> and operator* to return to the user a pointer and a reference, respectively, to the key-value pair. However, due to the nature of the collection, the user is never allowed to change the returned key. For this reason, I've declared a KeyValuePair structure:

struct KeyValuePair {
    const Key key;
    Value value;
};

And I've implemented operator-> on the iterator like this:

struct iterator {
    size_t index;

    KeyValuePair *operator->() {
        return reinterpret_cast<KeyValuePair *>(&elements[index]);
    }
};

My question is: is this use of reinterpret_cast well-defined, or does it invoke undefined behavior? I have tried to interpret the relevant parts of the standard and examined answers to questions about similar issues. However, I failed to draw a definitive conclusion from them, because:

  • the two struct types share some initial data members (namely, key and value) that differ only in const-qualification;
  • the standard does not explicitly say that T and cv T are layout-compatible, but it doesn't state the contrary either; furthermore, it mandates that they have the same representation and alignment requirements;
  • two standard-layout class types share a common initial sequence if the first however many members have layout-compatible types;
  • for union types containing members of class type that share a common initial sequence, it is permitted to examine the members of that initial sequence using either of the union members (9.2p18);
      – there's no similar explicit guarantee made about reinterpret_casted pointers-to-structs sharing a common initial sequence;
      – it is, however, guaranteed that a pointer-to-struct points to its initial member (9.2p19).

Using merely this information, I found it impossible to deduce whether the Element and KeyValuePair structs share a common initial sequence, or have anything else in common that would justify my reinterpret_cast.

As an aside, if you think using reinterpret_cast for this purpose is inappropriate, and I'm really facing an XY problem and therefore I should simply do something else to achieve my goal, let me know.

curiousguy
  • Are you asking for a [tag:language-lawyer] judgement? Does it work actually? `reinterpret_cast` is almost always the wrong approach, shouldn't a `const_cast` work well for what you're claiming? – πάντα ῥεῖ Oct 23 '15 at 22:31
  • @πάνταῥεῖ yes, I'm asking for a language-lawyer judgement. **I cannot possibly tell whether it actually works** or it just "appears to be working" since I don't know whether it invokes undefined behavior. I don't know whether `const_cast` should work in this case, but let me try it – it's still not clear, however, whether `const_cast` compiling would mean that this is well-defined. – The Paramagnetic Croissant Oct 23 '15 at 22:35
  • _"however, whether const_cast compiling would mean that this is well-defined."_ Of course, I think that's the purpose of it. Using a `reinterpret_cast` can't guarantee anything. – πάντα ῥεῖ Oct 23 '15 at 22:36

1 Answer


My question is: is this use of reinterpret_cast well-defined, or does it invoke undefined behavior?

reinterpret_cast is the wrong approach here: you're simply violating strict aliasing. It is somewhat perplexing that reinterpret_cast and union diverge here, but the wording is very clear about this scenario.

You might be better off simply defining a union like this:

union elem_t {
    Element e{};
    KeyValuePair p;
    /* special member functions defined if necessary */
};

… and using that as your vector element type. Note that cv-qualification is ignored when determining layout-compatibility - [basic.types]/11:

Two types cv1 T1 and cv2 T2 are layout-compatible types if T1 and T2 are the same type, […]

Hence Element and KeyValuePair do indeed share a common initial sequence, and accessing the corresponding members of p, provided e is alive, is well-defined.
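To make the union suggestion concrete, here is a minimal sketch, assuming Key and Value are both int (the question leaves them unspecified). The constructor and the read_*_via_pair helpers are hypothetical names added for illustration. The static_asserts only check the preconditions (standard layout, matching offsets); they do not by themselves prove anything about aliasing. Note that the union rule talks about inspecting the common initial sequence, so the sketch only reads through p.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumption: the question does not say what Key and Value are.
using Key = int;
using Value = int;

struct Element {
    Key key;
    Value value;
    int flags;
};

struct KeyValuePair {
    const Key key;
    Value value;
};

// Preconditions for the common-initial-sequence rule.
static_assert(std::is_standard_layout<Element>::value,
              "Element must be standard-layout");
static_assert(std::is_standard_layout<KeyValuePair>::value,
              "KeyValuePair must be standard-layout");
static_assert(offsetof(Element, value) == offsetof(KeyValuePair, value),
              "value must sit at the same offset in both structs");

union elem_t {
    Element e;
    KeyValuePair p;

    // Hypothetical constructor: always makes e the active member.
    elem_t(Key k, Value v) : e{k, v, 0} {}
};

// Read the common initial sequence through p while e is the active member.
inline Key read_key_via_pair(const elem_t& u) { return u.p.key; }
inline Value read_value_via_pair(const elem_t& u) { return u.p.value; }

// Small usage example.
inline void union_example() {
    elem_t u(7, 42);
    assert(read_key_via_pair(u) == 7);
    assert(read_value_via_pair(u) == 42);
    assert(u.e.flags == 0);
}
```

Because KeyValuePair has a const member, it is not assignable, so which special member functions elem_t needs depends on which vector operations the collection uses.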


Another approach: Define

struct KeyValuePair {
    Key key;
    mutable Value value;
};

struct Element : KeyValuePair {
    int flags;
};

Now provide an iterator that simply wraps a const_iterator from the vector and upcasts the references/pointers to be exposed. key won't be modifiable, but value will be.
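A rough sketch of how that could look, again assuming Key and Value are int. The Collection class, its insert function, and the private member names are hypothetical and only serve to make the example self-contained. The iterator wraps the vector's const_iterator and hands out const KeyValuePair references, so key is read-only while the mutable value can still be assigned.

```cpp
#include <cassert>
#include <vector>

// Assumption: the question does not say what Key and Value are.
using Key = int;
using Value = int;

struct KeyValuePair {
    Key key;
    mutable Value value;  // stays writable through a const reference
};

struct Element : KeyValuePair {
    int flags;
};

// Hypothetical container wrapper, for illustration only.
class Collection {
public:
    class iterator {
    public:
        explicit iterator(std::vector<Element>::const_iterator it) : it_(it) {}

        // Implicit derived-to-base conversion: const Element -> const KeyValuePair.
        const KeyValuePair& operator*() const { return *it_; }
        const KeyValuePair* operator->() const { return &*it_; }

        iterator& operator++() { ++it_; return *this; }
        bool operator==(const iterator& other) const { return it_ == other.it_; }
        bool operator!=(const iterator& other) const { return it_ != other.it_; }

    private:
        std::vector<Element>::const_iterator it_;
    };

    void insert(Key k, Value v) {
        Element e{};  // value-initialize, then fill in the fields
        e.key = k;
        e.value = v;
        e.flags = 0;
        elements_.push_back(e);
    }

    iterator begin() const { return iterator(elements_.begin()); }
    iterator end() const { return iterator(elements_.end()); }

private:
    std::vector<Element> elements_;
};

// Small usage example.
inline void iterator_example() {
    Collection c;
    c.insert(1, 10);
    auto it = c.begin();
    it->value = 11;       // allowed: value is mutable
    // it->key = 5;       // would not compile: key is const through the iterator
    assert(it->key == 1);
    assert((*it).value == 11);
}
```

No cast is involved here, so the aliasing question does not arise. The trade-off is that value is writable through every const KeyValuePair, not just through this iterator.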

Columbo