3

I have a C library that has types like this:

typedef struct {
  // ...
} mytype;

mytype *mytype_new() {
  mytype *t = malloc(sizeof(*t));
  // [Initialize t]
  return t;
}

void mytype_dosomething(mytype *t, int arg);

I want to provide C++ wrappers that offer a nicer syntax. However, I want to avoid the complication of having a separately-allocated wrapper object. I have a relatively complicated graph of objects whose memory management is already more complicated than I would like (objects are refcounted in such a way that all reachable objects are kept alive). Also, the C library will be calling back into C++ with pointers to this object, and constructing a new wrapper object for each C->C++ callback (since C doesn't know about the wrappers) is a cost that is unacceptable to me.

My general scheme is to do:

class MyType : public mytype {
 public:
   static MyType* New() { return (MyType*)mytype_new(); }
   void DoSomething(int arg) { mytype_dosomething(this, arg); }
};

This will give C++ programmers nicer syntax:

// C Usage:
mytype *t = mytype_new();
mytype_dosomething(t, arg);

// C++ Usage:
MyType *t = MyType::New();
t->DoSomething(arg);

The fib is that I'm downcasting a mytype* (which was allocated with malloc()) to a MyType*, which is a lie. But if MyType has no members and no virtual functions, it seems like I should be able to depend on sizeof(mytype) == sizeof(MyType), and besides MyType has no actual data to which the compiler could generate any kind of reference.

So even though this probably violates the C++ standard, I'm tempted to think that I can get away with this, even across a wide array of compilers and platforms.
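The layout assumptions described above can at least be checked at compile time. The following is a minimal sketch, assuming C++11 or later; the `value` field and the body of `mytype_dosomething` are hypothetical stand-ins, since the real struct contents are elided in the question. These assertions only guard the "no extra storage, no vtable" assumption. They do not by themselves make the downcast of a `malloc()`-allocated `mytype` well-defined.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the C library's struct; the real fields are
// elided in the question.
typedef struct {
  int value;
} mytype;

// Hypothetical stand-in for the C function being wrapped.
inline void mytype_dosomething(mytype *t, int arg) { t->value += arg; }

// The wrapper from the question: no data members, no virtual functions.
class MyType : public mytype {
 public:
  void DoSomething(int arg) { mytype_dosomething(this, arg); }
};

// Fail the build if someone later adds a data member or a virtual function.
static_assert(sizeof(MyType) == sizeof(mytype),
              "wrapper must not add storage");
static_assert(alignof(MyType) == alignof(mytype),
              "wrapper must not change alignment");
static_assert(std::is_standard_layout<MyType>::value,
              "wrapper must stay standard-layout");
static_assert(!std::is_polymorphic<MyType>::value,
              "wrapper must not have virtual functions");
```

If any of these assertions fires, the scheme's central assumption no longer holds and the cast should not be relied on at all.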

My questions are:

  1. Is it possible that, by some streak of luck, this does not actually violate the C++ standard?
  2. Can anyone think of any kind of real-world, practical problem I could run into by using a scheme like this?

EDIT: @James McNellis asks a good question of why I can't define MyType as:

class MyType {
 public:
  MyType() { mytype_init(this); }
 private:
  mytype t;
};

The reason is that I have C callbacks that will call back into C++ with a mytype*, and I want to be able to convert this directly into a MyType* without having to copy.

R. Martinho Fernandes
Josh Haberman
  • I don't have any sort of proof besides experience-motivated gut feeling but I recall DirectX math library prior to version 11 (D3DX) did something like that with its vector and matrix classes. `D3DVECTOR3`, for example, was a struct to be used in C functions, and `D3DXVECTOR3` (notice the `X`) was a C++ class inheriting from it to provide operator overloading. All these vectors were then passed to GPU which obviously cares a lot about how big the structures are and how they are aligned in memory - and everything worked without problems. – Xion Aug 24 '11 at 21:46
  • Thanks @Xion, prior art here is much appreciated! – Josh Haberman Aug 24 '11 at 22:04
  • I edited the question to include the requirement you didn't mention originally. I think @In silico's answer is perfectly fine for a wrapper that is "basically syntactic sugar". Since that wasn't really what you wanted, you shouldn't have said so. – R. Martinho Fernandes Aug 24 '11 at 22:07
  • "Syntactic sugar" is a nicer syntax that is completely equivalent to a long-form construct. It's not syntactic sugar if it adds memory management complexity. – Josh Haberman Aug 24 '11 at 22:38
  • Suppose each MyType is a stack-allocated object that wraps a mytype pointer. In that case, the cost of creating/passing/assigning/returning/copying it should be identical to the cost of creating a 4-byte variable (or 8-byte if you're running on a 64-bit machine) and assigning a value to it. Can't your application tolerate such a tiny performance penalty? – Itay Maman Aug 24 '11 at 22:55
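
The pointer-sized wrapper suggested in the last comment might look roughly like the sketch below. The name `MyTypeRef`, the `value` field, the body of `mytype_dosomething`, and the `on_event` callback are all hypothetical, made up for illustration. The wrapper is a value type holding only the raw pointer, so a callback can build one on the stack without any allocation.

```cpp
#include <cassert>

// Hypothetical stand-in for the C library's struct and function.
typedef struct {
  int value;
} mytype;

inline void mytype_dosomething(mytype *t, int arg) { t->value += arg; }

// Non-owning handle: it stores the raw pointer and nothing else, so copying
// it costs the same as copying a mytype*.
class MyTypeRef {
 public:
  explicit MyTypeRef(mytype *t) : t_(t) {}
  void DoSomething(int arg) const { mytype_dosomething(t_, arg); }
  mytype *get() const { return t_; }

 private:
  mytype *t_;
};

static_assert(sizeof(MyTypeRef) == sizeof(mytype *),
              "handle should be exactly pointer-sized");

// Hypothetical callback that the C library would invoke with a raw pointer.
extern "C" void on_event(mytype *t) {
  // Wrap on the stack; no heap allocation, no copy of the underlying object.
  MyTypeRef(t).DoSomething(1);
}
```

The trade-off is that users write `MyTypeRef` values rather than `MyType*` pointers, and the handle does not participate in the C library's reference counting unless that is added explicitly.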

4 Answers

5

You're downcasting a mytype* to a MyType*, which is legal C++. But here it's problematic, since the mytype* pointer doesn't actually point to a MyType; it points to a mytype. Thus, if you downcast it to a MyType* and attempt to access MyType's members, it'll almost certainly not work. Even if there are no data members or virtual functions now, you might add some in the future, and it's still a huge code smell.

Even if it doesn't violate the C++ standard (which I think it does), I would still be a bit suspicious about the code. Typically if you're wrapping a C library the "modern C++ way" is through the RAII idiom:

class MyType
{
public:
    // Constructor: acquire the C object.
    MyType() : myType(::mytype_new()) {}
    // Destructor: release it (or whatever cleanup function the C library provides).
    ~MyType() { ::my_type_delete(myType); }

    mytype* GetRawMyType() { return myType; }
    const mytype* GetConstRawMyType() const { return myType; }

    void DoSomething(int arg) { ::mytype_dosomething(myType, arg); }

private:
    // MyType is not copyable.
    MyType(const MyType&);
    MyType& operator=(const MyType&);

    mytype* myType;
};

// Usage example:
{
    MyType t; // constructor called here
    t.DoSomething(123);   
} // destructor called when scope ends
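
As a variation on the same RAII idea (not the code from this answer), a C++11 `std::unique_ptr` with a custom deleter gets the non-copyable behaviour and the cleanup call without hand-written copy suppression. This is only a sketch: `mytype_free`, the `value` field, and the function bodies are hypothetical stand-ins for whatever the real C library provides.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical stand-ins for the C library.
typedef struct {
  int value;
} mytype;

inline mytype *mytype_new() {
  mytype *t = static_cast<mytype *>(std::malloc(sizeof(mytype)));
  if (t) t->value = 0;
  return t;
}
inline void mytype_free(mytype *t) { std::free(t); }  // assumed cleanup call
inline void mytype_dosomething(mytype *t, int arg) { t->value += arg; }

// Deleter that routes destruction through the C library's cleanup function.
struct MyTypeDeleter {
  void operator()(mytype *t) const { mytype_free(t); }
};

class MyType {
 public:
  MyType() : t_(mytype_new()) {}
  void DoSomething(int arg) { mytype_dosomething(t_.get(), arg); }
  mytype *raw() { return t_.get(); }

 private:
  // unique_ptr makes MyType move-only and calls the deleter on destruction.
  std::unique_ptr<mytype, MyTypeDeleter> t_;
};
```

This sketch does not check whether `mytype_new()` returned null; real code would need to decide how to report an allocation failure.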
In silico
  • Which of the two questions does this answer? – Rob Kennedy Aug 24 '11 at 21:44
  • None. It's useful, though. +1. In my view, it makes both questions pretty much irrelevant, because this was motivated by the desire to have some "syntactic sugar". – R. Martinho Fernandes Aug 24 '11 at 21:45
  • I want to avoid the complication of having a separately-allocated wrapper object. I have a relatively complicated graph of objects whose memory-management is already more complicated than I would like (objects are refcounted in such a way that all reachable objects are kept alive). Also the C library will be calling back into C++ with pointers to this object: your solution requires a new wrapper object to be constructed for each C->C++ callback (since C doesn't know about the C++ wrappers) -- this cost is unacceptable to me. – Josh Haberman Aug 24 '11 at 21:50
  • Why does the `mytype` object need to be dynamically allocated? Why can it not be declared as a member variable of `MyType`? – James McNellis Aug 24 '11 at 21:54
  • I can control this class, and I can guarantee that it never has data members or virtual functions. I would have no reason to add data members or virtual functions because the only point of this class is to offer a wrapper around the C functionality. – Josh Haberman Aug 24 '11 at 21:55
  • Because the C library allocates `mytype` itself with `malloc`, @James. – Rob Kennedy Aug 24 '11 at 21:55
  • @James McNellis: Because the C library interface exposes a `mytype_new()`. The fact that it uses `malloc()` is an implementation detail. – In silico Aug 24 '11 at 21:57
  • @Josh Haberman: What is the C library callback interface like? Does `mytype` already implement reference counting semantics, or is that something provided by a C++ wrapper? – In silico Aug 24 '11 at 21:58
  • @In silico: I'm not really looking for design input here, I'm mainly trying to get answers to the two questions that I asked. If you find it distasteful you're free to avoid using my library. – Josh Haberman Aug 24 '11 at 22:01
  • @Josh Haberman: Whether or not I find the library distasteful is completely irrelevant. I ask so I can give you a better answer that works with what you have, not so I can critique your design. – In silico Aug 24 '11 at 22:03
  • @In Silico: Interestingly, the cast he wants is actually still well-defined. It's defined in both C and C++ to convert a pointer to a struct to a pointer to the first member by a simple type cast on the pointer. Therefore, logically, it's still valid to access that as long as you don't access any of the other members- even if the struct isn't of that type, really. – Puppy Aug 24 '11 at 22:13
  • The C object is already involved in a reference-counting scheme that involves a graph of objects (a refcount on mytype can keep the entire object graph alive). The goal of the C++ wrapper is to be basically invisible: it gives you nicer syntax but is otherwise functionally identical to using the C interface directly. It allocates no extra objects and imposes no extra overhead. – Josh Haberman Aug 24 '11 at 22:16
  • @DeadMG: if you can flesh out your answer (and especially if you can provide C++ standard references) I'd be happy to accept your answer! – Josh Haberman Aug 24 '11 at 22:17
  • @Josh: It's In silico's answer. All I'm saying is that it doesn't actually make it impossible to do what you want in a well-defined way at all and is actually perfectly viable, despite the requirement you didn't mention. – Puppy Aug 24 '11 at 22:24
  • @Josh Haberman: In this *specific* case then, what you have is fine. I wouldn't advocate it in the general case, but as long as you know exactly what you're doing it may be a reasonable approach. – In silico Aug 24 '11 at 22:39
  • Please note that casting the pointer to the first member is only valid for POD types in C++03, but this type is not a C++03 POD. It is valid for standard-layout types in C++11, and this type is a standard-layout type. – R. Martinho Fernandes Aug 24 '11 at 22:45
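
The first-member rule mentioned in the comments above can be sketched as follows, assuming C++11 or later. The `value` field, the body of `mytype_dosomething`, and the `FromRaw`/`raw` helper names are hypothetical. Here the wrapper holds the `mytype` as its only data member, is standard-layout, and the pointer to that member is converted back to a pointer to the enclosing object.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the C library's struct and function.
typedef struct {
  int value;
} mytype;

inline void mytype_dosomething(mytype *t, int arg) { t->value += arg; }

class MyType {
 public:
  void DoSomething(int arg) { mytype_dosomething(&t_, arg); }

  // Pointer handed to the C library.
  mytype *raw() { return &t_; }

  // Recover the wrapper from a pointer to its first (and only) member.
  // Only meaningful when `t` really is the t_ member of a MyType object.
  static MyType *FromRaw(mytype *t) { return reinterpret_cast<MyType *>(t); }

 private:
  mytype t_;
};

// The first-member conversion is only specified for standard-layout classes.
static_assert(std::is_standard_layout<MyType>::value,
              "MyType must be standard-layout");
static_assert(sizeof(MyType) == sizeof(mytype),
              "MyType must not add storage");
```

Note the precondition in `FromRaw`: the round trip is covered by the rule when the `mytype` actually lives inside a `MyType`. A `mytype` allocated by the C library with `malloc()` is not such an object, which is the situation the question is asking about.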
3

I think it would be much safer and more elegant to give MyType a mytype* data member and initialize it in MyType's constructor, rather than having a New() method (which, by the way, has to be static if you do want to keep it).

Daniel Hershcovich
3

Is it possible that, by some streak of luck, this does not actually violate the C++ standard?

I'm not advocating this style, but as MyType and mytype are both PODs, I believe the cast does not violate the Standard. I believe MyType and mytype are layout-compatible (2003 version, Section 9.2, clause 14: "Two POD-struct ... types are layout-compatible if they have the same number of nonstatic data members, and corresponding nonstatic data members (in order) have layout-compatible types (3.9)."), and as such can be cast around without trouble.

EDIT: I had to test things, and it turns out I'm wrong. This is not Standard, as the base class makes MyType non-POD. The following doesn't compile:

#include <cstdio>

namespace {
    extern "C" struct Foo {
        int i;
    };
    extern "C" int do_foo(Foo* f)
    {
        return 5 + f->i;
    }

    struct Bar : Foo {
        int foo_it_up()
        {
            return do_foo(this);
        }
    };
}

int main()
{
    Bar f = { 5 };
    std::printf("%d\n", f.foo_it_up());
}

Visual C++ gives the error message that "Types with a base are not aggregate." Since "Types with a base are not aggregate," then the passage I quoted simply doesn't apply.

I believe that you're still safe in that most compilers will make MyType layout-compatible with mytype. The cast will "work," but it's not Standard.

Max Lybbert
  • Thanks Max! It's alright if you don't personally like it. :) – Josh Haberman Aug 24 '11 at 22:22
  • POD-structs are aggregates (§9/4), and aggregates have no base classes (§8.5.1/1), but `MyType` has a base class (`mytype`), so `MyType` is not a POD. – Rob Kennedy Aug 24 '11 at 22:25
  • There is a reason to `malloc` the memory: these objects are refcounted. – Josh Haberman Aug 24 '11 at 22:28
  • We have a [FAQ on aggregates and PODs](http://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special/4178176#4178176). This answer is wrong in stating that these are both PODs in C++03. Note that the rules are relaxed for C++11, and they are indeed both PODs in C++11. However, implementations of C++11 are still incomplete. – R. Martinho Fernandes Aug 24 '11 at 22:30
  • @Rob Kennedy: in that case would it be safer to declare `mytype` as a *member* of `MyType` instead of deriving from it? I would prevent direct construction of the object by making the constructor private. – Josh Haberman Aug 24 '11 at 22:31
  • I don't think it would be any safer. Making it a member variable gets around the "no base classes" requirement, but in exchange, you'll violate the "no user-defined constructors" requirement. Your question ultimately asks not whether your code is safe (i.e., blessed by the standard), but whether it's *safe enough*. And I think the answer is *yes*. You were hoping that your code might actually be standard, but it seems you're not quite that lucky today. – Rob Kennedy Aug 24 '11 at 22:41
  • I've edited the answer, because, as mentioned, it's not Standard under C++ 2003. – Max Lybbert Aug 24 '11 at 23:07
-1

It does violate the C++ standard; however, it should work on most compilers (all the ones that I know of).
You're relying on a specific implementation detail here (that the compiler doesn't care what the actual object is, only what type you told it the object has), but I don't think any compiler behaves differently. Be sure to check it on every compiler you use, or it might catch you unprepared.

Daniel