1

Given a base class B, such as:

    class B
    {
    public:
        virtual ~B() = default;
    public:
        virtual int f() const = 0;
    };

and a number of derived classes Ai : public B (i = 1, ..., N) that implement f(), I receive from an external program a void* that definitely points to an object of one of the derived classes Ai, and I need to call f() on it.

It's possible to create an entry point for each possible derived type, and it will work fine:

// for each derived class Ai
void executeF(void* aPtr, int* result)
{
    auto aObjPtr = static_cast<Ai*>(aPtr);
    *result = aObjPtr->f();
}

However, it should be possible to achieve the same result with a single function, such as:

void executeF(void* aPtr, int* result)
{
    auto bObjPtr = static_cast<B*>(aPtr); // works
    *result = bObjPtr->f(); // Access violation
}

The cast above succeeds, but the call to f() fails with "Access violation" in MSVC 2013.

Is there something wrong with the above function? And if so, is there a way to achieve the task with a single function?

I've read some material claiming that a void* may only be cast back to the exact class it was created from (as also suggested in the comments below). However, this code compiles and executes fine: http://ideone.com/e0Lr6v

Some more context about how everything is being called:

I can't provide the entire code here because it's too long, but in summary: the function executeF, the constructors for the Ai objects, and everything else in the library that defines and operates on the A and B objects are provided as exported functions that operate on void* only. FYI, this library is compiled and built with MSVC 2013.

The other side (a wrapper for the R language) is compiled and built with g++. It loads the above library dynamically, extracts the needed functions, and calls them. The only thing available on this side is the void* holding an Ai object: it just sends requests to create objects, call their methods, and free them.

For example (schematically), create an object of type A1:

// "objects" library
void createA1(void** aObj)
{
    *aObj = new A1();
}

// caller library
auto const createA1Func = (void(__CC *)(void**)) GetProcAddress(getDLL(), "createA1");
void* a1Obj = NULL;
createA1Func(&a1Obj);
// ... return a1Obj to the external environment to keep it around

Then, having a1Obj around, do some job with it:

// caller library
auto const executeFfunc = (void(__CC *)(void*, int*)) GetProcAddress(getDLL(), "executeF");
int res(0);
executeFfunc(a1Obj, &res);

So if I write a separate function for each type Ai on both sides, everything works OK. But it would be significantly less boilerplate code if I could use the base class here somehow.

Oleg Shirokikh
  • You could try casting before calling `executeF`, so that `executeF` always gets a `B*` instead of an `Ai*` – user253751 Feb 22 '17 at 02:10
  • Why is this tagged C? – user253751 Feb 22 '17 at 02:11
  • @immibis external program that passes the `void*` doesn't know about the types `B`, `A`, etc – Oleg Shirokikh Feb 22 '17 at 02:11
  • When casting from `void *`, it is only guaranteed to work if you cast back to the same type that the cast was from. Not a base or derived class of that. – M.M Feb 22 '17 at 02:13
  • @immibis i tagged C, because `executeF` is `extern C` function and called dynamically from the library by the external program. I omitted this info in the question, but perhaps it might be relevant – Oleg Shirokikh Feb 22 '17 at 02:13
  • Please don't tag both C and C++; it's very rare that both tags are useful. In this case, you're using a C++ compiler, and a C compiler would emit error messages that aren't a part of your question... thus your code is C++, and that's the tag you should use. – autistic Feb 22 '17 at 02:14
  • @M.M Please check this working code out: http://ideone.com/e0Lr6v – Oleg Shirokikh Feb 22 '17 at 02:14
  • `extern C` isn't a part of the C language, and doesn't cause the internals to become valid in C, either... Which book are you reading? – autistic Feb 22 '17 at 02:15
  • How are you *calling* `executeF`? Obviously the `static_cast` must only be used to reverse a prior implicit conversion... – Kerrek SB Feb 22 '17 at 02:16
  • Maybe this would provide a clue: `In contrast to dynamic_cast, no run-time check is made on the static_cast conversion`. Also, what type information do you get if you get the `typeid` of aPtr? – bruceg Feb 22 '17 at 02:16
  • @KerrekSB I load the DLL in the runtime, extract and call the function. The objects of derived `Ai` type are being created beforehand. The goal is to simply call `f()` method. I wonder if this can be achieved with a single function operating on the base class type – Oleg Shirokikh Feb 22 '17 at 02:19
  • @bruceg the cast itself succeeds and I can see the correct object when debugging. The call to the `f()` method fails with the Access Violation. – Oleg Shirokikh Feb 22 '17 at 02:21
  • Please post some code. I repeat the question: How are you calling `executeF`? – Kerrek SB Feb 22 '17 at 02:29
  • If you pass an `A1*` pointer as a `void*` to the DLL, you cannot cast the `void*` in `executeF()` to `B*` directly, you must cast it to `A1*`. If you pass multiple `A1*`, `A2*`, `A3*` etc pointers to the DLL, cast them all to `B*` first before passing them as `void*` into the DLL, then you can cast the `void*` in `executeF()` directly to `B*`. – Remy Lebeau Feb 22 '17 at 02:36
  • @RemyLebeau The side that passes `void*` to the DLL is not aware of any of `B`, `A` types – Oleg Shirokikh Feb 22 '17 at 03:18
  • @OlegShirokikh how **exactly** do the object pointers get into the DLL? How **exactly** does the DLL know to call `executeF()` with the object pointers? You are leaving out vital information. *Some* piece of code is creating the objects and converting them to `void*`. *That* code needs to cast them to `B*` before then casting `B*` to `void*`. – Remy Lebeau Feb 22 '17 at 07:42
  • @RemyLebeau i've added some context about how things are being called - hopefully, it clarifies the matter. thanks – Oleg Shirokikh Feb 22 '17 at 17:48
  • Why on god's green earth are you casting anything to `void*`? You have a function that needs a `B*`, pass a `B*` to it. – n. m. could be an AI Feb 22 '17 at 17:55
  • @n.m. because then the caller side will have to include all headers from and link to the objects library. This library is MSVC-based, so it's virtually impossible to compile it fine with g++ – Oleg Shirokikh Feb 22 '17 at 17:58
  • The question, for which I don't have an answer yet is why can't I pass `A*` and cast it to `B*` in this case - and why it works in the provided Ideone code – Oleg Shirokikh Feb 22 '17 at 18:00
  • Ideone code is not built with a mix of g++ and MSVC. Such a mix is unlikely to work unless you know extremely well what you are doing. Please indicate in your question which class and which function is built with which compiler. – n. m. could be an AI Feb 22 '17 at 18:17
  • @OlegShirokikh: When `A` derives from `B`, a pointer to the `A` portion of an object is (usually) not pointing at the same memory address as a pointer to the `B` portion of the object, especially if `B` has data fields in it. Accessing `A` methods via a `B*` pointer involves pointer fixups and vmt lookups and such that have to be taken into account correctly. That is why you can't simply cast `A*` -> `void*` -> `B*` and expect everything to work correctly. You must cast either as `A*` -> `void*` -> `A*` or as `A*` -> `B*` -> `void*` -> `B*` to ensure things stay lined up correctly. – Remy Lebeau Feb 22 '17 at 18:18
  • @OlegShirokikh You are trying to **reinterpret** an `A*` pointer as-is as if it were a `B*` pointer, and that simply does not work. – Remy Lebeau Feb 22 '17 at 18:20
  • Regardless of what was said, ideone code invokes undefined behaviour, which means you are lucky it runs as you expect and doesn't, for example, eat your dog. – n. m. could be an AI Feb 22 '17 at 18:23
  • @RemyLebeau This depends on the compiler and who knows what else. Sometimes there is a fixup, sometimes there is none. With g++, you mostly get zero fixup until you start fiddling with multiple base classes. – n. m. could be an AI Feb 22 '17 at 18:26
  • @n.m. exactly, it is compiler-dependant behavior, and can't be relied on. If you want to operate on a `B*`, you have to start with a `B*` to begin with, and let the compiler work out how it wants to access `A` via `B`. – Remy Lebeau Feb 22 '17 at 18:29
  • OK, thx guys. I'm getting an understanding that although it _might_ work under certain circumstances and a portion of luck even, it's still prone to an undefined behavior. – Oleg Shirokikh Feb 22 '17 at 18:33
  • Cast from Ai to B is potentially non-trivial and involves different object code for each Ai. You need to let the compiler generate all this different code *somewhere*. This means either a different piece of source code for each Ai, or a template instantiated for each Ai. – n. m. could be an AI Feb 22 '17 at 18:50
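
The "template instantiated for each Ai" idea from the last comment could look roughly like the sketch below. This is only an illustration under assumptions: `executeFImpl`, `executeF_A1` and `executeF_A2` are hypothetical names that do not appear in the question, and the classes are reduced stand-ins. It does not reduce the number of exported symbols, only the amount of hand-written code per type.

```cpp
#include <cassert>

// Same shape as the B in the question.
class B
{
public:
    virtual ~B() = default;
    virtual int f() const = 0;
};

// Reduced stand-ins for two of the derived classes.
class A1 : public B
{
public:
    int f() const override { return 1; }
};

class A2 : public B
{
public:
    int f() const override { return 2; }
};

// Hypothetical helper: one instantiation per concrete type, so the compiler
// generates the correct void* -> Ai* -> B* conversion for each Ai.
template <typename Ai>
void executeFImpl(void* aPtr, int* result)
{
    assert(aPtr != nullptr && result != nullptr);
    // Cast back to the exact type the void* was created from,
    // then let the implicit Ai* -> B* conversion do any pointer adjustment.
    const B* bPtr = static_cast<Ai*>(aPtr);
    *result = bPtr->f();
}

// Hypothetical exported entry points: each one is a single line.
extern "C" void executeF_A1(void* aPtr, int* result) { executeFImpl<A1>(aPtr, result); }
extern "C" void executeF_A2(void* aPtr, int* result) { executeFImpl<A2>(aPtr, result); }
```

The caller still has to pick the entry point that matches the object it holds, exactly as with the hand-written per-type functions.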

5 Answers

2

When Ai derives from B, a pointer to the Ai portion of an object is (usually) not pointing at the same memory address as a pointer to the B portion of the same object (especially if B has data fields in it). Accessing Ai via a B* pointer usually involves pointer fixups, VMT lookups, etc., which have to be taken into account by the particular compiler being used. That is why you can't simply cast an Ai* pointer to a void* and then to a B* and expect everything to work correctly. The resulting B* pointer is not a valid B* pointer; it is actually an Ai* pointer that has been reinterpreted as a B*, and that simply is not legal.

To ensure things stay lined up correctly, you must either:

  • cast Ai* to void*, and then void* to Ai*, which is what you are trying to avoid.

  • cast Ai* to B* first, and then B* to void*, and then void* to B* (and then optionally B* to Ai* via dynamic_cast if you need to access non-virtual members of Ai).

So, in order for this to work the way you are wanting, do the following when constructing your objects:

void createA1(void** aObj)
{
    *aObj = static_cast<B*>(new A1());
}

void createA2(void** aObj)
{
    *aObj = static_cast<B*>(new A2());
}

And so on. This ensures that all pointers passed to executeF() are proper B* pointers, and only then can executeF() safely type-cast its received void* pointer to B* and use polymorphism to access whichever derived class it is actually pointing at:

void executeF(void* aPtr, int* result)
{
    B* bObjPtr = static_cast<B*>(aPtr);
    *result = bObjPtr->f(); // works
}
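
For a self-contained check of this round trip, a minimal sketch follows. It assumes a hypothetical extra base class `Other` (not in the question), placed first so that the B subobject of A1 is likely to sit at a non-zero offset, and a hypothetical `destroyObj` that deletes through `B*`.

```cpp
#include <cassert>

// Hypothetical extra base, listed first so that B is not the first base of A1.
class Other
{
public:
    virtual ~Other() = default;
    int padding = 0;
};

class B
{
public:
    virtual ~B() = default;
    virtual int f() const = 0;
};

class A1 : public Other, public B
{
public:
    int f() const override { return 1; }
};

class A2 : public B
{
public:
    int f() const override { return 2; }
};

// Convert to B* *before* erasing the type, so the void* always holds a B*.
void createA1(void** aObj) { *aObj = static_cast<B*>(new A1()); }
void createA2(void** aObj) { *aObj = static_cast<B*>(new A2()); }

void executeF(void* aPtr, int* result)
{
    assert(aPtr != nullptr && result != nullptr);
    // Valid: the void* was produced from a B*, so it is cast back to B*.
    *result = static_cast<B*>(aPtr)->f();
}

// Hypothetical cleanup: B has a virtual destructor, so deleting via B* is fine.
void destroyObj(void* aPtr) { delete static_cast<B*>(aPtr); }
```

Because every pointer handed to the caller is a `B*` in disguise, the same `executeF` and `destroyObj` serve all derived types, regardless of where the B subobject lives inside each one.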

Update: Alternatively, especially when dealing with multiple derived classes that each have multiple base classes that may or may not all be shared, another option would be to simply wrap the Ai objects in a struct that has an extra field to indicate the object type. Then your create...() functions can return void* pointers to that struct instead of to the Ai objects directly, and the execute...() functions can first cast the void* to that struct, look at its type field, and cast the object pointers accordingly:

enum AType
{
    a1, a2 /*, ... */
};

class B
{
public:
    virtual ~B() = default;
    virtual int f() = 0;
};

class Bx
{
public:
    virtual ~Bx() = default;
    virtual int x() = 0;
};

class By
{
public:
    virtual ~By() = default;
    virtual int y() = 0;
};

// ...

class A1 : public B, public Bx
{
public:
    int f() override { return 1; }
    int x() override { return 1; }
};

class A2 : public B, public By
{
public:
    int f() override { return 2; }
    int y() override { return 2; }
};

// ...

struct objDesc
{
    AType type;
    void *obj;
};

void createA1(void** aObj)
{
    objDesc *desc = new objDesc;
    desc->type = a1;
    desc->obj = new A1();
    *aObj = desc;
}

void createA2(void** aObj)
{
    objDesc *desc = new objDesc;
    desc->type = a2;
    desc->obj = new A2();
    *aObj = desc;
}

// ...

void destroyObj(void* aObj)
{
    objDesc *desc = static_cast<objDesc*>(aObj);
    switch (desc->type)
    {
        case a1:
            delete static_cast<A1*>(desc->obj);
            break;

        case a2:
            delete static_cast<A2*>(desc->obj);
            break;

        //..
    }

    delete desc;
}

//...

void executeF(void* aPtr, int* result)
{
    objDesc *desc = static_cast<objDesc*>(aPtr);
    B* bObjPtr = nullptr;

    switch (desc->type)
    {
        case a1:
            bObjPtr = static_cast<A1*>(desc->obj);
            break;

        case a2:
            bObjPtr = static_cast<A2*>(desc->obj);
            break;

        // other classes that implement B ...
    }

    if (bObjPtr)
        *result = bObjPtr->f();
}

void executeX(void* aPtr, int* result)
{
    objDesc *desc = static_cast<objDesc*>(aPtr);
    Bx* bObjPtr = nullptr;

    switch (desc->type)
    {
        case a1:
            bObjPtr = static_cast<A1*>(desc->obj);
            break;

        // other classes that implement Bx ...
    }

    if (bObjPtr)
        *result = bObjPtr->x();
}

void executeY(void* aPtr, int* result)
{
    objDesc *desc = static_cast<objDesc*>(aPtr);
    By* bObjPtr = nullptr;

    switch (desc->type)
    {
        case a2:
            bObjPtr = static_cast<A2*>(desc->obj);
            break;

        // other classes that implement By ...
    }

    if (bObjPtr)
        *result = bObjPtr->y();
}

// ...

It is not ideal or flexible, but it will work within the restrictions you have on the other side.

Otherwise, you can replace the struct with a new base class that all other classes must derive from; then you can use dynamic_cast as needed:

class Base
{
public:
    virtual ~Base() = default;
};

class Bf
{
public:
    virtual ~Bf() = default;
    virtual int f() = 0;
};

class Bx
{
public:
    virtual ~Bx() = default;
    virtual int x() = 0;
};

class By
{
public:
    virtual ~By() = default;
    virtual int y() = 0;
};

class Bz
{
public:
    virtual ~Bz() = default;
    virtual int z() = 0;
};

class A1 : public Base, public Bf, public Bx
{
public:
    int f() override { return 1; }
    int x() override { return 1; }
};

class A2 : public Base, public Bf, public By
{
public:
    int f() override { return 2; }
    int y() override { return 2; }
};

class A3 : public Base, public Bz
{
public:
    int z() override { return 3; }
};

// ...

void createA1(void** aObj)
{
    *aObj = static_cast<Base*>(new A1());
}

void createA2(void** aObj)
{
    *aObj = static_cast<Base*>(new A2());
}

void createA3(void** aObj)
{
    *aObj = static_cast<Base*>(new A3());
}

// ...

void destroyObj(void* aObj)
{
    delete static_cast<Base*>(aObj);
}

//...

void executeF(void* aPtr, int* result)
{
    Base *base = static_cast<Base*>(aPtr);
    Bf* bObjPtr = dynamic_cast<Bf*>(base);
    if (bObjPtr)
        *result = bObjPtr->f();
}

void executeX(void* aPtr, int* result)
{
    Base *base = static_cast<Base*>(aPtr);
    Bx* bObjPtr = dynamic_cast<Bx*>(base);
    if (bObjPtr)
        *result = bObjPtr->x();
}

void executeY(void* aPtr, int* result)
{
    Base *base = static_cast<Base*>(aPtr);
    By* bObjPtr = dynamic_cast<By*>(base);
    if (bObjPtr)
        *result = bObjPtr->y();
}

void executeZ(void* aPtr, int* result)
{
    Base *base = static_cast<Base*>(aPtr);
    Bz* bObjPtr = dynamic_cast<Bz*>(base);
    if (bObjPtr)
        *result = bObjPtr->z();
}

//...
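
A reduced sketch of the `dynamic_cast` variant, to show what happens when an object does not implement the requested interface. It assumes RTTI is enabled, and it changes `executeF` to return `bool` (a hypothetical variation, not part of the code above) so that the caller can tell a failed cross-cast from a successful call.

```cpp
#include <cassert>

class Base
{
public:
    virtual ~Base() = default;
};

class Bf
{
public:
    virtual ~Bf() = default;
    virtual int f() = 0;
};

class Bz
{
public:
    virtual ~Bz() = default;
    virtual int z() = 0;
};

class A1 : public Base, public Bf
{
public:
    int f() override { return 1; }
};

class A3 : public Base, public Bz
{
public:
    int z() override { return 3; }
};

// Every object is handed out as a Base*, so every void* can be cast back to Base*.
void createA1(void** aObj) { *aObj = static_cast<Base*>(new A1()); }
void createA3(void** aObj) { *aObj = static_cast<Base*>(new A3()); }
void destroyObj(void* aObj) { delete static_cast<Base*>(aObj); }

// Hypothetical variation: reports whether the object implements Bf at all.
bool executeF(void* aPtr, int* result)
{
    assert(aPtr != nullptr && result != nullptr);
    Base* base = static_cast<Base*>(aPtr);
    Bf* bfPtr = dynamic_cast<Bf*>(base); // cross-cast; null if there is no Bf part
    if (!bfPtr)
        return false;
    *result = bfPtr->f();
    return true;
}
```

With this shape, calling `executeF` on an A3 object leaves `*result` untouched and returns `false` instead of failing silently.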
Remy Lebeau
  • This would probably be the easiest fix, but why C-style casts? – n. m. could be an AI Feb 22 '17 at 18:28
  • I changed it to C++ casts. – Remy Lebeau Feb 22 '17 at 18:30
  • Thanks. Well, this will work, yes, but it will cut all `Ai` specific functionality. Also there're bunch of base classes other than `B`... Anyways, I gather that I'll have to do it for each `Ai` type – Oleg Shirokikh Feb 22 '17 at 18:30
  • @OlegShirokikh "there're bunch of base classes other than B." It's exactly the case when non-zero fixup is required. It is almost guaranteed not to work. – n. m. could be an AI Feb 22 '17 at 18:37
  • @OlegShirokikh "*it will cut all `Ai` specific functionality*" - no it doesn't. You still have full `Ai` objects in memory, you are just passing around pointers to their `B` portions. Polymorphism does the rest. If you need to access `Ai` functionality that is not virtual, use `dynamic_cast` to cast from `B*` to `Ai*`. "*Also there're bunch of base classes other than B*" - well, then your whole solution goes out the window. If you want to use a single function for multiple classes, they must have a common base class, and you must pass around pointers to that base class. – Remy Lebeau Feb 22 '17 at 18:41
  • @RemyLebeau yes, you're right, it won't cut derived class functionality, my bad... However, I can't simply cast to `B` during construction if there's more than one base class for `A`, right? – Oleg Shirokikh Feb 22 '17 at 18:45
  • @OlegShirokikh: "*I can't simply cast to `B` during construction if there's more than one base class for `A`, right?*" - even if `A` derives from other base classes, as long as it *also* derives from `B` then you most certainly can cast from `A*` to `B*`. The `static_cast` validates that at compile-time to make sure it is legal. So would using a local variable instead: `void createA1(void** aObj) { B* bPtr = new A1(); *aObj = bPtr; }` – Remy Lebeau Feb 22 '17 at 18:47
  • @RemyLebeau I mean the case if I have more functionality like `executeF` but in other base classes. Your solution will allow polymorphic execution of `B`'s virtual methods, but what about all other base classes? E.g. `executeX` from `Bx`, `executeY` from `By`, etc. - with `Bx`, `By`, etc all being base classes for `A` objects – Oleg Shirokikh Feb 22 '17 at 18:50
  • When multiple base classes are involved, you would not be able to pass the *same* `void*` pointer around to different `execute...()` functions. You would be right back in the same reinterpreting-X-as-Y trap. To make this work correctly, you would have to construct an `A` object first, then cast the *same object* to its various base pointers and then pass *those pointers* around to their respective `execute...()` functions as needed. Roughly speaking: `A1 *aObj = new A1(); B *bPtr = aObj; Bx *xPtr = aObj; By *yPtr = aObj; ... executeF(bPtr); executeX(xPtr); executeY(yPtr);` and so on... – Remy Lebeau Feb 22 '17 at 19:10
  • @RemyLebeau To make your last proposal to work, I'd have to cast on the caller library side right? - and it doesn't know the types. If I send `A*` to the library side and cast it to B* (`B *bPtr = aObj;`) and return `void*` of that back to the caller and then pass it back to the library to `executeF` I'll get the same problem? – Oleg Shirokikh Feb 22 '17 at 19:53
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/136406/discussion-between-remy-lebeau-and-oleg-shirokikh). – Remy Lebeau Feb 23 '17 at 00:44
  • @RemyLebeau Thanks very much for all your explanations and examples. I think you provided a comprehensive answer to the question! The rest is my specifics, and since I can't post the whole thing here I'll have to figure it out starting from here. Thanks again - learned a lot – Oleg Shirokikh Feb 23 '17 at 02:03
1

The behavior you observe simply means that conversion from Ai * to B * is not purely conceptual, but actually requires a physical change in pointer value. In typical implementations this usually happens when:

  1. Class B is not polymorphic and contains subobjects of non-zero size, while class Ai is polymorphic. (Not your case)
  2. Class Ai has multiple bases and B is just one of them.

My guess would be that you are dealing with the second case in your code.

In such cases it might "work" if you ensure that the base B is the very first base of Ai (but, again, this is heavily implementation-dependent and, obviously, unreliable).

AnT stands with Russia
0

I came up with the following working solution, which avoids having 2*N functions on both sides, where N is the number of derived A classes. Instead, it needs only 2 functions, one on each side. The object library has a switch with N cases that casts the void* to the appropriate class. Note that the g++ side only needs to know about the enum; it still knows nothing about the types.

Not sure if it's a perfect approach, but it looks pretty concise and safe. Still interested in other solutions/comments.

http://ideone.com/enNl3f

enum AType
{
    a1 = 1, a2
};

class B
{
public:
    virtual ~B() = default;
public:
    virtual int f() const = 0;
};

class A1: public B
{
    virtual int f() const override
    {
        return 1;
    }
};
class A2: public B
{
    virtual int f() const override
    {
        return 2;
    }
};

void executeF(void* aPtr, AType aType, int* result)
{
    B* bPtr = nullptr;
    switch(aType)
    {
        case a1:
            bPtr = static_cast<A1*>(aPtr);
            break;
        case a2:
            bPtr = static_cast<A2*>(aPtr);
            break;
        default:
            break;
    }

    if(bPtr)
        *result = bPtr->f();
}
Remy Lebeau
Oleg Shirokikh
  • While this will work, it is not ideal. At the very least, I would get rid of the `aType` parameter on `executeF()` and wrap the `Ai` objects in a `struct` that has an `AType` field to specify the object type. Then have your `create...()` functions return pointers to that struct instead of pointers to the `Ai` objects themselves. This will allow the `execute...()` functions to look at the `AType` struct field and cast accordingly. I will update my answer with an example. – Remy Lebeau Feb 23 '17 at 00:58
  • I can see how the extra parent structure can replace the `enum`, but this will involve a big rewrite of the existing code: (1) inheriting every A class from this new base struct, and (2) modifying each ctor to upcast the created A object. With the `enum`, the existing code stays untouched. What's the benefit of that over the `enum`? – Oleg Shirokikh Feb 23 '17 at 01:04
  • What I propose is not a new base type that the classes have to directly derive from. It is simply a wrapper that allows the `create...()` functions to pass that extra `enum` to the `execute...()` functions without changing the existing classes at all. See the update to my answer. – Remy Lebeau Feb 23 '17 at 01:07
  • OK, gotcha. So why is this preferable to the `enum` approach? – Oleg Shirokikh Feb 23 '17 at 01:07
  • It is a wrapper that **uses** the enum, not avoids it. The purpose is to let the `void*` pointers point to something that carries the `enum` so the caller does not have to worry about keeping track of what types the `void*` pointers are pointing at so it can pass that info to the functions manually. This lets the `void*` pointers handle that transparently on the caller's behalf. – Remy Lebeau Feb 23 '17 at 01:09
  • I see your point - indeed it simplifies things. I'm thinking about one thing: now `objDesc` has to be visible on both sides... – Oleg Shirokikh Feb 23 '17 at 01:24
  • No, it doesn't. Everything I have shown you is on one side only. The other side only knows about `void*`, just like you asked for. – Remy Lebeau Feb 23 '17 at 01:31
0

Imagine if there are two types, Base1 and Base2. Say Base1 contains only one member, an integer. And say Base2 contains only one member, a float.

One would expect that a Base1* will point to the integer and a Base2* will point to the float.

Now, consider:

class Derived : public Base1, public Base2
{
    ...
};

Now, if we cast a Derived* to a void*, we can get a pointer to the integer in the Base1 or we could get a pointer to the float in the Base2. But we cannot possibly get both.

Thus the expectation that you can convert a Derived* to a void* and then cast it back to a pointer to a base class and get something sensible is asking for the impossible. Converting between a pointer to a derived class and a pointer to one of its base classes must sometimes change the value of that pointer.
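
The point can be checked with a small sketch. The member names `i` and `f` and the helper functions below are assumptions made for illustration; only the Base1/Base2/Derived shape comes from the description above.

```cpp
#include <cassert>

struct Base1 { int i = 0; };      // holds only an integer
struct Base2 { float f = 0.0f; }; // holds only a float
struct Derived : Base1, Base2 {};

// Two distinct non-empty base subobjects cannot share an address, so at
// least one of these conversions has to adjust the pointer value.
bool basesAtDifferentAddresses(Derived* d)
{
    void* asBase1 = static_cast<Base1*>(d);
    void* asBase2 = static_cast<Base2*>(d);
    return asBase1 != asBase2;
}

// Correct type erasure: convert to the base type first, then to void*.
void* toErasedBase2(Derived* d) { return static_cast<Base2*>(d); }

// The void* came from a Base2*, so casting it back to Base2* is valid.
float readBase2(void* erased) { return static_cast<Base2*>(erased)->f; }

// By contrast, static_cast<Base2*>(static_cast<void*>(d)) would skip the
// adjustment and is not valid whenever the two addresses differ.
```

Since a single `void*` can carry only one address, it can stand for the Base1 part or the Base2 part, but not both at once.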

David Schwartz
-2

The function executeF, constructors for objects Ai...

That is most likely the issue, you shall not call virtuals in the constructor. It works for Ai because Ai is not calling the virtual method from the vptr table. B however has no such table yet if it's being constructed. See this other SO answer.

Community
Sergio Basurco