Apologies if this is a stupid question, but I wasn't clear on why COM pointer arguments are typically cast as (void**) instead of (IUnknown**). And then sometimes IUnknown pointers are in fact used, like with IObjectWithSite::SetSite. Can anyone explain this?

GSerg
  • **TLDR**: Legacy. Because COM was also meant to work with C, not just C++. – selbie Dec 22 '19 at 00:28
  • @selbie if that's really your understanding of the reason, you should make it an answer. – Mark Ransom Dec 22 '19 at 03:14
  • The ability to declare polymorphic interface pointers only exists in C++. The COM runtime api (CoCreateInstance et al) is a C api, and therefore requires a C type. No choice but void*. Actually programming OLE in C is drastically impractical, so using IUnknown* in interfaces like IObjectWithSite is practical. – Hans Passant Dec 31 '19 at 10:13
  • @han: Mostly correct, but you are conflating `[in]` and `[out]` parameters. The latter require polymorphic interface pointers, so `void*` is the only option given a C ABI. In the former case, however, the interface *requires* an `IUnknown*`, not some polymorphic (in the language-level sense) interface pointer. This isn't about practicability. It's about correctness. – IInspectable Dec 31 '19 at 19:45

3 Answers

In "get-type" interface methods (like IObjectWithSite::QueryInterface, IObjectWithSite::GetSite, IMoniker::BindToObject, etc.), typing the parameter as IUnknown** wouldn't change anything: you'd have to cast anyway, except when you really do want an IUnknown* reference. And in that case you already have it, because you're using it (the IUnknown* reference is always the same pointer per COM rules).

IObjectWithSite::SetSite is a "set-type" method, so it makes more sense to give you an IUnknown* reference.

It's probably more arguable for some static functions like CoCreateInstance or CoGetObject. I think they could have put IUnknown** there instead of void**, but then there would be two different styles. You also wouldn't be able to use the IID_PPV_ARGS macro, which is practical and is recommended as a coding practice to avoid type cast errors.

I suggest you get a copy of the authoritative "Essential COM" by Don Box, and read up to page 60 (at least :-).

Simon Mourier
  • *"you'd have to cast anyway"* - Except, you aren't allowed to cast in COM. If you need to move from one interface type to another, a call to `QueryInterface` is required. *"I think they could have put `IUnknown**` there instead of `void**`"* - That's wrong. It assumes that interface inheritance were always modeled using a language-level construct that implies a relationship between pointers to these interfaces. COM does not mandate this requirement, and a programming language like C certainly cannot implement this. – IInspectable Dec 25 '19 at 11:13
  • @IInspectable - I meant you'd cast to void** anyway so QI compiles. And for the rest, you think what you want. – Simon Mourier Dec 25 '19 at 15:00
  • @IInspectable - sure, but what's the point? I could say "you'd have to cast in C++ anyway" if you like it more, but the question is about C++ (and COM has really its root in C++) and still, they could have used IUnknown* anyway. In fact, they sometimes did: https://learn.microsoft.com/en-us/windows/win32/api/objidl/nf-objidl-icallfactory-createcall – Simon Mourier Dec 28 '19 at 19:01
  • Being language-agnostic was one of COM's design goals. With that goal, the options for an ABI are limited to a single one in practice: C's. If anything, COM is rooted in C, certainly not in C++. Microsoft's C++ compiler decided to emit COM-compatible v-tables; i.e. Microsoft's C++ implementation followed COM rules, not the other way around. And you are still wrong: An out-parameter that can return *any* interface type, must return the interface pointer through a pointer to the *final* type. The fact that you discovered a buggy signature doesn't change that. Rationale can be found in my answer. – IInspectable Dec 29 '19 at 08:05

I guess you are talking about out-parameters that "return" a value via a pointer, e.g.:

IUnknown *u;
QueryInterface(IID_IUnknown, (void **)&u);

IDispatch *d;
QueryInterface(IID_IDispatch, (void **)&d);

Note: See here if you are unfamiliar with the concept of using pass-by-pointer to "return" values.


The prototype of QueryInterface is:

HRESULT QueryInterface(REFIID iid, void **ptr);

The argument &u would have type IUnknown ** already (so there is no reason to cast it to the type it already is, as you seem to be suggesting). The cast is to change the type to the expected type for the function parameter.

In Standard C++ the code above is actually undefined behaviour (a strict aliasing violation: you cannot pretend that a pointer points to an object of a different type than it really does). The correct usage of this function is:

void *temp;
QueryInterface(IID_IFoo, &temp);
IFoo *foo = static_cast<IFoo *>(temp);

However the Microsoft compiler supports the version with the C-style cast. It makes the same changes to the memory location as if the interface pointer had been declared as void * instead of its real type.


Why does QueryInterface take void ** and not IUnknown **? That would be nice for avoiding the cast when you are specifically requesting IID_IUnknown; however, any other interface would require the C-style cast anyway. It might also cause confusion, since (for other interfaces) the returned pointer is not necessarily a valid IUnknown * value.
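To illustrate why a pointer returned for another interface need not have the same value as the object's IUnknown pointer, here is a minimal sketch. It does not use the Windows SDK: `Iid`, `IUnknownLike`, `IFoo`, `IBar`, `Widget` and `same_object` are hypothetical stand-ins, the return type is a plain `bool` instead of `HRESULT`, and reference counting is left out.

```cpp
#include <cassert>

// Hypothetical stand-ins for the COM types; not the real Windows headers.
enum class Iid { Unknown, Foo, Bar };

struct IUnknownLike {
    virtual bool QueryInterface(Iid iid, void** out) = 0;
};
struct IFoo : IUnknownLike { virtual int Foo() = 0; };
struct IBar : IUnknownLike { virtual int Bar() = 0; };

// One object implementing two interfaces via multiple inheritance.
// Each interface is a separate base subobject with its own address.
struct Widget : IFoo, IBar {
    bool QueryInterface(Iid iid, void** out) override {
        switch (iid) {
        case Iid::Unknown:
            // Always report the same IUnknown (here: the one reached via IFoo).
            *out = static_cast<IUnknownLike*>(static_cast<IFoo*>(this));
            return true;
        case Iid::Foo:
            *out = static_cast<IFoo*>(this);
            return true;
        case Iid::Bar:
            *out = static_cast<IBar*>(this);
            return true;
        }
        *out = nullptr;
        return false;
    }
    int Foo() override { return 1; }
    int Bar() override { return 2; }
};

// Identity test in the spirit of the COM rule: ask both pointers for
// their IUnknown and compare the resulting addresses.
bool same_object(IUnknownLike* a, IUnknownLike* b) {
    void* ua = nullptr;
    void* ub = nullptr;
    return a->QueryInterface(Iid::Unknown, &ua) &&
           b->QueryInterface(Iid::Unknown, &ub) &&
           ua == ub;
}
```

Querying a `Widget` for `Iid::Foo` and `Iid::Bar` yields two different addresses, yet `same_object` reports that both belong to the same object. Only the implementation of `QueryInterface` knows how to compute each address, which is why the casts live there.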

In C++ programming you can (and perhaps should) use template wrapper classes that perform all the right type manipulations. The Windows API calls are compatible with C, so they cannot include strongly-typed generics.
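As a rough idea of what such a wrapper could look like, here is a sketch of a typed helper built on top of a `void **` style QueryInterface. Everything in it is an assumption made for illustration: `query_as`, `iid_of`, `Iid`, `IUnknownLike` and the interfaces are hypothetical, not Windows SDK names, and the `bool` return stands in for `HRESULT`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the COM types; not the real Windows headers.
enum class Iid { Unknown, Foo, Bar };

struct IUnknownLike {
    virtual bool QueryInterface(Iid iid, void** out) = 0;
};
struct IFoo : IUnknownLike { virtual int Foo() = 0; };
struct IBar : IUnknownLike { virtual int Bar() = 0; };

// Hypothetical trait mapping an interface type to its id, so the id
// can never disagree with the pointer type being requested.
template <class T> struct iid_of;
template <> struct iid_of<IUnknownLike> { static constexpr Iid value = Iid::Unknown; };
template <> struct iid_of<IFoo> { static constexpr Iid value = Iid::Foo; };
template <> struct iid_of<IBar> { static constexpr Iid value = Iid::Bar; };

// Typed wrapper: goes through a real void* and converts it back to the
// type that was requested. Returns nullptr if the query fails.
template <class T>
T* query_as(IUnknownLike* source) {
    static_assert(std::is_base_of<IUnknownLike, T>::value,
                  "T must be an interface type");
    void* raw = nullptr;
    if (!source->QueryInterface(iid_of<T>::value, &raw)) {
        return nullptr;
    }
    return static_cast<T*>(raw);
}

// Sample object that implements IFoo only.
struct FooImpl : IFoo {
    bool QueryInterface(Iid iid, void** out) override {
        if (iid == Iid::Unknown) { *out = static_cast<IUnknownLike*>(this); return true; }
        if (iid == Iid::Foo)     { *out = static_cast<IFoo*>(this);         return true; }
        *out = nullptr;
        return false;
    }
    int Foo() override { return 42; }
};
```

With this in place, `query_as<IFoo>(&obj)` returns a usable `IFoo *`, while `query_as<IBar>(&obj)` returns `nullptr` for an object that does not implement `IBar`. The call site contains no cast and no separately spelled interface id.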

NB. All calls to QueryInterface should check the return value; I omitted that here for brevity.

M.M
  • There is actually a very important difference between ``void**`` and ``IUnknown**``, namely the type of cast required (in the line ``IFoo* foo = ...`` at the end). Using ``void**``, a ``static_cast`` or ``reinterpret_cast`` is sufficient, while using ``IUnknown**`` would either require a ``dynamic_cast`` (which would make QueryInterface quite useless), or the returned pointer would not be guaranteed to actually point to a valid ``IUnknown`` (especially if multiple and/or virtual inheritance is used in the object's implementation) – Bizzarrus Dec 21 '19 at 23:52
  • @Bizzarrus in both cases you have to use `reinterpret_cast` (and it is undefined behaviour in both cases anyway as covered in my answer). I agree it is probably less confusing to use `void **`. I've updated that paragraph – M.M Dec 21 '19 at 23:54
  • Sorry, I expressed myself in a confusing way. I meant to say what you have edited into your answer, my english just let me down ._. – Bizzarrus Dec 22 '19 at 00:16
  • I'm unclear what you mean by "not a valid `IUnknown*` value". All COM interfaces must inherit from `IUnknown`. – Jonathan Potter Dec 22 '19 at 00:34
  • @JonathanPotter In multiple inheritance the value of a pointer may differ from the value of a pointer to one of its bases. IOW `reinterpret_cast<IUnknown *>(d)` is not the same as `dynamic_cast<IUnknown *>(d)`. – M.M Dec 22 '19 at 00:36
  • I believe the rules of COM would prohibit this. *For any given object instance, a call to QueryInterface with IID_IUnknown must always return the same physical pointer value. This allows you to call QueryInterface on any two interfaces and compare the results to determine whether they point to the same instance of an object.* – Jonathan Potter Dec 22 '19 at 00:42
  • @JonathanPotter A call to QueryInterface with `IID_IDispatch` can give a different value however – M.M Dec 22 '19 at 00:42
  • Oh I understand what you mean now. Carry on! :) – Jonathan Potter Dec 22 '19 at 00:48
  • @JonathanPotter the *implementation* class can use multiple inheritance, and if it implements multiple classes/interfaces derived from `IUnknown` then it has to decide which one's `IUnknown` to report for `QueryInterface(IID_IUnknown)` – Remy Lebeau Dec 22 '19 at 01:38
  • *"In C++ programming you can (and perhaps should) use template wrapper classes that perform all the right type manipulations."* - Indeed, you should. The Windows SDK even provides one for you: [IID_PPV_ARGS](https://learn.microsoft.com/en-us/windows/win32/api/combaseapi/nf-combaseapi-iid_ppv_args). In addition to handling the type manipulations, it addresses 2 more issues: `1` It ensures, that the pointee derives from (or is) an `IUnknown`. `2` It infers the `IID` from the interface pointer type to avoid any mismatches. – IInspectable Dec 22 '19 at 09:20
  • @IInspectable OK although that documentation is a bit fruity; it's actually a preprocessor macro that expands to two comma-separated arguments; not a function as shown in the doc – M.M Dec 22 '19 at 09:33
  • Correct, it's a macro (as the title suggests). It expands to a comma-separated list of arguments. The second argument is the return value of a function template instantiation, that performs the type manipulations. I'm not aware of any C++ feature that allows us to expand an expression to multiple tokens, so it needs to be a macro, unless you are willing to give up benefit `2`. Which really is the one that provides actual value. – IInspectable Dec 22 '19 at 10:07
  • Sorry, but this answer fails to nail down the actual reason. This isn't about convenience, or avoiding casts. The reason is, because you aren't *allowed* to cast between interface pointers in COM. – IInspectable Dec 22 '19 at 15:12
  • Besides, I believe that the comment on undefined behavior isn't true. The strict aliasing rule applies to *accessing* data. We cannot see that code, but it presumably writes through a properly typed pointer. Indeed, this is even more reason to use a `void**` instead of an `IUnknown**` in the `QueryInterface` signature, as the latter would indeed violate the strict aliasing rule. If the implementation is written in a programming language that has a strict aliasing rule. – IInspectable Dec 22 '19 at 15:32
  • @IInspectable reading and writing are both covered by the rule, it is a violation to read or write an object via an lvalue of different type (except for the listed exceptions in the rules) – M.M Dec 22 '19 at 22:24
  • Sure, but you *aren't* reading or writing through a pointer of a different type. – IInspectable Dec 22 '19 at 23:11
  • @IInspectable The QI function will modify a variable of type `IFoo *` via expression of type `void *` (e.g. see this [sample implementation](https://learn.microsoft.com/en-us/office/client-developer/outlook/mapi/implementing-iunknown-in-c-plus-plus)) – M.M Dec 23 '19 at 02:57
  • This is making assumptions about the implementation of QI. A conforming C++ implementation of QI could write `(IFoo*)*ppv = this;`, thereby assigning a value through a pointer that matches the dynamic type of the variable. Your proposed *"correct usage"* just broke a conforming implementation. I suppose the common term for this technique is *type erasure*. Are you suggesting, that type erasure through a `void*` cannot be made to work in C++? – IInspectable Dec 23 '19 at 08:27
  • @IInspectable well `::QueryInterface` can't do that if `IFoo` was a class defined by the user since it doesn't know anything about that class. – M.M Dec 23 '19 at 08:33
  • QI knows *all* interface types it implements. If the user queries for an interface that a COM object doesn't implement, it is required to set `*ppv` to 0. Casting to `char*` and writing through this pointer is safe. This doesn't violate any more strict aliasing rules than - say - [std::qsort](https://en.cppreference.com/w/cpp/algorithm/qsort). – IInspectable Dec 23 '19 at 08:49

Important take-away

This has nothing to do with legacy or convenience. The function signatures are the result of COM fundamentals to allow it to work. They are required to be typed the way they are. If you don't feel like reading through this answer, here are the important take-aways: The moral equivalent of a cast in C++ is a call to QueryInterface in COM. The only time you are allowed to use a C++ cast in COM is when implementing your COM object's QueryInterface.

Details

Function signatures in COM that expect the address of a void* (as opposed to an IUnknown*) as an output parameter can return any interface type. If this were changed to an IUnknown** in a hypothetical implementation, it would make using COM either impractical or outright impossible.

Let's perform 2 thought experiments, starting with the one that would make COM impractical to use:

Let's assume that CoCreateInstance were to return an IUnknown* instead of the real interface requested through a void*. In this case, a client would have to immediately call QueryInterface on the returned IUnknown* to receive the interface pointer they had asked for¹. This isn't practical².

This immediately leads into the impossible to solve experiment. Let's assume that QueryInterface returned an IUnknown*. To get to the real interface pointer, a client would need to call QueryInterface. But that only returns an IUnknown*! At that point the consumable interface surface of COM has collapsed into a single interface, IUnknown.

Whenever COM returns a pointer to an interface, it must return a pointer to the final type. The only programming language type that matches all interface pointers is indeed a void*.³ This should explain why output parameters need to be typed void** rather than IUnknown**.
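The point about void* being the only common type can be sketched with C-style interface declarations (written here so that they also compile as C++). This is an illustration under stated assumptions, not the Windows SDK declarations: `IUnknownC`, `IFooC`, their v-table structs, `kIidFoo` and `FooObject` are hypothetical names. For brevity the sample QueryInterface answers only the made-up `kIidFoo`, whereas a real one would also answer IID_IUnknown.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical C-flavoured stand-ins: an interface is a struct holding
// a pointer to a table of function pointers.
constexpr int kIidFoo = 1;  // made-up interface id

struct IUnknownC;
struct IUnknownVtblC {
    long (*QueryInterface)(IUnknownC* self, int iid, void** out);
};
struct IUnknownC { const IUnknownVtblC* vtbl; };

struct IFooC;
struct IFooVtblC {
    long (*QueryInterface)(IFooC* self, int iid, void** out);
    int (*Foo)(IFooC* self);
};
struct IFooC { const IFooVtblC* vtbl; };

// The language sees no relationship between the two pointer types...
static_assert(!std::is_convertible<IFooC*, IUnknownC*>::value,
              "IFooC* does not convert to IUnknownC*");
// ...but both convert implicitly to void*.
static_assert(std::is_convertible<IFooC*, void*>::value, "");
static_assert(std::is_convertible<IUnknownC*, void*>::value, "");

// A sample object: the interface comes first, followed by private state.
struct FooObject { IFooC iface; int value; };

long foo_query_interface(IFooC* self, int iid, void** out) {
    if (iid == kIidFoo) { *out = self; return 0; }
    *out = nullptr;
    return -1;
}

int foo_method(IFooC* self) {
    // iface is the first member of a standard-layout struct, so the
    // interface pointer can be converted back to the object pointer.
    return reinterpret_cast<FooObject*>(self)->value;
}

const IFooVtblC kFooVtbl = { foo_query_interface, foo_method };

// Client side: receive the interface through void*, then convert it to
// the type that was asked for.
int call_foo_through_void(FooObject& obj) {
    void* raw = nullptr;
    if (obj.iface.vtbl->QueryInterface(&obj.iface, kIidFoo, &raw) != 0) {
        return -1;
    }
    IFooC* foo = static_cast<IFooC*>(raw);
    return foo->vtbl->Foo(foo);
}
```

The `static_assert` lines carry the argument: with this layout, an out-parameter typed `IUnknownC **` could not receive an `IFooC *` without a cast the language knows nothing about, while `void **` accepts any interface pointer.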

For IObjectWithSite::SetSite, on the other hand, the IUnknown* is an input. The interface still accepts any COM interface, but it needs to be passed as a pointer to the (identity-comparable) IUnknown interface.
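The input direction can be sketched the same way. In the following illustration the parameter is typed as the base interface, and the callee queries for whatever it actually needs. `Iid`, `IUnknownLike`, `IName`, `Site` and `Component` are hypothetical names, not the IObjectWithSite declarations, and reference counting is omitted.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the COM types; not the real Windows headers.
enum class Iid { Unknown, Name };

struct IUnknownLike {
    virtual bool QueryInterface(Iid iid, void** out) = 0;
};
struct IName : IUnknownLike { virtual std::string Name() = 0; };

// The site object that a host hands to a component.
struct Site : IName {
    bool QueryInterface(Iid iid, void** out) override {
        if (iid == Iid::Unknown) { *out = static_cast<IUnknownLike*>(this); return true; }
        if (iid == Iid::Name)    { *out = static_cast<IName*>(this);        return true; }
        *out = nullptr;
        return false;
    }
    std::string Name() override { return "site"; }
};

// Component with a SetSite-style method. The input is typed as the base
// interface; the component asks for the interface it needs on its own.
struct Component {
    IUnknownLike* site = nullptr;

    void SetSite(IUnknownLike* s) { site = s; }

    // Returns the site's name, or an empty string if there is no site
    // or the site does not implement IName.
    std::string SiteName() {
        if (site == nullptr) return std::string();
        void* raw = nullptr;
        if (!site->QueryInterface(Iid::Name, &raw)) return std::string();
        return static_cast<IName*>(raw)->Name();
    }
};
```

The caller obtains the site's IUnknown through QueryInterface and passes that pointer in. Because the value flows into the callee, a single well-defined pointer type is enough, and the callee can still reach any interface the site implements.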


¹ COM does not mandate a particular object layout for implementations. Instead, it delegates requests for interface pointers to the respective QueryInterface implementations.

² Even when ignoring the immediate need to call Release() on the IUnknown to account for the bumped reference count as part of the QI call.

³ From The Component Object Model: "The only language requirement for COM is that code is generated in a language that can create structures of pointers and, either explicitly or implicitly, call functions through pointers." There is deliberately no requirement that interface inheritance be implemented using language-level constructs. Even when IFoo is required to implement IUnknown, there is no relationship between IFoo* and IUnknown* in a programming language like C.

IInspectable