I have, on more than one occasion, advised people to use a return value of type WideString for interop purposes.

The idea is that a WideString is the same as a BSTR. Because a BSTR is allocated on the shared COM heap, it is no problem to allocate it in one module and deallocate it in a different module: all parties have agreed to use the same heap, the COM heap.

However, it seems that WideString cannot be used as a function return value for interop.

Consider the following Delphi DLL.

library WideStringTest;

uses
  ActiveX;

function TestWideString: WideString; stdcall;
begin
  Result := 'TestWideString';
end;

function TestBSTR: TBstr; stdcall;
begin
  Result := SysAllocString('TestBSTR');
end;

procedure TestWideStringOutParam(out str: WideString); stdcall;
begin
  str := 'TestWideStringOutParam';
end;

exports
  TestWideString, TestBSTR, TestWideStringOutParam;

begin
end.

and the following C++ code:

typedef BSTR (__stdcall *Func)();
typedef void (__stdcall *OutParam)(BSTR &pstr);

HMODULE lib = LoadLibrary(DLLNAME);
Func TestWideString = (Func) GetProcAddress(lib, "TestWideString");
Func TestBSTR = (Func) GetProcAddress(lib, "TestBSTR");
OutParam TestWideStringOutParam = (OutParam) GetProcAddress(lib,
                   "TestWideStringOutParam");

BSTR str = TestBSTR();
wprintf(L"%s\n", str);
SysFreeString(str);
str = NULL;

TestWideStringOutParam(str);
wprintf(L"%s\n", str);
SysFreeString(str);
str = NULL;

str = TestWideString(); // fails here
wprintf(L"%s\n", str);
SysFreeString(str);

The call to TestWideString fails with this error:

Unhandled exception at 0x772015de in BSTRtest.exe: 0xC0000005: Access violation reading location 0x00000000.

Similarly, if we try to call this from C# with p/invoke, we have a failure:

[DllImport(@"path\to\my\dll")]
[return: MarshalAs(UnmanagedType.BStr)]
static extern string TestWideString();

The error is:

An unhandled exception of type 'System.Runtime.InteropServices.SEHException' occurred in ConsoleApplication10.exe

Additional information: External component has thrown an exception.

Calling TestBSTR via p/invoke works as expected.

So, using pass-by-reference with WideString parameters and mapping them onto BSTR appears to work perfectly well, but not for function return values. I have tested this on Delphi 5, 2010 and XE2 and observe the same behaviour on all versions.

Execution enters the Delphi code and fails almost immediately. The assignment to Result turns into a call to System._WStrAsg, the first line of which reads:

CMP     [EAX],EDX

Now, EAX is $00000000 and naturally there is an access violation.
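To make the failure concrete, here is a minimal, portable C++ sketch of what that first instruction implies. It is only a model: `WideSlot`, `AssignWide` and `AssignIntoValidSlot` are hypothetical names, `malloc`/`free` stand in for the COM allocator, and everything after the first comparison is an assumption about what an assignment helper of this kind does, not the RTL source.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <string>

// Stand-in for the BSTR pointer stored inside a WideString variable.
// (Hypothetical type; real code would use BSTR and the SysAllocString family.)
using WideSlot = wchar_t*;

// Hypothetical model of the assignment helper: `dest` plays the role of EAX
// (the address of the destination variable), `src` the role of EDX.
void AssignWide(WideSlot* dest, const wchar_t* src) {
    // The Delphi code takes this for granted; the C++ caller above breaks it.
    assert(dest != nullptr && "caller must supply the result slot");
    if (*dest == src) return;            // models CMP [EAX],EDX: reads *dest
    wchar_t* copy = nullptr;
    if (src != nullptr) {
        std::size_t n = std::wcslen(src) + 1;   // include the terminator
        copy = static_cast<wchar_t*>(std::malloc(n * sizeof(wchar_t)));
        if (copy == nullptr) return;     // allocation failed: leave *dest untouched
        std::wmemcpy(copy, src, n);
    }
    std::free(*dest);                    // release the previous value, if any
    *dest = copy;
}

// Usage: a caller that does provide an initialised slot.
std::wstring AssignIntoValidSlot() {
    WideSlot slot = nullptr;             // an empty WideString is a nil pointer
    AssignWide(&slot, L"TestWideString");
    std::wstring text = slot ? slot : L"";
    std::free(slot);
    return text;
}
```

The point of the model is that the helper dereferences the destination address before doing anything else, so a destination address of zero faults on the very first read.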

Can anyone explain this? Am I doing something wrong? Am I unreasonable in expecting WideString function values to be viable BSTRs? Or is it just a Delphi defect?

David Heffernan
  • David, Maybe add `C++`, `C#` tags also? – kobik Feb 20 '12 at 10:37
  • @kobik I believe that it's really a question about how Delphi implements return values. I think Delphi is the odd one out. – David Heffernan Feb 20 '12 at 10:38
  • @J... I've never seen a COM method that didn't return an `HRESULT`. I'm not talking about using BSTR in COM though. I'm talking about it as a convenient way to share a heap between different modules. – David Heffernan Feb 20 '12 at 15:54
  • @J... Assign to a WideString and it does indeed call SysAllocString. Or it might be SysReallocString but that's morally equivalent. – David Heffernan Feb 20 '12 at 22:17
  • @David: Delphi is no more "the odd one out" than any other language. Even different implementations of C don't agree on how this must be done. Returning non-POD types from a function is always a problem when different languages are involved. – Rudy Velthuis Feb 22 '12 at 13:27
  • @Rudy `BSTR` is a POD type. It's just a pointer. Which other language do you know that has this problem? – David Heffernan Feb 22 '12 at 13:42
  • @David: I already said it: C has no uniform way to return non-POD types. Sometimes like in Delphi, as reference parameter, sometimes in one or even two registers, sometimes on some kind of stack, etc. Other languages may also use one of these. And BSTR points to a type that also has a preceding length dword. That is why it can't be treated like a normal PWideChar, just like AnsiString or UnicodeString can't be treated like a normal P(Wide)Char, even if they actually are. – Rudy Velthuis Feb 22 '12 at 17:35
  • @rudy BSTR is a POD so your comments don't apply here. I can return a TBStr no probs. It's just a PWideChar for the sake of parameter passing. – David Heffernan Feb 22 '12 at 17:40
  • @DavidHeffernan, I think that WideString is not a POD because it involves the `SysAllocString` "magic". I still don't understand the explanation in the accepted answer. – kobik Feb 22 '12 at 18:25
  • @kobik That's akin to saying that `THandle` is not POD because you need to call a special function to make one. The issue, to the best of my understanding is a combination of automatic management of `BSTR` via the WideString compiler magic, and the semantics of return values being INOUT parameters. I have to say, it makes no sense to me that return values have IN semantics. – David Heffernan Feb 22 '12 at 19:18
  • @DavidHeffernan, so `procedure TestWideStringOutParam(var str: WideString); stdcall` (note the `var`) won't work? Or am I still getting it wrong? (because it does work) – kobik Feb 22 '12 at 20:27
  • @kobik In that code you can have the same parameter semantics at both ends. Incidentally you called it `OUT` but it is in fact `INOUT`. With the function return value, all languages that I know, other than Delphi, treat the return value as an `OUT`. But Delphi treats it as `INOUT` and therefore assumes that the BSTR pointer is valid on entry. – David Heffernan Feb 22 '12 at 20:34
  • @DavidHeffernan, thanks for clearing that up. Still need to digest it though ;) I had the same conclusion only I was not able to explain why... – kobik Feb 22 '12 at 20:40
  • @David: A BSTR (and thus WideString) is not a POD, just like a UnicodeString (also in fact just a pointer) is not a POD. – Rudy Velthuis Feb 22 '12 at 21:43
  • @Rudy All C types are POD. Since `BSTR` is a C type, it is a POD. The issue is not related to the type. The issue is with the mismatch in the semantics of return values between languages. Sadly Delphi is the odd one out here. – David Heffernan Feb 22 '12 at 21:53
  • No, @David, Delphi is NOT the odd one out here. How items (even scalars) are returned even differs between implementations of C. – Rudy Velthuis Feb 23 '12 at 00:21
  • FWIW, @David, I'm sure it DOES work for BSTR. Just not for WideString. And being a COM-managed type means that it is not a simple POD type, no more than AnsiString, which you would not return from a DLL to be used in C code. – Rudy Velthuis Feb 23 '12 at 00:36
  • @rudy 1. Which C implementations on Windows differ in this respect? 2. BSTR is POD. WideString is described in documentation as being compatible with BSTR. In this regard it is not. – David Heffernan Feb 23 '12 at 07:15
  • @David: several. GNU C++, VC++, C++Builder, etc. differ in this respect, even for POD types. VC++ will return e.g. structs (even POD structs) in EAX:EDX or RDX when they are up to 64 bit in size, otherwise it takes the same approach as C++Builder. GNU takes AFAIK the same approach as C++Builder. But also scalar types like int64_t or float are handled differently. These are the types with which I had problems when converting. There are more. Fact is that the implementations don't always agree on how items are returned, and that returning anything but simple scalars (and BSTR is not!) is tricky. – Rudy Velthuis Feb 23 '12 at 17:37
  • @Rudy BSTR is a simple scalar. It's a pointer. The problem is that function result is an INOUT parameter in Delphi and an OUT parameter in all other tools that count. I'm repeating myself. I understand what you say in your latest comment, and I don't disagree. It's just that it is not pertinent. You are talking about which registers are used for parameter passing. That's not relevant here since there is no such mismatch in my example. – David Heffernan Feb 23 '12 at 17:42
  • A related question: http://stackoverflow.com/questions/3250827/initialise-string-function-result/ This is documented weird/non-intuitive behaviour of Delphi. – Alex Oct 18 '12 at 22:17

2 Answers

In regular Delphi functions, the function return is actually a parameter passed by reference, even though syntactically it looks and feels like an 'out' parameter. You can test this out like so (this may be version dependent):

function DoNothing: IInterface;
begin
  if Assigned(Result) then
    ShowMessage('result assigned before invocation')
  else
    ShowMessage('result NOT assigned before invocation');
end;

procedure TestParameterPassingMechanismOfFunctions;
var
  X: IInterface;
begin
  X := TInterfacedObject.Create;
  X := DoNothing;
end;

To demonstrate, call TestParameterPassingMechanismOfFunctions().

Your code is failing because of a mismatch between Delphi and C++'s understanding of the calling convention in relation to the passing mechanism for function results. In C++ a function return acts like the syntax suggests: an out parameter. But for Delphi it is a var parameter.
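The two views of a function result can be sketched in portable C++. This is only a model under stated assumptions: `Managed`, `ReturnByValue`, `ReturnViaHiddenVar` and the two caller helpers are hypothetical names, and string literals stand in for a managed type so that no allocator is involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a managed result type (WideString, IInterface, ...).
using Managed = const wchar_t*;

// C++ view of `Managed F()`: the result is pure output, returned by value.
Managed ReturnByValue() {
    return L"TestWideString";
}

// Model of the Delphi view: the caller passes the address of the variable
// that receives the result, and the callee may inspect it before writing.
// Returns whether the slot already held a value on entry.
bool ReturnViaHiddenVar(Managed& result) {
    bool assignedOnEntry = (result != nullptr);
    result = L"TestWideString";
    return assignedOnEntry;
}

// Mirrors the DoNothing experiment: the variable holds a value before the call.
bool SlotWasAssignedOnEntry() {
    Managed x = L"previous value";
    return ReturnViaHiddenVar(x);
}

// A caller that follows the Delphi view: it owns the slot and initialises it.
std::wstring CallDelphiStyle() {
    Managed x = nullptr;
    bool wasAssigned = ReturnViaHiddenVar(x);
    assert(!wasAssigned);                // the slot was empty on entry
    return x;
}
```

A caller written against the first signature never creates the slot that the second one expects, which is the mismatch described above.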

To fix, try this:

function TestWideString: WideString; stdcall;
begin
  Pointer(Result) := nil;
  Result := 'TestWideString';
end;
Sebastian Zartner
Sean B. Durkin
  • That sounds plausible, but `Pointer(result) := nil` itself raises an AV. – David Heffernan Feb 19 '12 at 14:08
  • For functions, Delphi stores the pointer to the result in EAX. This pretty much explains it. From Delphi's point of view you can't pass in "no variable" as a var parameter. – Sean B. Durkin Feb 19 '12 at 14:27
  • I am not sure if you want just an explanation or a work-around. You probably don't need a work-around because, as you have already stated, the explicit out parameter mechanism works. – Sean B. Durkin Feb 19 '12 at 14:28
  • Explanation is just fine. Workarounds abound. I suppose I'm curious as to why `Pointer(Result) := nil` fails with AV. – David Heffernan Feb 19 '12 at 14:35
  • Pointer(Result) := nil fails with AV because the reference to "Result" is a dangling pointer. It is dangling because C++ doesn't see the point of setting it because it has a different model of the call. – Sean B. Durkin Feb 19 '12 at 15:06
  • @SeanB.Durkin But I should be able to assign `nil` to a dangling pointer. – David Heffernan Feb 19 '12 at 15:11
  • It is dangling because C++ is not passing a viable `WideString` instance to the Delphi `Result` parameter, that is why the Delphi code is failing. `WideString` is a built in Delphi type that **wraps** a `BSTR`. It is **not** a BSTR by itself. In C++, `WideString` is a class type. You need to change the C++ function declaration to return a `WideString` instead of a `BSTR`. That way, the C++ compiler generates different machine code to pass the `WideString` to Delphi, and the Delphi code receives a viable `WideString` to operate on. – Remy Lebeau Feb 19 '12 at 19:26
  • @Remy That assumes Emba C++ compiler. I'm using MS. Also would like to use C# and pinvoke with MarshalAs. But I don't buy what you say. Why does `Pointer(Result) := nil` throw AV? And if WideString is not binary compatible with `BSTR`, how can we do what we do with parameters. I guess I must be missing something. – David Heffernan Feb 19 '12 at 19:34
  • `Pointer(Result) := nil` throws an AV because the return type is actually a pointer to a WideString (hidden out parameter). And by assigning it nil, the pointer (that was never handed over by C++) is dereferenced: `mov eax,[ebp+$08]; xor edx,edx; mov [eax],edx`. In other words: WideString return values are always passed as hidden out parameters. Delphi doesn't allow you to change that behavior. – Andreas Hausladen Feb 19 '12 at 19:50
  • However, it may be possible to trick Delphi by returning a `PWideChar`: (untested) `function TestWideString: PWideChar; stdcall; var RealResult: WideString absolute Result; begin Initialize(RealResult); RealResult := 'TestWideString'; end;` –  Feb 19 '12 at 20:05
  • @AndreasHausladen Why is it no good with a *hidden* out but fine with a normal out? Is that the key? I have to say I am struggling to work out why this is failing. I mean, it doesn't really matter, but I do like to understand what my tools are doing. – David Heffernan Feb 19 '12 at 20:53
  • @DavidHeffernan That's because C++ *doesn't* use that hidden out. In C++, BSTR is a typedef for `unsigned short *` or `wchar_t *` (I'm not sure which), a pointer type without any special behaviour from the compiler, and pointer values are returned directly. –  Feb 19 '12 at 21:14
  • @hvd I think the definition of BSTR is a little irrelevant here. After all you can map `WideString` <--> `BSTR` for parameters. If you look at the Delphi implementation it's just a pointer to WideChar. The same applies to Remy's comment. The fact that C++ Builder wraps it in a class is surely beside the point. Presumably the only field of the class is a pointer to wchar_t and so binary compatible with `BSTR`. – David Heffernan Feb 19 '12 at 22:22
  • @DavidHeffernan That's just it: it's *not* "just a pointer to WideChar" in Delphi. There's a lot of compiler magic to automatically call special built-in functions that don't and shouldn't get called for PWideChar. And I expect wrapping `BSTR` in a class, and returning that class from a function, is *also* not binary compatible in C++ with returning a `BSTR` directly. –  Feb 19 '12 at 22:34
  • @hvd I don't buy that argument. Are you saying that because Delphi interfaces have lots of extra magic code around them (automatic reference counting) that they are not binary compatible with COM interfaces. If `WideString` wasn't binary compatible with `BSTR` then we wouldn't be able to declare COM methods as taking `WideString` parameters where the C declarations take `BSTR`. Clearly `WideString` and `BSTR` are binary compatible. Ultimately the **data** is just a pointer to wide char. The problem appears to be in parameter passing conventions not matching. – David Heffernan Feb 19 '12 at 22:43
  • @DavidHeffernan You're right that that part was a poor argument, but I stand by my conclusion. Both `WideString` and `BSTR` have the size of a pointer, but that doesn't mean they're always passed the same way. They are close enough so that they're passed the same way for procedure and function arguments, but if the `stdcall` calling convention returns structures via a hidden `out` parameter, and `WideString` is treated as a structure, then it won't be returned the same way as a `BSTR` (`PWideChar`). –  Feb 19 '12 at 22:55

In C#/C++ you will need to declare the Result as an out parameter, in order to maintain binary compatibility under the stdcall calling convention:

Returning Strings and Interface References From DLL Functions

In the stdcall calling convention, the function’s result is passed via the CPU’s EAX register. However, Visual C++ and Delphi generate different binary code for these routines.

Delphi code stays the same:

function TestWideString: WideString; stdcall;
begin
  Result := 'TestWideString';
end;

C# code:

// declaration
[DllImport(@"Test.dll")]
static extern void TestWideString([MarshalAs(UnmanagedType.BStr)] out string Result);
...
string s;
TestWideString(out s);
MessageBox.Show(s);
kobik
  • +1 Yes that does it. I still cannot get my head around what's really going on here though!! – David Heffernan Feb 19 '12 at 22:19
  • Note that from my testing it seems that the Result parameter is always first in the list if you have multiple parameters, not last as might be assumed. – Jamie Kitson Oct 10 '13 at 14:20
  • @JamieKitson I don't understand that comment. If you mean the Delphi implicit var parameter that is used to return the function return value, the extra parameter is passed after the others. It's documented clearly here: http://docwiki.embarcadero.com/RADStudio/en/Program_Control#Handling_Function_Results – David Heffernan Dec 10 '14 at 10:40
  • @DavidHeffernan Perhaps what Jamie observed is that the parameters are passed in reversed order with `stdcall`, as your link also states (last first). So the "result" parameter, which is the last on the declaration/Delphi side, is passed first at stub/asm level. – Arnaud Bouchez Oct 22 '15 at 14:42