
I've been tasked with getting some C# code working in x64 that calls a native x64 DLL called Detagger, which converts HTML into text while maintaining the basic structure of the HTML.

This code has worked for years when running with platform target x86 for the C# code and an x86 build of the dll, but it's crashing when setting the platform target to x64 and using an x64 build of the dll. In fact, x64 works fine if the C# app is built with the .Net framework 3.5 or below. It crashes when built with 4.0 or above.

The dll in question has the following header:

#ifdef WIN32
    #ifdef USE_DLL
    #ifdef DLL_EXPORTS
        #define DLL_DECLARE __declspec(dllexport) long __stdcall
    #else
        #define DLL_DECLARE __declspec(dllimport) long __stdcall
    #endif
    #else
    #define DLL_DECLARE long
    #endif
#else
    #define DLL_DECLARE long
#endif

...

DLL_DECLARE CONVERTER_Allocate ();  // returns non-zero Handle if succeeds

...

DLL_DECLARE CONVERTER_ResetPolicies (long Handle);

So the API requires calling the CONVERTER_Allocate() function to get a "handle" (which I think is actually a memory address) and then passing that "handle" into all of the other methods. I presume this is to make the calls thread safe.

I'm trying to focus on the CONVERTER_ResetPolicies() function for now, because that is one of the most basic ones that takes just a single parameter (the "handle"). None of the functions in the entire API are complicated, all taking basic types or pointers to such as parameters (no structs).

From the C++ header, the calling convention is supposedly stdcall, and each of the exported functions in the dll returns a long (which should be 4 bytes in both x86 and x64). My understanding of x64 is that its calling convention is basically always a variant of fastcall, so I'm curious about the stdcall, but it works in .Net 3.5 and below so that's a question for another day.

The PInvoke signatures provided by the vendor for the dll are:

// DLL_DECLARE CONVERTER_Allocate();
[DllImport(_dll, EntryPoint = "CONVERTER_Allocate")]
public static extern IntPtr Allocate();

// DLL_DECLARE CONVERTER_ResetPolicies(long Handle);
[DllImport(_dll, EntryPoint = "CONVERTER_ResetPolicies")]
public static extern APIResult ResetPolicies(IntPtr handle);

Given the following C# code:

IntPtr handle = DetaggerAPI.Allocate();
var result = DetaggerAPI.ResetPolicies(handle);

This crashes in the call to CONVERTER_ResetPolicies(). Stepping in the debugger reveals the following:

In C#: handle = 0x00000000e82d0080

In disassembly after stepping into the DLL:

registers and flags:

RAX = 000000018001B490 RBX = 0000000FCC66EB68 RCX = 00000000E82D0080
RDX = 0000000FCC66EC80 RSI = 0000000FCF8B44A8 RDI = 0000000FCC66E980 
R8  = 00001EB6102A86D4 R9  = 0000000FE84C4001 R10 = 00007FF9497961F0
R11 = 0000000000000000 R12 = 0000000000000000 R13 = 0000000FCC66EAF0
R14 = 0000000FCC66EB68 R15 = 0000000000000004 RIP = 000000018001B490 
RSP = 0000000FCC66E848 RBP = 0000000FCC66E850 EFL = 00000246 

CS = 0033 DS = 0000 ES = 0000 SS = 002B FS = 0000 GS = 0000 

OV = 0 UP = 0 EI = 1 PL = 0 ZR = 1 AC = 0 PE = 1 CY = 0 

Note that the value for handle is in RCX (e82d0080).

Here is the disassembly (some comments added by me):

000000018001B490  sub         rsp,28h                   ; subtract 40 from stack pointer, sets up stack frame
000000018001B494  call        000000018001B090  

    000000018001B090  push        rbx  
    000000018001B092  sub         rsp,20h               ; subtract 32 from stack pointer, sets up stack frame
    000000018001B096  test        ecx,ecx               ; check if ecx is 0
    000000018001B098  movsxd      rbx,ecx               ; move value in ecx (the handle passed in) to rbx and sign-extend it to qword
                                                        ; rbx changes from 0000000FCC66EB68 to FFFFFFFFE82D0080
    000000018001B09B  je          000000018001B0C6      ; if ecx is 0, probably jump to a function that returns an error
->  000000018001B09D  cmp         dword ptr [rbx],4D2h  ; compare value pointed to by rbx (as a dword) to 4D2h (1234),
                                                        ; but rbx points to FFFFFFFFE82D0080, which is probably an invalid memory location,
                                                        ; so !!this is the line that crashes !!
    000000018001B0A3  jne         000000018001B0C6      ; jump if not equal

    000000018001B0A5  mov         ecx,dword ptr [1801122C0h]  
    000000018001B0AB  mov         dword ptr [rbx+2F0B0h],ecx  
    000000018001B0B1  lea         rcx,[rbx+2F0B8h]  
    000000018001B0B8  call        00000001800A7C40  
    000000018001B0BD  mov         rax,rbx  
    000000018001B0C0  add         rsp,20h  
    000000018001B0C4  pop         rbx  
    000000018001B0C5  ret  

000000018001B499  test        rax,rax  
000000018001B49C  jne         000000018001B4BC  
000000018001B49E  cmp         dword ptr [1801122C0h],eax  
000000018001B4A4  je          000000018001B4B2  
000000018001B4A6  lea         rcx,[1800D7B70h]  
000000018001B4AD  call        000000018001B290  
000000018001B4B2  mov         eax,2                     ; if we got here, return 2 in eax, meaning APIResult.Invalid.  Note that this is 32bits.
000000018001B4B7  add         rsp,28h                   ; clean up stack frame
000000018001B4BB  ret                                   ; return

So, looks like the "handle" is being passed in RCX, and then subsequently the

movsxd  rbx,ecx

instruction is copying this handle into RBX but also basically destroying it since it appears to be a memory address rather than just some opaque handle that is an array index or something similar. Then two instructions later I get an access violation from the instruction

cmp dword ptr [rbx],4D2h

because this is trying to dereference RBX, which points to garbage.
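The effect of that instruction can be reproduced outside the DLL. The following is a minimal sketch, not the vendor's code: the function name sign_extend_low32 is made up for illustration, and int32_t stands in for the 32-bit long that the 64-bit Windows compiler uses.

```cpp
#include <cassert>
#include <cstdint>

// Emulates `movsxd rbx, ecx`: keep only the low 32 bits of RCX, treat them
// as a signed 32-bit value (the width of `long` on 64-bit Windows), then
// sign-extend the result back to 64 bits.
// Note: the unsigned-to-signed narrowing is implementation-defined before
// C++20; mainstream compilers wrap modulo 2^32, as C++20 now requires.
std::uint64_t sign_extend_low32(std::uint64_t rcx) {
    const auto low = static_cast<std::uint32_t>(rcx);
    const auto as_long = static_cast<std::int32_t>(low);
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(as_long));
}

// Usage: replay the register values from the debugger session.
void demo_sign_extension() {
    // Bit 31 of the handle is set, so the upper half fills with ones.
    assert(sign_extend_low32(0x00000000E82D0080ULL) == 0xFFFFFFFFE82D0080ULL);
    // A value with bit 31 clear comes through unchanged.
    assert(sign_extend_low32(0x000000007FFF0000ULL) == 0x000000007FFF0000ULL);
}
```

This matches the change seen in RBX: the handle in RCX has bit 31 set, so the sign extension fills the upper 32 bits with ones.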

According to https://msdn.microsoft.com/en-us/library/ee941656(v=vs.100).aspx#core, under Platform Invoke, it says the difference between 3.5 SP1 and 4.0 is:

To improve performance in interoperability with unmanaged code, incorrect calling conventions in a platform invoke now cause the application to fail. In previous versions, the marshaling layer resolved these errors up the stack.

That is kind of vague, but since my only option here is stdcall (fastcall is not supported), I presume that is correct and not the issue.

Some things I'm going to try:

  1. Debug while running against .Net 3.5 and try to see what's different.
  2. Create a C++/cli wrapper for the dll instead of using PInvoke.

If anyone can spot what's going on here or give me any ideas, that'd be great.

domehead100
  • have you looked at this question? http://stackoverflow.com/questions/10852634/using-a-32bit-or-64bit-dll-in-c-sharp-dllimport – NickD Sep 08 '15 at 14:41
  • http://stackoverflow.com/a/32436687/17034 – Hans Passant Sep 08 '15 at 16:00
  • Thanks @Hans. That makes sense, except that it's a native dll that is allocating this handle/pointer thing rather than .Net. We just pass it back what it gave us. It's compiled against the same runtime library regardless of the version of .Net that we're calling it from. It's clearly a problem with allocating a pointer that's above 4GB and a problem with the long datatype used in the dll for this "handle", but I'm still not understanding why it works at all in .Net 3.5 and below. – domehead100 Sep 13 '15 at 06:25

3 Answers


As you mentioned, the assembly is clearly accessing the handle as a pointer. This means it is supposed to be a pointer, but since long on Windows is always 32-bit, it doesn't work.

It is probably a mistake: the C++ code shouldn't use long. The code was probably written for Linux, where long is 64-bit in 64-bit builds (though relying on a compiler-defined size is still a mistake).

I suggest that you replace the type of every handle with intptr_t (defined for both Linux and Windows in <cstdint>/<stdint.h>) to get the [probable] intended behavior. Actually, it is probably a good idea to replace every long with intptr_t, since the mistake is probably everywhere.

EDIT: Since the code initially uses a plain integer type, intptr_t is probably safer, but the ideal solution would be to use a typedef for void*, which would work everywhere and make more sense. If you see that using void* doesn't reveal any problems, use that instead (only for handles).
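As a rough illustration of the intptr_t suggestion, here is a minimal sketch. Everything in it is hypothetical: the Converter struct, the converter_* function names, and the 0-on-success return value are invented for the example, since the real Detagger internals are not shown. Only the 0x4D2 tag and the return value 2 for an invalid handle are taken from the disassembly in the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical internal state; the real Detagger layout is unknown.
struct Converter {
    std::int32_t magic;     // 0x4D2 (1234), the tag the disassembly checks for
    std::int32_t policies;  // placeholder for whatever ResetPolicies resets
};

constexpr std::int32_t kMagic = 0x4D2;

// Handle type wide enough to hold a pointer on both 32-bit and 64-bit builds.
using ConverterHandle = std::intptr_t;
static_assert(sizeof(ConverterHandle) >= sizeof(void*), "handle must hold a pointer");

ConverterHandle converter_allocate() {
    Converter* c = new Converter{kMagic, 7};
    return reinterpret_cast<ConverterHandle>(c);
}

// Returns 2 for an invalid handle (as in the `mov eax,2` path);
// returning 0 on success is an assumption made for this sketch.
int converter_reset_policies(ConverterHandle handle) {
    if (handle == 0) return 2;
    Converter* c = reinterpret_cast<Converter*>(handle);
    if (c->magic != kMagic) return 2;
    c->policies = 0;
    return 0;
}

void converter_free(ConverterHandle handle) {
    delete reinterpret_cast<Converter*>(handle);
}

// Usage: the handle round-trips intact regardless of where the heap lives.
void demo_intptr_handle() {
    ConverterHandle h = converter_allocate();
    assert(converter_reset_policies(h) == 0);
    converter_free(h);
}
```

With a pointer-sized handle, the existing IntPtr-based PInvoke signatures would line up with the native parameter width on both x86 and x64.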

ElderBug

If I'm interpreting the disassembly correctly, then the x64 build of this DLL has a fatal flaw that is causing this issue. It appears to be trying to pass a 64-bit pointer as a 32-bit signed integer (long).

That's based on the following analysis of the disassembly:

  1. You pass in the handle value e82d0080
  2. The DLL takes that handle and converts it to a 64 bit value
  3. The DLL then takes that 64 bit value and reads from that memory address.

It appears to be doing something similar to the following code:

DLL_DECLARE CONVERTER_ResetPolicies (long Handle) {
    int* ptr = (int*)Handle;
    if (*ptr == 0x4D2)
         ...
}

This code will fail as soon as Handle > 0x7FFFFFFF because of the sign extension in the conversion at the line movsxd rbx,ecx.

This code could work as long as Handle was allocated below 0x7FFFFFFF. That could explain why it works in .Net 3.5 but not 4.0 and why this code may have made it through testing. You could confirm this by looking at the value of Handle when running under 3.5.
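The boundary can be checked with a small predicate. This is only a sketch: survives_long_round_trip is a name made up here, and int32_t stands in for the 32-bit long on 64-bit Windows.

```cpp
#include <cassert>
#include <cstdint>

// True if `address` is unchanged after being squeezed through a signed
// 32-bit `long` and sign-extended again (the path the DLL's code takes).
bool survives_long_round_trip(std::uint64_t address) {
    const auto as_long =
        static_cast<std::int32_t>(static_cast<std::uint32_t>(address));
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(as_long)) == address;
}

void demo_round_trip_boundary() {
    // Largest value with bit 31 and the upper 32 bits clear: survives.
    assert(survives_long_round_trip(0x000000007FFFFFFFULL));
    // Bit 31 set: sign extension turns it into 0xFFFFFFFF80000000.
    assert(!survives_long_round_trip(0x0000000080000000ULL));
    // Above 4GB: the upper bits are dropped entirely.
    assert(!survives_long_round_trip(0x0000000100000000ULL));
}
```

Running the same check on the handle value seen under 3.5 would show whether it simply happens to fall in the range that survives.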

This also reminds me of this blog post, which explains that a change in memory allocation between Windows 7 and 8 causes memory to be allocated above 4GB on Windows 8. So this could be another factor that would cause this code to fail only in certain environments.

shf301
  • We've tried to debug this on a Windows 8 desktop (one of my co-workers) and a Windows 10 desktop (me). I'm still not understanding exactly why .Net 3.5 and below works at all given that the dll is allocating this handle in native code, but everything you said makes sense. – domehead100 Sep 13 '15 at 06:29

The PInvoke signatures provided by the vendor look wrong: long is 4 bytes in x64 mode, but IntPtr is 8 bytes in x64 mode. I suggest changing them to UInt32.

// DLL_DECLARE CONVERTER_Allocate();
[DllImport(_dll, EntryPoint = "CONVERTER_Allocate")]
public static extern UInt32 Allocate();

// DLL_DECLARE CONVERTER_ResetPolicies(long Handle);
[DllImport(_dll, EntryPoint = "CONVERTER_ResetPolicies")]
public static extern APIResult ResetPolicies(UInt32 handle);

This probably should not have worked under .NET 3.5 either, and it is just working by luck. Also, I have no idea what APIResult is so I didn't look into that part.

Moby Disk