I've got an external C library with a function
extern ushort* getSomeStringW(ushort* Dest, int MaxLen);
The function writes into the Dest buffer and also returns it.
I am wrapping it with ctypes. Since the library always returns 16-bit (UTF-16) encoded strings, I have to use c_ushort or c_uint16 buffers explicitly and can't just create c_wchar buffers with ctypes' create_unicode_buffer function. (c_wchar wraps wchar_t, which is 16-bit on Windows but 32-bit on OSX.)
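The platform difference can be checked directly; a minimal sketch (the sizes in the comments are the typical Windows vs. Unix values):

```python
import ctypes

# c_wchar wraps the platform's wchar_t: typically 2 bytes on Windows
# and 4 bytes on OSX/Linux, so a c_wchar buffer is not a portable
# container for raw UTF-16 code units.
print(ctypes.sizeof(ctypes.c_wchar))

# create_unicode_buffer returns an array of c_wchar, so its element
# size varies in the same way.
buf = ctypes.create_unicode_buffer(16)
print(ctypes.sizeof(buf))  # 16 * sizeof(wchar_t)
```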
Therefore I call the library like this:
dest = (ctypes.c_uint16 * MaxLen)()
result = library.getSomeStringW(byref(dest), MaxLen)
This works as expected. Interestingly, ctypes seems to guess the return type, and result is a Python string. I can also access the contents of the dest buffer afterwards and explicitly decode them as UTF-16.
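To make the decoding step concrete, here is a self-contained sketch that fills a c_uint16 buffer by hand (a stand-in for the getSomeStringW call, since the library itself can't be included here) and then decodes the raw buffer as UTF-16:

```python
import ctypes
import sys

MaxLen = 32
dest = (ctypes.c_uint16 * MaxLen)()

# Stand-in for getSomeStringW: write UTF-16 code units into dest.
# (BMP characters map one-to-one onto single code units.)
for i, ch in enumerate("4.10.2.2885"):
    dest[i] = ord(ch)

# Read the raw bytes of the buffer and decode them explicitly as
# UTF-16 in native byte order, trimming at the NUL terminator.
raw = ctypes.string_at(dest, ctypes.sizeof(dest))
codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
text = raw.decode(codec).split("\x00", 1)[0]
print(text)  # 4.10.2.2885
```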
Now I thought about specifying the restype of the function nonetheless, but weirdly enough the results are wrong after that.
If, instead of the above code, I use
dest = (ctypes.c_uint16 * MaxLen)()
library.getSomeStringW.restype = ctypes.c_uint16 * MaxLen
result = library.getSomeStringW(byref(dest), MaxLen)
The result array looks different from the dest array. I know that the two arrays probably don't reference the same memory location (at least that's what I take from the ctypes documentation), but I still don't get why the result object is not filled correctly.
It works when I use numpy's ctypeslib and declare the restype
as
import numpy as np
library.getSomeStringW.restype = np.ctypeslib.ndpointer(ctypes.c_uint16, shape=(MaxLen,))
... which confuses me even more.
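Again with memcpy as a stand-in for getSomeStringW (a Unix-only sketch that assumes numpy is installed), the ndpointer variant looks like this:

```python
import ctypes
import numpy as np

# Load libc through the global symbol namespace (Linux/OSX only).
libc = ctypes.CDLL(None)

N = 8
src = (ctypes.c_uint16 * N)(*range(N))
dst = (ctypes.c_uint16 * N)()

# ndpointer declares the return value as a *pointer* to N uint16s;
# numpy then wraps the pointed-to memory in an ndarray, so result
# does mirror the dst contents.
libc.memcpy.restype = np.ctypeslib.ndpointer(ctypes.c_uint16,
                                             shape=(N,))
result = libc.memcpy(ctypes.byref(dst), ctypes.byref(src),
                     ctypes.sizeof(src))

print(result.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
```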
In the case where it does not work, the dest array looks as I would expect:
0000 = {int} 52
0001 = {int} 46
0002 = {int} 49
0003 = {int} 48
0004 = {int} 46
0005 = {int} 50
0006 = {int} 46
0007 = {int} 50
0008 = {int} 56
0009 = {int} 56
0010 = {int} 53
0011 = {int} 0
0012 = {int} 0
...
1000 = {int} 0
while the result array looks something like this:
0000 = {int} 53920
0001 = {int} 25850
0002 = {int} 321
0003 = {int} 0
0004 = {int} 63968
0005 = {int} 35350
0006 = {int} 32764
0007 = {int} 0
0008 = {int} 52816
0009 = {int} 47565
0010 = {int} 32764
0011 = {int} 0
0012 = {int} 39040
0013 = {int} 26425
0014 = {int} 321
0015 = {int} 0
...
Every fourth integer seems to be 0, and the values are all over the place (obviously not sensible UTF-16 code unit values).
Thanks for any ideas you might have!