
I have run into a tricky problem that I cannot seem to fix easily. In short, I would like a MEX file to return an array that was passed to it as an input. You could trivially do this:

void mexFunction(int nargout, mxArray *pargout [ ], int nargin, const mxArray *pargin[])
{
   pargout[0] = pargin[0];
}

But this is not what I need. I would like to get the raw pointer from pargin[0], process it internally, and return a freshly created mxArray by setting the corresponding data pointer. Like this:

#include <mex.h>

void mexFunction(int nargout, mxArray *pargout [ ], int nargin, const mxArray *pargin[])
{
  mxArray *outp;
  double *data;
  int m, n;

  /* get input array */
  data = mxGetData(pargin[0]);
  m = mxGetM(pargin[0]);
  n = mxGetN(pargin[0]);

  /* copy pointer to output array */
  outp = mxCreateNumericMatrix(0,0,mxDOUBLE_CLASS,mxREAL);
  mxSetM(outp, m);
  mxSetN(outp, n);
  mxSetData(outp, data);
  /* segfaults with or without the below line */
  mexMakeMemoryPersistent(data);
  pargout[0] = outp;
}

It doesn't work. I get a segfault, if not immediately, then after a few calls. As far as I can tell, the documentation says nothing about this scenario. The only requirement is that the data pointer has been allocated using mxCalloc, which it obviously has. Hence, I would assume this code is legal.

I need to do this, because I am parsing a complicated MATLAB structure into my internal C data structures. I process the data, some of the data gets re-allocated, some doesn't. I would like to transparently return the output structure, without thinking when I have to simply copy an mxArray (first code snippet), and when I actually have to create it.

Please help!

EDIT

After further looking and discussing with Amro, it seems that even my first code snippet is unsupported and can cause MATLAB crashes in certain situations, e.g., when passing structure fields or cell elements to such mex function:

>> a.field = [1 2 3];
>> b = pargin_to_pargout(a.field);   % ok - works and assigns [1 2 3] to b
>> pargin_to_pargout(a.field);       % bad - segfault

It seems I will have to go down the 'undocumented MATLAB' road and use mxCreateSharedDataCopy and mxUnshareArray.

angainor
  • Why does `pargout[0] = pargin[0];` not do what you want? If that seems too unsupported, then does `mxCreateSharedDataCopy` do what you need by sharing the same data pointer, but handling reference counts so MATLAB won't end up crashing? – chappjc Nov 06 '13 at 18:21
  • an interesting approach using shared data. Would you be able to post the detailed solution as a new answer? – Shai Nov 07 '13 at 09:56
  • @Shai: [`typecastx`](http://www.mathworks.com/matlabcentral/fileexchange/17476-typecast-and-typecastx-c-mex-functions) submission by James Tursa is an excellent example of how to do this; it basically calls: `plhs[0] = mxCreateSharedDataCopy(prhs[0]);` instead of the `mxDuplicateArray` I wrote in my answer – Amro Nov 07 '13 at 10:04
  • @Shai For `mxCreateSharedDataCopy`, [here is a thorough, but dated example](http://www.mk.tu-berlin.de/Members/Benjamin/mex_sharedArrays) and [here is a quick one from James Tursa](http://www.mathworks.com/matlabcentral/answers/77048). See also [this nice list](http://www.mathworks.com/matlabcentral/answers/79046-mex-api-wish-list) of (semi)undocumented MEX API functions. – chappjc Nov 08 '13 at 00:39
  • I think `pargin_to_pargout(a.field);` segfaults because `nargout` is 0 and `pargout[0]` is invalid. I added a discussion about `mxCreateSharedDataCopy` to an answer. Maybe it will be useful. – chappjc Nov 08 '13 at 02:57
  • @chappjc No, it is legal to do so: http://www.mathworks.se/help/matlab/matlab_external/c-c-source-mex-files.html. "Note: It is possible to return an output value even if nlhs = 0. This corresponds to returning the result in the ans variable." – angainor Nov 08 '13 at 07:09
  • You are right. That was just a guess since a similar usage and that kind of assignment isn't crashing my machine. Always checking inputs and outputs, I guess I forgot you could avoid that check. BTW, I know my answer doesn't necessarily help with your case with structs, but I thought I'd type up the tests I was doing, esp. since Shai asked. – chappjc Nov 08 '13 at 08:04

2 Answers


You should use mxDuplicateArray; that's the documented way:

#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    plhs[0] = mxDuplicateArray(prhs[0]);
}
Amro
  • I do not want to duplicate the array. I want to create a new array, which holds the same physical memory pointer as the first one. `mxDuplicateArray` will allocate new memory and copy the data. – angainor Nov 06 '13 at 13:52
  • that would be dangerous and could cause MATLAB to crash. there are undocumented ways to share data though, see this post: http://stackoverflow.com/a/18849127/97160 – Amro Nov 06 '13 at 13:54
  • It does crash. But does not really need to, if there would be some mex function that did that safely. Or, if matlab would internally, automatically compare the pointers, which apparently it does not do. Thanks for the post, it looks promising. – angainor Nov 06 '13 at 13:59
  • the reason it crashes is that the two `mxArray` structures point to the same data in memory, but MATLAB does not know that the two variables are "linked"; so later when one is freed, the other is left in an inconsistent state with a dangling pointer... Internally MATLAB tracks this kind of thing using undocumented flags in the `mxArray_tag` structure (think reference counting). This data sharing behavior is seen when you write `A = B;` for regular MATLAB matrices. See the links posted in the comments of that post for more info. – Amro Nov 06 '13 at 14:12
  • From your answer I get the impression that `plhs[0] = prhs[0]` is not legal either. Why did you use `mxDuplicateArray`? – angainor Nov 06 '13 at 15:59
  • actually `plhs[0] = prhs[0]` is legal, although this case is not explicitly documented. The output array is a separate `mxArray` that shares its data with the input array, all the while being correctly handled when one of them is changed/deleted on the MATLAB side, unlike the `mxSetData` case. I don't usually recommend it unless you know what you are doing :) plus it is somewhat confusing since it breaks the constant-ness of the input `prhs`. See this discussion: http://www.mathworks.com/matlabcentral/answers/77048 – Amro Nov 06 '13 at 16:21
  • I guess it all boils down to the never ending discussion about clean in-place operations on matlab arrays in mex files.. Thanks for good references! – angainor Nov 06 '13 at 19:04
  • Curiously, I have found a reliable crash when using `plhs[0] = prhs[0]` in some cases. I have posted a reply to James Tursa answer at http://www.mathworks.com/matlabcentral/answers/77048. So it seems that even this is incorrect :( Absolutely no (legal) way around copying the memory there and back. – angainor Nov 06 '13 at 21:17
  • @angainor: I see, thanks for the correction. As James explained, there are several types of `mxArray` variables (indicated by a flag inside the `mxArray` structure): it can be a normal variable (for example when function is called as `func(x)` using an existing variable `x`), a sub-element (`func(x.a)` or `func(x{1})` using a struct field or a cell), a temporary (`func(x+y)` using a result of an expression), or one of few other types (global, persistent, etc..). Apparently those are treated differently in the case discussed above. – Amro Nov 07 '13 at 09:23
  • ... refer to the [`typecastx.c`](http://www.mathworks.com/matlabcentral/fileexchange/17476-typecast-and-typecastx-c-mex-functions) submission by James Tursa on FEX to see the latest description of the `mxArray` internal structure. I guess you'll either have to use the documented but inefficient way of creating a deep copy using `mxDuplicateArray`, or use the unsupported function `mxCreateSharedDataCopy` (MathWorks actually mentioned it in a [solution page](http://www.mathworks.com/support/solutions/en/data/1-6NU359/index.html)) – Amro Nov 07 '13 at 09:23
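
The dangling-pointer explanation in the comments above can be illustrated with a toy model in plain C. This is only a sketch under loud assumptions: `ToyBuffer`, `ToyHeader` and the `toy_*` functions are hypothetical names invented for illustration, and the real `mxArray` internals are undocumented and more involved. It contrasts an untracked alias (roughly what `mxSetData` with a borrowed pointer amounts to) with a tracked shared copy (roughly the idea behind `mxCreateSharedDataCopy`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy model, NOT MATLAB's real mxArray layout. */
typedef struct {
    double *data;  /* payload; set to NULL once it has been released */
    int     refs;  /* number of headers registered as owners */
} ToyBuffer;

typedef struct {
    ToyBuffer *buf;  /* a header only points at a (possibly shared) payload */
} ToyHeader;

/* Allocate a zeroed payload of n doubles and return the header owning it. */
ToyHeader toy_create(ToyBuffer *buf, size_t n) {
    assert(buf != NULL && n > 0);
    buf->data = calloc(n, sizeof(double));
    buf->refs = 1;
    ToyHeader h = { buf };
    return h;
}

/* Second header on the same payload, but the owner count is NOT updated. */
ToyHeader toy_alias_untracked(ToyHeader src) {
    ToyHeader h = { src.buf };
    return h;
}

/* Second header on the same payload, registered as an additional owner. */
ToyHeader toy_share(ToyHeader src) {
    src.buf->refs++;
    ToyHeader h = { src.buf };
    return h;
}

/* Drop one header; the payload is freed when the last registered owner goes. */
void toy_destroy(ToyHeader *h) {
    if (h->buf != NULL && h->buf->data != NULL && --h->buf->refs == 0) {
        free(h->buf->data);
        h->buf->data = NULL;
    }
    h->buf = NULL;
}

/* A header is usable only while its payload still exists. */
int toy_is_valid(ToyHeader h) {
    return h.buf != NULL && h.buf->data != NULL;
}
```

In this model, destroying the original header after `toy_alias_untracked` leaves the alias pointing at released data, while a header obtained from `toy_share` stays valid until it is destroyed itself. The toy marks released payloads with NULL so the failure can be observed safely; real code would simply read freed memory.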

While undocumented, the MEX API function mxCreateSharedDataCopy was given as a solution by MathWorks (now apparently disavowed) for creating a shared-data copy of an mxArray. MathWorks even provided an example in their solution, mxsharedcopy.c.

As described in that removed MathWorks Solution (1-6NU359), the function can be used to clone the mxArray header. However, the difference between doing plhs[0] = prhs[0]; and plhs[0] = mxCreateSharedDataCopy(prhs[0]); is that the first version just copies the mxArray* (a pointer) and hence does not create a new mxArray container (at least not until the mexFunction returns and MATLAB works its magic); creating a new container would increment the data's reference count in both mxArrays.

Why might this be a problem? If you use plhs[0] = prhs[0]; and make no further modification to plhs[0] before returning from mexFunction, all is well and you will have a shared data copy thanks to MATLAB. However, if after the above assignment you modify plhs[0] in the MEX function, the change will be seen in prhs[0] as well, since both refer to the same data buffer. On the other hand, when explicitly generating a shared copy (with mxCreateSharedDataCopy) there are two different mxArray objects, and a change to one array's data will trigger a copy operation resulting in two completely independent arrays. Also, direct assignment can cause segmentation faults in some cases.
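
The "change triggers a copy" behaviour can be sketched as a small copy-on-write model in plain C. Everything here is an assumption made for illustration: `CowBuffer`, `CowArray` and the `cow_*` functions are hypothetical names, not part of the MEX API, and MATLAB's real bookkeeping is undocumented. Note also that in this toy the copy happens inside `cow_write`, whereas a raw write through `mxGetPr` inside a MEX function does not unshare anything by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model of shared data with copy-on-write. */
typedef struct {
    double *data;
    size_t  n;
    int     refs;  /* number of CowArray headers sharing this buffer */
} CowBuffer;

typedef struct {
    CowBuffer *buf;
} CowArray;

/* Allocate an uninitialised buffer of n doubles with a single owner. */
static CowBuffer *cow_alloc(size_t n) {
    CowBuffer *b = malloc(sizeof *b);
    if (b == NULL) abort();
    b->data = malloc(n * sizeof *b->data);
    if (b->data == NULL && n > 0) abort();
    b->n = n;
    b->refs = 1;
    return b;
}

/* New array of n elements, all set to fill. */
CowArray cow_create(size_t n, double fill) {
    CowArray a = { cow_alloc(n) };
    for (size_t i = 0; i < n; i++) a.buf->data[i] = fill;
    return a;
}

/* Shared copy: new header, same buffer, owner count incremented. */
CowArray cow_share(CowArray a) {
    a.buf->refs++;
    return a;
}

double cow_read(CowArray a, size_t i) {
    assert(i < a.buf->n);
    return a.buf->data[i];
}

/* Write one element; unshare first if another header uses the buffer. */
void cow_write(CowArray *a, size_t i, double v) {
    assert(i < a->buf->n);
    if (a->buf->refs > 1) {
        CowBuffer *own = cow_alloc(a->buf->n);
        memcpy(own->data, a->buf->data, own->n * sizeof *own->data);
        a->buf->refs--;  /* leave the old buffer to the other owners */
        a->buf = own;
    }
    a->buf->data[i] = v;
}

/* Drop one header; the buffer is freed with its last owner. */
void cow_destroy(CowArray *a) {
    if (a->buf != NULL && --a->buf->refs == 0) {
        free(a->buf->data);
        free(a->buf);
    }
    a->buf = NULL;
}
```

After `cow_share`, both headers point at one buffer; the first `cow_write` through either header gives that header a private copy, so the other one keeps seeing the old values. A plain pointer copy, by contrast, would be a single header under two names, and a write would be visible through both.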

Modified MathWorks Example

Start with an example using a modified mxsharedcopy.c from the MathWorks solution referenced above. The first important step is to provide the prototype for the mxCreateSharedDataCopy function:

/* Add this declaration because it does not exist in the "mex.h" header */
extern mxArray *mxCreateSharedDataCopy(const mxArray *pr);

As the comment states, this is not in mex.h, so you have to declare this yourself.

The next part of the mxsharedcopy.c creates new mxArrays in the following ways:

  1. A deep copy via mxDuplicateArray:

    copy1 = mxDuplicateArray(prhs[0]);
    
  2. A shared copy via mxCreateSharedDataCopy:

    copy2 = mxCreateSharedDataCopy(copy1);
    
  3. Direct copy of the mxArray*, added by me:

    copy0 = prhs[0]; // OK, but don't modify copy0 inside mexFunction!
    

Then it prints the address of the data buffer (pr) for each mxArray and their first values. Here is the output of the modified mxsharedcopy(x) for x=ones(1e3);:

prhs[0] = 72145590, mxGetPr = 18F90060, value = 1.000000
copy0   = 72145590, mxGetPr = 18F90060, value = 1.000000
copy1   = 721BF120, mxGetPr = 19740060, value = 1.000000
copy2   = 721BD4B0, mxGetPr = 19740060, value = 1.000000

What happened:

  1. As expected, comparing prhs[0] and copy0 we have not created anything new except another pointer to the same mxArray.
  2. Comparing prhs[0] and copy1, notice that mxDuplicateArray created a new mxArray at address 721BF120, and copied the data into a new buffer at 19740060.
  3. copy2 has a different address (mxArray*) from copy1, meaning it is also a different mxArray not just the same one pointed to by different variables, but they both share the same data at address 19740060.

The question reduces to: Is it safe to return in plhs[0] either of copy0 or copy2 (from simple pointer copy or mxCreateSharedDataCopy, respectively) or is it necessary to use mxDuplicateArray, which actually copies the data? We can show that mxCreateSharedDataCopy would work by destroying copy1 and verifying that copy2 is still valid:

mxDestroyArray(copy1);
copy2val0 = *mxGetPr(copy2); /* no crash! */

Applying Shared-Data Copy to Input

Back to the question. Take this a step further than the MathWorks example and return a shared-data copy of the input. Just do:

if (nlhs>0) plhs[0] = mxCreateSharedDataCopy(prhs[0]);

Hold your breath!

>> format debug
>> x=ones(1,2)
x =

Structure address = 9aff820     % mxArray*
m = 1
n = 2
pr = 2bcc8500                   % double*
pi = 0
     1     1
>> xDup = mxsharedcopy(x)
xDup =

Structure address = 9afe2b0     % mxArray* (different)
m = 1
n = 2
pr = 2bcc8500                   % double* (same)
pi = 0
     1     1
>> clear x
>> xDup % hold your breath!
xDup =

Structure address = 9afe2b0 
m = 1
n = 2
pr = 2bcc8500                    % double* (still same!)
pi = 0
     1     1

Now for a temporary input (without format debug):

>> tempDup = mxsharedcopy(2*ones(1e3));
>> tempDup(1)
ans =
     2

Interestingly, if I test without mxCreateSharedDataCopy (i.e. with just plhs[0] = prhs[0];), MATLAB doesn't crash but the output variable never materializes:

>> tempDup = mxsharedcopy(2*ones(1e3)) % no semi-colon
>> whos tempDup
>> tempDup(1)
Undefined function 'tempDup' for input arguments of type 'double'.

R2013b, Windows, 64-bit.

mxsharedcopy.cpp (modified C++ version):

#include "mex.h"

/* Add this declaration because it does not exist in the "mex.h" header */
extern "C" mxArray *mxCreateSharedDataCopy(const mxArray *pr);
bool mxUnshareArray(const mxArray *pr, const bool noDeepCopy); // true if not successful

void mexFunction(int nlhs,mxArray *plhs[],int nrhs,const mxArray *prhs[])
{
    mxArray *copy1(NULL), *copy2(NULL), *copy0(NULL);

    //(void) plhs; /* Unused parameter */

    /* Check for proper number of input and output arguments */
    if (nrhs != 1)
        mexErrMsgTxt("One input argument required.");
    if (nlhs > 1)
        mexErrMsgTxt("Too many output arguments.");

    copy0 = const_cast<mxArray*>(prhs[0]); // ADDED

    /* First make a regular deep copy of the input array */
    copy1 = mxDuplicateArray(prhs[0]);

    /* Then make a shared copy of the new array */
    copy2 = mxCreateSharedDataCopy(copy1);

    /* Print some information about the arrays */
    //     mexPrintf("Created shared data copy, and regular deep copy\n");
    /* %p is the portable conversion for pointers; %X expects an unsigned int */
    mexPrintf("prhs[0] = %p, mxGetPr = %p, value = %lf\n",(const void*)prhs[0],(void*)mxGetPr(prhs[0]),*mxGetPr(prhs[0]));
    mexPrintf("copy0   = %p, mxGetPr = %p, value = %lf\n",(void*)copy0,(void*)mxGetPr(copy0),*mxGetPr(copy0));
    mexPrintf("copy1   = %p, mxGetPr = %p, value = %lf\n",(void*)copy1,(void*)mxGetPr(copy1),*mxGetPr(copy1));
    mexPrintf("copy2   = %p, mxGetPr = %p, value = %lf\n",(void*)copy2,(void*)mxGetPr(copy2),*mxGetPr(copy2));

    /* TEST: Destroy the first copy */
    //mxDestroyArray(copy1);
    //copy1 = NULL;
    //mexPrintf("\nFreed copy1\n");
    /* RESULT: copy2 will still be valid */
    //mexPrintf("copy2 = %p, mxGetPr = %p, value = %lf\n",(void*)copy2,(void*)mxGetPr(copy2),*mxGetPr(copy2));

    if (nlhs>0) plhs[0] = mxCreateSharedDataCopy(prhs[0]);
    //if (nlhs>0) plhs[0] = const_cast<mxArray*>(prhs[0]);
}
chappjc
  • Hi chappjc, good answer - but sad news from me: I just played around with Matlab R2014A Pre - and found that this fine undocumented `mxCreateSharedDataCopy` and others are gone. So sad. – Bastian Ebeling Dec 17 '13 at 08:56
  • @BastianEbeling I see that in R2014a `mxCreateSharedDataCopy` is now exported as a decorated C++ function (mangling is compiler dependent); the undecorated name being `struct mxArray_tag * matrix::detail::noninlined::mx_array_api::mxCreateSharedDataCopy(struct mxArray_tag const *)`. It seems MathWorks has done some code refactoring involving organizing the MX API functions into various namespaces. Anyway, it still works for me, although note that `mex` now has different configurations for C and C++. Do you get a linker error? – chappjc Dec 31 '13 at 00:35
  • Happy new year chappjc! I found that C++ mangled one, too - but for me there is no chance to run that function within plain c-mex. Yes: resulting in linker error. I do not want to rewrite my c-mex-code to c++-mex: Do you know, what I mean? – Bastian Ebeling Jan 02 '14 at 07:11
  • Maybe more details would be clearer: Compiling within windoze environment results in: `Building with 'Microsoft Windows SDK 7.1 (C)'. Error using mex Creating library double2byte.lib and object double2byte.exp double2byte.obj : error LNK2019: unresolved external symbol mxCreateSharedDataCopy referenced in function mexFunction double2byte.mexw64 : fatal error LNK1120: 1 unresolved externals` – Bastian Ebeling Jan 02 '14 at 12:57
  • @BastianEbeling: You had it working before R2014a? When I compiled in R2014a, I did not have to make any changes, using the same prototype in the post. However, you may need to run `mex` with the switch to indicate C++ code, no rewrite should be needed, but the linking should work. – chappjc Jan 02 '14 at 17:01
  • For anyone interested, a solution for R2014a is available [here](http://undocumentedmatlab.com/blog/serializing-deserializing-matlab-data/#MEX), courtesy of Bastian Ebeling – Yair Altman Jan 26 '14 at 07:47
  • Another update to all interested: The C functions that were removed in the pre-release of R2014a were [added back in the final release](http://undocumentedmatlab.com/blog/serializing-deserializing-matlab-data/#addendum). No need to use the more complicated C++ interfaces for these functions. – chappjc Jul 07 '14 at 18:43