1

I have a MATLAB code which is

%% Inputs are theta and h (size NxM)
alpha=zeros(N,M);
h_tmp=zeros(N,M);
h_tmp(1:N-1,:)=h(2:N ,:);
for i=1:N
    alpha(i,:)=theta.*(h_tmp(i,:)+h(i,:));
end

By using vectorized method, the above code can be

alpha = theta .* [h(1:N-1,:) + h(2:N,:); h(N,:)];

To speed up the code, I want to rewrite it in MEX file using C++. The main different between MATLAB and C++ in 2D array is row-major order (MATLAB) and column-major order(C++)

double  *h, *alpha, *h_temp;
int N,M;
double theta;    
N      = (int) mxGetN(prhs[0]); //cols
M      = (int) mxGetM(prhs[0]); //rows
h      = (double *)mxGetData(prhs[0]);
theta  = (double)*mxGetPr(prhs[1]);
/* Initial zeros matrix*/
plhs[0]   = mxCreateDoubleMatrix(M, N, mxREAL);  alpha = mxGetPr(plhs[0]);
//////////////Compute alpha/////////    
for (int rows=0; rows < M; rows++) {
    //h[N*rows+cols] is h_tmp
    for (int cols=0; cols < N; cols++) {        
         alpha[N*rows+cols]=theta*(h[N*rows+cols+1]+h[N*rows+cols]);
    }
}

Are my Mex code and MATLAB code equivalent? If not, could you help me to fix it?

Jame
  • 3,746
  • 6
  • 52
  • 101
  • 3
    It's not `rows+rows*N`, is it? You have to have a column loop and multiply the number of rows with the column index, if I understand your code correctly. It should be something like `alpha[N*rows+col]` where `col` is a counter for a second, inner loop... – kkuilla Mar 29 '16 at 08:41
  • How about represent h and h_tmp? Is it correct. I will correct it now and let check again – Jame Mar 29 '16 at 09:06
  • Further the condition in the second for-loop MUST NOT be `rows < N`, or you will loop inifinitly (as neither rows nor N change during execution). Additionally have a look at [this link to the mathworks forums](http://de.mathworks.com/matlabcentral/answers/95958-which-matlab-functions-benefit-from-multithreaded-computation), I think what you try to do will be slower in C++, as Matlab might already be multithreading your code. –  Mar 29 '16 at 09:34
  • Be aware that MATLAB vectorization tools are already in C. You most probably wont speed up a for loop with 3 basic math operations in mex files. It will be , at most, the same speed – Ander Biguri Mar 29 '16 at 09:39
  • @FirefoxMetzger: It was typo. I corrected it. Ander Biguri: Because the above method is called many time (~6000 times). Thus, a small improved speed will be useful – Jame Mar 29 '16 at 11:14
  • @user8430 what I am saying is: it will be useful, but you may not be able to improve it. If you are doing more complicated stuff, you may increase the speed, but I doubt the speed will be increased for these operations. MATrix LABoratory is quite fast when dealing with vectors/matrices – Ander Biguri Mar 29 '16 at 11:35
  • please don't double-post the same question: http://stackoverflow.com/q/36249260/97160. I've marked that one as duplicate on this since there's more discussion here... – Amro Mar 29 '16 at 14:17
  • @Amro: The vectorizied of alpha can be h_tmp=zeros(N,M); h_tmp(1:N-1,:)=h(2:N,:); alpha=1+theta.*(h_tmp+h);. It is more faster than your way – Jame Mar 30 '16 at 00:56

1 Answers1

1

Besides the corrections from the comments to your question, there is one minor difference. What is missing is that you skip h(N,:) in the Matlab code, where in the C code iteration over the code is done until cols < N, which (due to 0 indexing in C) will also process the last element in each column.

#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    double  *h, *alpha, *h_temp;
    int num_columns, num_rows;
    double theta;    
    num_columns = (int) mxGetN(prhs[0]); //cols
    num_rows    = (int) mxGetM(prhs[0]); //rows
    h           = (double *)mxGetData(prhs[0]);
    theta       = (double)*mxGetPr(prhs[1]);
    /* Initial zeros matrix*/
    plhs[0]   = mxCreateDoubleMatrix(num_rows, num_columns, mxREAL);  alpha = mxGetPr(plhs[0]);
    //////////////Compute alpha/////////
    // there are num_rows many elements in each column
    // and num_columns many rows. Matlab stores column first.
    // h[0] ... h[num_rows-1] == h(:,1)
    int idx; // to help make code cleaner
    for (int column_idx=0; column_idx < num_columns; column_idx++) {
        //iterate over each column
        for (int row_idx=0; row_idx < num_rows-1; row_idx++) {// exclude h(end,row_idx)
            //for each row in a column do
            idx = num_columns * column_idx + row_idx;
            alpha[idx]= theta * (h[idx+1] + h[idx]);
        }
    }
    //the last column wasn't modified and set to 0 upon initialization.
    //set it now
    for(int rows = 0; rows < num_rows; rows++) {
        alpha[num_columns*rows+(num_rows-1)] = theta * h[num_columns*rows+(num_rows-1)];
    }
}

Note that I decided to rename some of the variables, so that I think it becomes easier to read.

Edit: Removed the suggestion with prhs[0] = plhs[0], as suggested in the comments to this answer. One may get away with this under certain circumstances, but in general it is no good practice when coding matlab .mex functions and it may crash Matlab.

  • 1
    @user8430 that is intended. I am not allocating any space for `plhs` nor am I creating a pointer named `alpha`. I am overwriting what ever was stored in `h` while iterating over `h`. This way it saves allocation of another matrix and with that computation time. Further `h` and `alpha` will point to the same chunk of memory. This may or may not be suitable for you application, but is one way to optimize past capabilities of the matlab compiler. –  Mar 29 '16 at 13:41
  • So, Are you sure that h(2:N,:) equivalent with h[N*rows+cols+1]? – Jame Mar 29 '16 at 13:41
  • @user8430 No, they are not equivalent. `h[N*rows+cols+1]` = h(cols+1,rows) is the double stored 1 memory block (64 bit) behind `h[N*rows+cols]` = h(cols,rows). This is what you want for all elements besides the elements located at h(N,:) = `h(N*rows+N)`, because h(N+1,:) does not make sense. In C however it is still valid syntax but, as described in my answer, doesn't give the desired outcome. –  Mar 29 '16 at 13:49
  • And the reason to use last unmodified h(N,:) to maintain the size NxM of alpha. if I use as your code, it has only N-1xM – Jame Mar 29 '16 at 13:49
  • In original code, the MATLAB code is alpha(i,:)=theta.*(h_tmp(i,:)+h(i,:)); , inwhich h_tmp(1:N-1,:)=h(2:N ,:); Hence, I think the C++ code need to be consistency – Jame Mar 29 '16 at 13:52
  • @user8430 yes, thats why I said you will likely have to re-factor your code to get it to work, which again may or may not be possible given your task. instead you can see if `alpha = theta * (2*h(1:N-1,:) + [diff(h); zeros(1,M)]);` performs any better, but there you still have the 'problem' of concatenating matricies –  Mar 29 '16 at 13:53
  • So, I cannot accept you answer because it did not solve my issue. – Jame Mar 29 '16 at 13:54
  • 3
    Setting plhs[0] to be equal to prhs[0] is not a good idea. Not only are you modifying the input variable (which screws up MATLAB's copy-on-write optimizations), you're telling MATLAB that it has a new output variable to manage, but really it's exactly the same memory as the input. I would expect a crash when clearing or overwriting both of those variables. – Peter Mar 29 '16 at 13:59
  • @Peter does Matlab actually care about what happens within a .mex file? I mean it can't do any of its matlab magic optimizations there, so I would expect it to just hand the code to a C compiler and run what ever it returns. You do have a point with clearing the variables, will try that later. –  Mar 29 '16 at 14:03
  • @FirefoxMetzger: I tested and saw that alpha = theta * [h(1:N-1,:) + diff(h); zeros(1,M)]; or alpha = theta * (2*h(1:N-1,:) + [diff(h); zeros(1,M)]) are wrong. – Jame Mar 29 '16 at 14:13
  • 1
    It *does* care about the state of the stuff it passes to and receives from the mex file. That's what the defined interface is all about. Try the clearing, and also try making a simple copy of an input variable before calling a mex function that modifies an input. I.e. `h2 = h; evil_mex_function(h); disp(h); disp(h2);` Spoiler: h2 will have the same changes you made to h. See for example: http://stackoverflow.com/q/9845097/1401351 – Peter Mar 29 '16 at 14:21
  • Peter is correct, that is dangerous and can easily crash MATLAB! See http://stackoverflow.com/a/19813844/97160 (read the comments). – Amro Mar 29 '16 at 14:34
  • @user8430 Try and see what linear indices you get then. Obtain the indices for e.g. `alpha(i,:)` using [`sub2ind`](https://uk.mathworks.com/help/matlab/ref/sub2ind.html). `alpha(1)` in Matlab should then be equal to `alpha[0]` in the mex file etc. That should give you hints as to where you go wrong and you can work yourself through it to make sure you are multiplying the correct numbers. Please remember that Matlab indexing is column-major; Down and then across... – kkuilla Mar 30 '16 at 08:23
  • 1
    @user8430 yes, the last row is off then. I thought you were just interested in the dimension, not in the actual content, to fix that you could do `[2*h(1:end-1,:)+diff(h);h(end,:)]`. Anyways, I've updated my answer and removed any further suggestions. Additionally I put a working cpp example of what you want to do. I've tested it on various matricies in R2015a and got the exact same result for the matlab code you provided, my cpp code and the line in this comment. If this still isn't what you are looking for, somebody else might be able to help. –  Mar 30 '16 at 13:31
  • Thank FirefoxMetzger. I will check it. However, the last rows of alpha(MATLAB) equal theta*h(N,:). It is only depends on h. – Jame Mar 30 '16 at 14:01
  • @user8430 fixed, I did forget a theta in the last for loop. –  Mar 31 '16 at 08:15