1

Thanks in advance for any help.

I'm running a numerical optimization in MATLAB to fit the parameters of a time series model to data. To speed up running time I've vectorized as much code as I can but I'm left with the following for-loops that are order(s) of magnitude slower than other lines of code. As a result, the optimizations are impractically slow, especially as I want to compare the performance of many models. The operations involve repeated outer vector products or matrix multiplications on slices of an n-d array (where n is greater than 2).

I've vectorized the below for-loops in various ways such as by reshaping matrices (and replicating dimensions using repmat or bsxfun a la Matrix multiplication of row and column without for loop in matlab) but the vectorized versions run slower than the original for-loops. I've also downloaded several functions including mtimesx (which causes MATLAB to crash) and multiprod (which is slower than the below for-loops).

The first for-loop is:

Vnew = zeros(nS,nS,nC);
for c = 1:nC
    Vnew(:,:,c) = (eye(nS) - K(:,c)*C(c,:))*Vpred;
end

where nC = 20, nS = 40, K is a 40x20 matrix, C is a 20x40 matrix and Vpred is a 40x40 matrix.

The second for-loop is:

Vmerge = zeros(nS,nS);
for c = 1:nC
    Vmerge = Vmerge + (V(:,:,c,t) + err(:,c)*err(:,c)')*cPnew(c);
end

where V(:,:,:,t) is a 40x40x20 matrix, err is a 40x20 matrix and cPnew is a 20x1 vector.

Each of these for-loops is called 540 times per run of the optimizer (I have 540 discrete time points in my time series model).

I've been struggling with this for some time. Solutions I've found online don't seem to be faster than the above for-loops. If anybody can suggest a solution with running time comparisons I'd be so grateful.

Thanks,

Abby

Community
  • 1
  • 1
Jabby
  • 43
  • 1
  • 7
  • hard to help without data and full running code – Mendi Barel May 02 '17 at 08:10
  • Initialise a variable to `eye(nS)` *outside* of the loop as it doesn't change, and you could just compute `diff` once per loop and save a transpose by using the fact `diff(:,c)*diff(:,c)'=sum(diff(:,c).^2)` Otherwise I second what Mendi has said – Wolfie May 02 '17 at 08:28
  • All suggestions we can offer like `bsxfun`, initialize matrices with random junk instead of zeros, rename `diff` to avoid shadowing built-in functions etc... won't help much if you use newer matlab. It is a simple problem of having too much to calculate. It would help to parallelize this thingy if you don't already do that in some outer loop. And a side point - do you perhaps want `.'` instead of `'`? This is a common error. – Zizy Archer May 02 '17 at 08:42
  • Thanks for your responses. Taking eye(nS) out of the loop makes it a little bit faster. diff(:,c)*diff(:,c)' is an outer product not an inner product @Wolfie so I'm not sure that your suggestion is relevant. Using .' instead of ' makes no difference as far as I can tell Zizy Archer ... – Jabby May 02 '17 at 09:07
  • @AbbyJ Then you have real values. `.'` transposes and is the one you should use when you want transpose. `'` does complex conjugate transpose and often ends up being source of errors - code works for real input and breaks for complex :) – Zizy Archer May 02 '17 at 09:25
  • You shouldn't name your variable `diff` as this is an in built function. Not only is that confusing to read, but it's confusing for the compiler so will slow things a bit. My suggestion to use `sum(diff(:,c).^2)` is because it gives the same value as `diff(:,c)*diff(:,c)'` without the transpose. When I suggested it, I also thought it would save a call to the `diff` function, as I said that variable name is confusing! – Wolfie May 02 '17 at 09:48
  • I've changed the name of diff to err now, apologies for the confusion. – Jabby May 02 '17 at 10:09

1 Answers1

0

You can convert matlab code for the sample function Func(x,y) into mex-file:

codegen -config coder.config('lib') Func -args {x, y}

or lib/dll C code:

codegen -config coder.config('mex') Func -args {x, y}

Then you can call mex files just as ordinary matlab functions "Func_mex(x,y)" (they will run much faster) OR browse into C-code and see, what does happen inside your function at C level.

Sairus
  • 384
  • 1
  • 15
  • I don't suppose you'd be kind enough to convert function 'Predict_and_Filter_No_Loop' and 'Generalised_Pseudo_Bayesian_1' to mex files for me could you @Sairus? It seems codegen requires a license for HL coder or MATLAB coder which I either have to pay for or wait to be contacted for a free trial... – Jabby May 02 '17 at 09:34
  • If you upload sample data + functions + example script, I can convert your functions to work with sample data size (for other data sizes you should specify size limits). – Sairus May 02 '17 at 12:59
  • Thank you! The data (ObligData6FlipMPE,ChannelCue_I,ChannelCue_O) + functions (Generalised_Pseudo_Bayesian_1,Predict_and_Filter_No_Loop,Switching_State_Space_Model) + script (StackExchangeCode) can be found here: https://www.dropbox.com/sh/fs90dkhrfayglbx/AACXk1inTlseJ6Dl7ShQZ2zDa?dl=0 – Jabby May 02 '17 at 14:38
  • OK, I will check this in 6 hours. – Sairus May 02 '17 at 15:06
  • Amazing. Thank you! - or maybe you could even send me the codegen function? – Jabby May 02 '17 at 15:35
  • I think it wouldn't work due to licence limitations and dependancies. – Sairus May 02 '17 at 15:58
  • ok no worries. how would you like to send the converted files to me? – Jabby May 02 '17 at 16:00
  • I will post a link here. – Sairus May 02 '17 at 16:21
  • https://drive.google.com/file/d/0B-K62Cw5kQFLTmZhcTFKOU1wX1k/view?usp=sharing Could not convert first function because of an error at `xnew = xpred+K.*e';` - trying to add 20x1 and 20x10 matrices. Also, `Cue` is not defined, I've tried 1,2,3 values for it. And I don't understand, why you initialize and load variables at every optimization iteration? Try using `persistent x; if isempty(x) ... ` pattern and global variables loaded once. – Sairus May 03 '17 at 00:03
  • Thank you so much @Sairus. That is so kind of you! : ) In matlab 2017 `'xnew = xpred+K.*e'` is the same as `xnew = bsxfun(@plus,xpred,K.*e')`, i.e. the singleton dimension is implicitly expanded. Is there any chance you could change this line using bsxfun and convert the file? Apologies that Cue was not included in the code I sent - you're right, it is initialized to 1, 2 or 3 at the beginning of the script. I see what you mean about loading ChannelCue_I/ChannelCue_O at every iteration of the optimization - this is unnecessary. I've never used persistent before. Thanks! – Jabby May 03 '17 at 07:17
  • Does the mex file improve running time for you @Sairus? It won't run for me. I have changed my call from `[cP(:,t),xmerge(:,t),Vmerge(:,:,t),ypredmerge(CueT(t),t)] = Generalised_Pseudo_Bayesian_1(C,P,nC,nS,CueT,cL,t,xfilt,xpred,Vfilt)` to `[cP(:,t),xmerge(:,t),Vmerge(:,:,t),ypredmerge(CueT(t),t)] = Generalised_Pseudo_Bayesian_1_mex(C,P,nC,nS,CueT,cL,t,xfilt,xpred,Vfilt)` but I get an error message saying 'Undefined function or variable 'Generalised_Pseudo_Bayesian_1_mex''. `which Generalised_Pseudo_Bayesian_1_mex` = 'not found', whereas `which Generalised_Pseudo_Bayesian_1_mex.mexw64' = found. – Jabby May 03 '17 at 08:26
  • Maybe you can find somewhere running full version of matlab and check it there? I didn't check time, since it took too long on my PC, but it worked with *_mex function with no errors. – Sairus May 03 '17 at 10:38