I have an optimization problem in MATLAB. Assume I get the following three vectors as input:
- A, of size (1 x N): time series of the signal amplitude
- F, of size (1 x N): time series of the signal instantaneous frequency
- fx, of size (M x 1): frequency axis that I want to match the above onto
Now, the elements of F will usually not match the entries of fx exactly (99% of the time they will not), which is why I have to match each one to the closest frequency.
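For illustration, assuming fx is sorted in ascending order (it is a frequency axis, and it later turns out to be evenly spaced), the nearest-bin lookup can also be written with interp1. This is just a sketch of the matching step, not the method benchmarked below:

% Nearest-neighbour lookup: for every F(i), the index of the closest
% entry of fx (queries outside the axis clamp to the nearest endpoint).
idx = interp1(fx, 1:numel(fx), F, 'nearest', 'extrap');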
Here's the catch: we are talking about big data. N can easily be up to 2 million, and this has to be run hundreds of times on several hundred subjects. My two concerns:
- Time (main concern)
- Memory (production will run on machines with 16+ GB of memory, but development is on a machine with only 8 GB; see the size estimate below)
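For scale: a dense M x N double matrix at M = 201 and N = 2,604,000 is 201 * 2604000 * 8 bytes ≈ 4.2 GB, so even the output alone is a sizeable chunk of the development machine's 8 GB.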
I have these two working solutions. For the following, N = 2604000 and M = 201:
Method 1 (for-loop)
Simple for-loop. Memory is no problem at all, but it is time-consuming. Easiest implementation.
tic;
I = zeros(M,N);                   % pre-allocate output
for i = 1:N
    [~,f] = min(abs(fx-F(i)));    % index of the closest frequency bin
    I(f,i) = A(i).^2;             % store the squared amplitude at that bin
end
toc;
Duration: 18.082 seconds.
Method 2 (vectorized)
The idea is to match the frequency axis against each instantaneous frequency to get the bin index:
          F   [ 0.9  0.2  2.3  1.4 ]      N
        [ 0 ] [  0    1    0    0  ]
   fx   [ 1 ] [  1    0    0    1  ]
        [ 2 ] [  0    0    1    0  ]
                                          M
And then multiply each column by the squared amplitude at that time.
tic;
m_Ff = repmat(F,M,1);             % every row is a copy of F          (M x N)
m_fF = repmat(fx,1,N);            % every column is a copy of fx      (M x N)
[~,idx] = min(abs(m_Ff - m_fF));  % closest bin for every sample      (1 x N)
clearvars m_Ff m_fF;
m_if = repmat(idx,M,1);           % bin index replicated down rows    (M x N)
clearvars idx;
m_fi = repmat((1:M)',1,N);        % row numbers                       (M x N)
I = double(m_if==m_fi);           % indicator matrix                  (M x N)
clearvars m_if m_fi;
I = bsxfun(@times,I,A.^2);        % scale each column by the squared amplitude
toc;
Duration: 64.223 seconds. This surprised me, but it is probably because the huge variables and my limited memory force MATLAB to swap them to disk. I have an SSD, though.
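That would fit the numbers: m_Ff, m_fF and the temporary abs(m_Ff - m_fF) are each 201 x 2,604,000 doubles (≈ 4.2 GB), so Method 2 peaks at well over 12 GB of working set, far beyond the 8 GB available during development.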
The only thing I have not taken advantage of is that the matrices will have many zero elements. I will try and look into sparse matrices.
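That should help a lot here: each column of I holds exactly one nonzero, so a sparse (compressed-column) version needs roughly N values + N row indices + (N+1) column pointers at 8 bytes each, i.e. about 60 MB instead of 4.2 GB.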
I need at least single precision for both the amplitudes and frequencies, but I found that converting from double to single takes a lot of time.
Any suggestions on how to improve?
UPDATE
Following the suggestions, I am now down to a combined time of 2.53 seconds. This takes advantage of the fact that fx is monotonically increasing and evenly spaced (always starting at 0). Here is the code:
tic; df = mode(diff(fx)); toc; % Find fx step size
tic; idx = round(F./df+1); toc; % Convert to bin ids
tic; I = zeros(M,N); toc; % Pre-allocate output
tic; lin_idx = idx + (0:N-1)*M; toc; % Find indices to insert at
tic; I(lin_idx) = A.^2; toc; % Insert
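(For reference, the manual lin_idx computation above is equivalent to sub2ind:)

% Same as idx + (0:N-1)*M: convert (row, column) subscripts
% into linear indices of the M x N matrix I.
lin_idx = sub2ind([M N], idx, 1:N);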
The timing outputs are the following:
Elapsed time is 0.000935 seconds.
Elapsed time is 0.021878 seconds.
Elapsed time is 0.175729 seconds.
Elapsed time is 0.018815 seconds.
Elapsed time is 2.294869 seconds.
Hence the most time-consuming step is now the very final one. Any advice on this is greatly appreciated. Thanks to @Peter and @Divakar for getting me this far.
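Presumably the final assignment is slow because I is a ≈ 4.2 GB dense array and one element is written per column, i.e. every 201*8 = 1608 bytes, so essentially every page of the zero-filled matrix gets touched and the step is memory-bound rather than compute-bound.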
UPDATE 2 (Solution)
Wuhuu! Using sparse(i,j,k) really improves the outcome:
tic; df = fx(2)-fx(1); toc;         % Find fx step size
tic; idx = round(F./df+1); toc;     % Convert to bin ids
tic; I = sparse(idx,1:N,A.^2); toc; % Build the sparse output directly
With timings:
Elapsed time is 0.000006 seconds.
Elapsed time is 0.016213 seconds.
Elapsed time is 0.114768 seconds.
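One caveat: sparse(i,j,k) infers the matrix size from the largest indices, so if the top frequency bin never occurs in F, I ends up with fewer than M rows. Passing the size explicitly avoids that:

% Force the output to be exactly M x N even if max(idx) < M.
I = sparse(idx, 1:N, A.^2, M, N);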