6
  • z - matrix of doubles, size Nx2;
  • x - matrix of doubles, size Nx2;

sup = x(i, :);

phi(1, i) = {@(z) exp(-g * sum((z - sup(ones([size(z, 1) 1]),:)) .^ 2, 2))};

this is a Radial Basis Function (RBF) for logistic regression. Here is the formula:

enter image description here

I need your advice, can i optimize this formula? coz it calls millions times, and it takes a lot of time...

Amro
  • 123,847
  • 25
  • 243
  • 454
Yekver
  • 4,985
  • 7
  • 32
  • 49
  • You're missing a comma in the expression, in the `repmat` call. – reve_etrange Aug 08 '11 at 23:11
  • 1
    @Yekver Your formula has syntax errors. I've marked them with a `$`. `x(i,:)$(ones([size(z,1) 1]),$:))`. It also helps if you tell us how big `N` is. – abcd Aug 08 '11 at 23:20
  • @yoda This formula forks fine for me. I use additional variable `sup = x(i, :);`. N can be `[100..5000]` – Yekver Aug 08 '11 at 23:34
  • @yoda: Actually, the dimensions do match now (it will really be `Nx2` since sup is), but i agree a formula would be clearer. – Mikael Öhman Aug 08 '11 at 23:56
  • @MikaelÖhman you're right about `sup`. I misread it as `x(:,i)`. I went back and looked at his first version and that was the clearest. I wonder why he kept trying to modify the expression. – abcd Aug 08 '11 at 23:59
  • @yoda I trying to modify the expression because it takes too much time to evaluate it. logistic regression with linear basis func works less than a second for 500 objects, instead of rbf, that takes more than 1 sec for each object! – Yekver Aug 09 '11 at 00:13
  • @Yekver is `x_j` a component of `x` or something totally different? – abcd Aug 09 '11 at 01:25

2 Answers2

6

It seems in your recent edits, you introduced some syntax errors, but I think I understood what you were trying to do (from the first version).

Instead of using REPMAT or indexing to repeat the vector x(i,:) to match the rows of z, consider using the efficient BSXFUN function:

rbf(:,i) = exp( -g .* sum(bsxfun(@minus,z,x(i,:)).^2,2) );

The above obviously loops over every row of x


You can go one step further, and use the PDIST2 to compute the euclidean distance between every pair of rows in z and x:

%# some random data
X = rand(10,2);
Z = rand(10,2);
g = 0.5;

%# one-line solution
rbf = exp(-g .* pdist2(Z,X,'euclidean').^2);

Now every value in the matrix: rbf(i,j) corresponds to the function value between z(i,:) and x(j,:)


EDIT:

I timed the different methods, here is the code I used:

%# some random data
N = 5000;
X = rand(N,2);
Z = rand(N,2);
g = 0.5;

%# PDIST2
tic
rbf1 = exp(-g .* pdist2(Z,X,'euclidean').^2);
toc

%# BSXFUN+loop
tic
rbf2 = zeros(N,N);
for j=1:N
    rbf2(:,j) = exp( -g .* sum(bsxfun(@minus,Z,X(j,:)).^2,2) );
end
toc

%# REPMAT+loop
tic
rbf3 = zeros(N,N);
for j=1:N
    rbf3(:,j) = exp( -g .* sum((Z-repmat(X(j,:),[N 1])).^2,2) );
end
toc

%# check if results are equal
all( abs(rbf1(:)-rbf2(:)) < 1e-15 )
all( abs(rbf2(:)-rbf3(:)) < 1e-15 )

The results:

Elapsed time is 2.108313 seconds.     # PDIST2
Elapsed time is 1.975865 seconds.     # BSXFUN
Elapsed time is 2.706201 seconds.     # REPMAT
Amro
  • 123,847
  • 25
  • 243
  • 454
  • @Yekver: I still don't understand what you are trying to do; posting snippets of code with little explanation is bound to confuse readers... please edit your question and describe the problem more clearly – Amro Aug 09 '11 at 01:08
  • Amro, any particular reason that complete bsxfun wasn't tested ? It is a bit tricky but is achievable. – Pavan Yalamanchili Aug 09 '11 at 03:44
  • 1
    @Yekver, In your implementation each cell will contain N elements, and since there are 1xN cell arrays, the output Amro is getting is the same as yours, except he is using a matrix representation instead of a cell. – Pavan Yalamanchili Aug 09 '11 at 03:58
3

Amro has mentioned some really good methods. But the bsxfun can be further exploited by reshaping one of the matrices.

>> type r.m

N = 5000;
X = rand(N,2);
Z = rand(N,2);
g = 0.5;

%BSXFUN+loop
tic
rbf2 = zeros(N,N);
for j=1:N
    rbf2(:,j) = exp( -g .* sum(bsxfun(@minus,Z,X(j,:)).^2,2) );
end
toc

tic
diffs = bsxfun(@minus, reshape(X', [1, 2, N]), Z);
dist = reshape(sum(diffs.^2, 2), [N, N]);
rbf3 = exp(-g .* dist);
toc

>> r
Elapsed time is 2.235527 seconds.
Elapsed time is 0.877833 seconds.
>> r
Elapsed time is 2.253943 seconds.
Elapsed time is 1.047295 seconds.
>> r
Elapsed time is 2.234132 seconds.
Elapsed time is 0.856302 seconds.
>> max(abs(rbf2(:) - rbf3(:)))

ans =

     0

You want to subtract every row of X from every row of Z. This usually is straight forward when one of them is a vector and the other is a matrix. But if both of them are matrices, we can do this by making sure that each matrix in the volume contains just one vector. Here I chose X, but Z can be used interchangeably with X.

Pavan Yalamanchili
  • 12,021
  • 2
  • 35
  • 55
  • +1 nice vectorization of BSXFUN. I should mention that the way we compare the timings is a bit unfair. In the case of PDIST2 there is an overhead of calling a function, which doesn't exist for the other inline methods. So I wrapped each in a function and re-did the experiments using [TIMEIT](http://www.mathworks.com/matlabcentral/fileexchange/18798), turns out PDIST2 is the fastest: `1.4369 (PDIST2), 1.816 (BSXFUN+loop), 2.2238 (REPMAT+loop), 2.2306 (BXSFUN/RESHAPE)` – Amro Aug 09 '11 at 13:28