1

I have a function that performs the HodgesLehmann robust mean over a vector x[m,n]. n is the batch index of data, m is the number of samples.

function HLe = HodgesLehmann(x)

% Obtain dimensions
[m,n] = size(x);

% Create xi and xj values with the i <= j restriction enforced
q = logical(triu(ones(m,m),0));
i = uint32((1:m)'*ones(1,m));
xi = x(i(q),:);
j = uint32(ones(m,1)*(1:m));
xj = x(j(q),:);

% Calculate pairwise means (Walsh averages)
W = (xi+xj)./2;

% Calculate ordinary median of Walsh averages
HLe = median(W);

I am looking for a way to accelerate this function, it does not scale well for large dimensions of x. Any way of accelerating this is also welcome.

Many thanks.

user2324712
  • 435
  • 3
  • 13
  • Minor suggestion: instead of doing e.g. `uint32((1:m)'*ones(1,m))`, you can [create the array directly as uint32 type](http://se.mathworks.com/help/matlab/ref/ones.html#inputarg_classname) and skip the conversion from floating point, for example `ones(1,m,'uint32')`. – mikkola Nov 14 '15 at 07:13
  • I tried you suggestion but It is either not working or I don't completely understand your suggestion. The problem is that matlab is complaining that matrix multiplication (MTIMES) is not fully supported for integer classes. – user2324712 Nov 14 '15 at 08:32
  • Sorry, forgot about the MTIMES support for integer classes. You can disregard that suggestion. – mikkola Nov 14 '15 at 08:48

1 Answers1

1

Inspired by this solution, here's a possible (not tested for performance) improvement -

%// Calculate pairwise means (Walsh averages)
[I,J] = find(bsxfun(@le,[1:m]',[1:m])); %//'
W = (x(J,:) + x(I,:))./2;

%// Calculate ordinary median of Walsh averages
HLe = median(W);
Community
  • 1
  • 1
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • This improves performance, but very slightly. Is there any other way you would recommend? – user2324712 Nov 14 '15 at 07:08
  • @user2324712 Yeah, I just tested and doesn't seem to offer much of an improvement, sorta unexpected for me really. – Divakar Nov 14 '15 at 07:09
  • @user2324712 Let me ask you - How much of system RAM are you working with and what's the datasize of `x`? – Divakar Nov 14 '15 at 07:10
  • on my laptop, I currently have about 8 gigs. x has usual size of m ~ many thousands and n ~ 50,000. I cannot currently run this at full sizes and resort to batching this via a for loop. But ultimately this is the sizes I want to reach. I have access to better hardware, would you recommend a better cpu or GPU implementation? – user2324712 Nov 14 '15 at 08:19
  • @user2324712 I checked the profiler for @Divakar 's version and a variant using meshgrid: `[jj,ii] = meshgrid(uint32(1:m)); [I,J] = find(jj >= ii);`. Likely `bsxfun` is faster but at least in my case most of the time is actually spent in the `median` function which is a builtin MATLAB function. – mikkola Nov 14 '15 at 08:56
  • @user2324712 It seems and as also confirmed by mikkola that most of the time is spent with `median`. So, you can boost up the performance by a bit by doing the division by `2` after reduction by `median`. Thus, you could do `W = (xi+xj); HLe = median(W)./2;`. See if and how that helps? – Divakar Nov 14 '15 at 09:37
  • That shaved off about a second. I guess the only way now is using mex or GPU. thanks. – user2324712 Nov 14 '15 at 09:58
  • @user2324712 I am afraid so! – Divakar Nov 14 '15 at 10:00