9

Working in Matlab I have 2 vectors of x coordinate with different length. For example:

xm = [15 20 24 25 26 35 81 84 93];
xn = [14 22 26 51 55 59 70 75 89 96];

I need to map xm to xn, or in other words to find which coordinates in xn are closest to xm. So if I have values associated with those coordinates, I can use this map as index and correlate those values.

Both vectors are sorted and there are no duplicates in each vector.

I wrote a simple function with for-loop:

function xmap = vectors_map(xm,xn)
xmap = zeros(size(xm));
for k=1:numel(xm)
    [~, ind] = min(abs(xm(k)-xn));
    xmap(k) = ind(1);
end

For the above example is returns

xmap =
    1     2     2     3     3     3     8     9    10

It works ok, but takes a while with long vectors (over 100,000 points).

Any ideas how to vectorize this code?

yuk
  • 19,098
  • 13
  • 68
  • 99

6 Answers6

5

Oh! One other option: since you're looking for close correspondences between two sorted lists, you could go through them both simultaneously, using a merge-like algorithm. This should be O(max(length(xm), length(xn)))-ish.


match_for_xn = zeros(length(xn), 1);
last_M = 1;
for N = 1:length(xn)
  % search through M until we find a match.
  for M = last_M:length(xm)
    dist_to_curr = abs(xm(M) - xn(N));
    dist_to_next = abs(xm(M+1) - xn(N));

    if dist_to_next > dist_to_curr
      match_for_xn(N) = M;
      last_M = M;
      break
    else
      continue
    end

  end % M
end % N

EDIT: See @yuk's comment, the above code is not totally correct!

rescdsk
  • 8,739
  • 4
  • 36
  • 32
  • 2
    Great! This code gives me over 50 times speed improvement with 10,000-length vectors, and 1500(!) times with 100,000-lenth vectors. It can return error if several last elements of xn mapped to xm(end). I just changed the lines 6-7 to: if M – yuk Jan 28 '10 at 18:22
  • Cool! Yay! I'm glad it's working for you! Yeah, that's one of the fun things about computer science, when you suddenly make something a zillion times faster... – rescdsk Jan 28 '10 at 18:34
4

Consider this vectorized solution:

[~, xmap] = min( abs(bsxfun(@minus, xm, xn')) )
Amro
  • 123,847
  • 25
  • 243
  • 454
  • Nice vectorization. Thanks. However, it about twice slower then my function and also requires more memory, but better then previous code. – yuk Jan 28 '10 at 18:37
3

The fastest implementation I'm aware of that solves this problem is this one (C code that can be compiled as a .mex file; for me it's about 20 times faster than rescdsk's code in the accepted answer). It's surprising that such a common operation is not a MATLAB built-in function.

Matt Mizumi
  • 1,193
  • 1
  • 11
  • 27
1

It looks like your input vectors are sorted. Use a binary search to find the closest match. This will give you a O(n ln n) run time.

3Dave
  • 28,657
  • 18
  • 88
  • 151
0

Your xm and xn are sorted. If this is generally the case, then you can do much better than stepping over the entire array.

For each value in xn, there will be a range of values for which a value in xm will be closer to that number than any other. Compute these intervals beforehand and you can then step through both arrays sequentially.

John
  • 15,990
  • 10
  • 70
  • 110
0

Taking advantage of being sorted, as David says, will be faster since you have so many points, but for reference one way to vectorize this would be to use meshgrid:

[X Y] = meshgrid(xn, xm);
diffs = X - y;
mins = min(diffs, [], 2);

Note that this will create two 100,000 x 100,000 arrays in memory, so it's probably only feasible for smaller data sets.

rescdsk
  • 8,739
  • 4
  • 36
  • 32
  • Yeh, it takes a lot of memory and much slower then my function with small vectors. – yuk Jan 28 '10 at 18:35