-3

I have n points in a 2D plane. I want to calculate the distance between each two points in c++. Position of m'th point in the plan is (x(m),y(m)). This points changes during passing time. The number of time steps is equal to 10^5. I wrote below code, but as n is a big number(5000) and I want to find the distance between points 10^5 times, I'm searching for the most optimized way to do that. Could anyone tell me what is the least time-consuming way to do that?

for(i=1;n;++)
   for(j=1;n;++)
      if (i>j)
         r(i,j)= r(j,i);
      else
         r(i,j)=sqrt((x(i)-x(j))^2+(y(i)-y(j))^2);
      end
   end
end

I know that, in Matlab, I can find this by using bsxfun function. I want also to know which one could calculate distances faster? Matlab or c++?

EBH
  • 10,350
  • 3
  • 34
  • 59
Oliver Range
  • 365
  • 3
  • 10
  • Even if you can avoid the `if` complexity will still be `O(n²)`. – Jarod42 Jun 05 '17 at 08:23
  • You want to compute distances between all possible pairs of points. It'll be O(N^2) anyway. You can try to parallelize it to make things faster. – CaptainTrunky Jun 05 '17 at 08:24
  • Yes, it is not a good way and I'm searching for the best way. Could you help me please? @Jarod42 – Oliver Range Jun 05 '17 at 08:25
  • Do you really need all the full distances? You can usually make do with just the square of the distance for most comparison purposes, saving you calling the expensive sqrt – UKMonkey Jun 05 '17 at 08:28
  • Good suggestion. Could you please see the edit to my question? @UKMonkey – Oliver Range Jun 05 '17 at 08:31
  • 3
    The question regarding which is faster - matlab or c++? Depends on your algorithm really. You're asking the wrong questions I fear. Perhaps you should rewind and try explaining why you need all the 10^5 distances in the first place? – UKMonkey Jun 05 '17 at 08:35
  • Done! is it clear now? @UKMonkey – Oliver Range Jun 05 '17 at 08:43
  • If `n` is large enough, it might be faster get rid of your `if (i > j)` check and always compute the square root, because of cache misses. Better would be to compute the values on the upper triangle and use a block matrix algorithm to copy values to the other half. –  Jun 05 '17 at 11:55
  • ... I say this to demonstrate that writing high performance code is *very hard*, at least in part because requires that it requires very unfamiliar concerns. If you have a good library, you generally shouldn't try to outperform it unless you really have a good reason and know what you're doing. If you don't have that, better to use the library -- or learn how to use the library better. –  Jun 05 '17 at 11:57
  • Do you know a webpage that I can download efficient libraries of c++? @Hurkyl – Oliver Range Jun 05 '17 at 13:27

1 Answers1

0

Regarding Matlab, you have also pdist which does exactly that (but is not so fast), and you should also read this.


About comparing Matlab and C first read this and this. Also keep in mind that Matlab, as a desktop program, requires knowing not only a general efficient way to implement your code but also the right way to do this in Matlab. One example is the difference between functions. Built-in functions are written in FORTRAN or C and run much faster then non-built-in functions. To know if a function is built-in, type:

which function_name

and check if you see "built-in" at the start of the output:

built-in (C:\Program Files\MATLAB\...)
EBH
  • 10,350
  • 3
  • 34
  • 59