I am reimplementing a Matlab function in C for performance reasons. I am looking for the most efficient way to compute the projection of a vector onto the unit box, i.e. to clamp each component to the interval [-1, 1].

In C terms, I want to compute

double i = somevalue;
i = (i >  1.) ?  1. : i;
i = (i < -1.) ? -1. : i;

Since I have to do this operation several million times, I wonder what the most efficient way to achieve this is.

Carsten
  • As far as I know, some CPU architectures have special instructions for limiting a value like that; a decent C compiler should detect that it can use such an instruction and optimize it. If you're trying to solve the problem for higher-dimensional vector spaces, of course the details depend on your vector norm. – Jan Krüger Jul 05 '11 at 14:28
  • Possible duplicate: http://stackoverflow.com/questions/427477/fastest-way-to-clamp-a-real-fixed-floating-point-value – Alexandre C. Jul 05 '11 at 14:37
  • @alexandre-c Yes, I've asked the same question so this is a duplicate. Unfortunately, I couldn't find the original question. – Carsten Jul 05 '11 at 14:42

3 Answers

If you're on 686, your compiler will likely transform the conditional into a CMOV instruction, which is probably fast enough.

See the question "Fastest way to clamp a real (fixed/floating point) value?" for experiments. @Spat also suggests the MINSS/MINSD and MAXSS/MAXSD instructions, which may be available as intrinsics for your compiler. They are SSE instructions and may be your best choice, again provided you're on 686.
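As a rough sketch of the intrinsics route: the version below uses the packed SSE2 counterparts of MINSD/MAXSD (`_mm_min_pd` / `_mm_max_pd`) to clamp two doubles per iteration, and falls back to the plain conditional when SSE2 is not available. The function name `clamp_unit_box` and the in-place array interface are my own assumptions, not something from the question.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper: clamp every element of x[0..n-1] to [-1, 1] in place. */
void clamp_unit_box(double *x, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128d lo = _mm_set1_pd(-1.0);
    const __m128d hi = _mm_set1_pd(1.0);
    /* Two doubles per iteration; unaligned loads so x needs no special alignment. */
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(x + i);
        v = _mm_max_pd(_mm_min_pd(v, hi), lo);
        _mm_storeu_pd(x + i, v);
    }
#endif
    /* Scalar tail (or the whole array when SSE2 is unavailable). */
    for (; i < n; ++i) {
        double v = x[i];
        v = (v > 1.0) ? 1.0 : v;
        v = (v < -1.0) ? -1.0 : v;
        x[i] = v;
    }
}
```

Note that the two paths treat NaN differently: the SSE min/max instructions return their second operand when an input is NaN, while the conditional leaves NaN untouched. Whether this beats what the compiler already generates from the plain loop is something you would have to measure.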

Alexandre C.
  • +1, re CMOV: That would depend on processor architecture and CPU scheduling (http://ondioline.org/mail/cmov-a-bad-idea-on-out-of-order-cpus) – sehe Jul 05 '11 at 15:15

If your compiler uses the IEEE 754 double format, I'd think reading the first bit (the sign bit) of the double's memory is probably the most direct way. Then you'd need no additional rounding or division operations.
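One possible reading of this idea, as a sketch only: strip the sign bit to get the magnitude, and if the magnitude exceeds 1, return 1.0 with the original sign bit put back. This assumes IEEE 754 binary64 doubles that share byte order with 64-bit integers; the name `clamp_unit_signbit` is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper; assumes double is IEEE 754 binary64. */
double clamp_unit_signbit(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);               /* inspect the raw bit pattern */

    uint64_t sign     = bits & 0x8000000000000000ULL;  /* top bit: the sign */
    uint64_t mag_bits = bits & 0x7FFFFFFFFFFFFFFFULL;  /* everything else: |x| */

    double mag;
    memcpy(&mag, &mag_bits, sizeof mag);

    if (mag > 1.0) {
        /* 0x3FF0000000000000 is the bit pattern of 1.0; OR in the sign of x. */
        uint64_t one_bits = 0x3FF0000000000000ULL | sign;
        memcpy(&x, &one_bits, sizeof x);
    }
    return x;
}
```

This still needs one comparison, so it is not obviously cheaper than the two conditionals in the question; a benchmark would have to settle that.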

rubenvb
  • I'd be curious to see a performance comparison with and without this "optimization"! – Kerrek SB Jul 05 '11 at 14:32
  • @Kerrek SB: yeah, me too, but it seems like the only one that could possibly beat the compiler's tricks to do the same. I wrote this answer on the assumption that the OP's code was slow (i.e. not the fastest possible). – rubenvb Jul 05 '11 at 14:36
  • @Carsten: then let the compiler do its job. It knows a lot more than you do. – rubenvb Jul 06 '11 at 10:53

Did you consider using SSE instructions to speed up your code?

You could also use OpenMP to parallelize your code and thus make it faster.
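A minimal sketch of the OpenMP variant, assuming the values live in a contiguous array (the function name `clamp_unit_box_omp` is hypothetical). Each element is independent, so the loop can be split across threads with a single pragma. Compile with your compiler's OpenMP flag (e.g. `-fopenmp` for GCC); without it the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Hypothetical helper: clamp x[0..n-1] to [-1, 1] in place, in parallel. */
void clamp_unit_box_omp(double *x, size_t n)
{
    /* Signed loop index, for compatibility with older OpenMP implementations. */
    ptrdiff_t i;
    const ptrdiff_t count = (ptrdiff_t)n;

    #pragma omp parallel for
    for (i = 0; i < count; ++i) {
        double v = x[i];
        v = (v > 1.0) ? 1.0 : v;
        v = (v < -1.0) ? -1.0 : v;
        x[i] = v;
    }
}
```

For a loop this cheap the work is likely memory-bound, so threads only pay off for large arrays; for small ones the thread start-up cost can easily dominate.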

Constantinius