
I'm currently constructing a mathematical model in C++, which will be run/processed on a graphics card. It currently uses a lot of functions from math.h, such as sin, cos, sqrt and pow.

"Avoid using calls to external libraries such as math.h," said my manager, "because it's expensive on the graphics card. So you can't use the sin or cos functions. sqrt is probably fine but, if you can, use a polynomial function for the trigonometry."

I have a number of questions arising from this. Sure, you can use Taylor series to work out sin and cos, but...

  • Why is sqrt "probably fine" and sin/cos not? (And if sqrt is fine, is pow fine too?)
    • EDIT: I probably misunderstood my manager here. I took him to mean 'using sqrt from math.h is fine' when he more likely meant 'calculating a square root by some other method is fine'.
  • How costly is it to call a library function from code running on a graphics card, and why is it costly? (I'm assuming it's a run-time cost, not a compile-time cost.)
  • How would implementing my own sin/cos/sqrt/pow functions compare cost-wise to calling from the library, given they're likely to be far less optimised?
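
For reference, here is a minimal sketch of what "use a polynomial function for the trigonometry" could look like for sin in plain C++. The name `poly_sin`, the degree-7 truncation and the range-reduction scheme are assumptions chosen for illustration; none of them come from the manager or from any particular GPU toolchain. A truncated Taylor series is only accurate near zero, so the argument is first folded into [-pi/2, pi/2].

```cpp
#include <cmath>

// Hypothetical helper: approximates sin(x) with the degree-7 Taylor
// polynomial x - x^3/3! + x^5/5! - x^7/7!, after range reduction.
inline float poly_sin(float x) {
    const float pi = 3.14159265358979f;
    const float two_pi = 6.28318530717959f;

    // Range reduction, step 1: bring x into [-pi, pi) using periodicity.
    x = x - two_pi * std::floor((x + pi) / two_pi);

    // Step 2: fold into [-pi/2, pi/2] using sin(pi - x) = sin(x)
    // and sin(-pi - x) = sin(x).
    if (x > 0.5f * pi)  x = pi - x;
    if (x < -0.5f * pi) x = -pi - x;

    // Horner form keeps the evaluation to a few multiplies and adds.
    const float x2 = x * x;
    return x * (1.0f + x2 * (-1.0f / 6.0f
                     + x2 * (1.0f / 120.0f
                     + x2 * (-1.0f / 5040.0f))));
}
```

With this truncation the worst-case error, at the ends of the reduced range, is on the order of 1e-4. Whether that is acceptable depends on the model; a minimax polynomial of the same degree would give a smaller maximum error than the Taylor coefficients used here.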

Apologies if this question is somewhat vague; alas, both my manager and my team's senior dev are away, so I'm a little short on people to ask. If I've just misunderstood because none of what I thought my manager said makes sense, that's helpful in and of itself. Cheers!

ELRG
  • Your manager makes good points: sqrt can be thumped out in a handful of clock cycles using Newton-Raphson or similar. You can't apply the same to the trig functions. pow depends on whether or not the arguments are integral. – Bathsheba Mar 20 '17 at 14:55
  • @Bathsheba Ah, I took the meaning as 'using sqrt from math.h is ok' not 'finding a square root is OK because you can easily implement an efficient algorithm for it', which makes more sense - that's point 1 on the list answered, then. :) – ELRG Mar 20 '17 at 15:05
  • _"...run/processed on a graphics card"_ You need to investigate how the graphics card processes the functions you are interested in, and what effect calling them has on the graphics card's parallel instruction pipeline. – Richard Critten Mar 20 '17 at 15:07
  • Personally, though, I believe that "normal" multicore programming (which C++11 makes easier than older standards) will win over GPGPU in the long run. – Bathsheba Mar 20 '17 at 15:10
  • @Bathsheba That run may be *very* long, though. It's of course much closer in the top performance circles where money and power are of little concern, but I can't see several-thousand-core CPUs coming close to the cost and availability of several-thousand-core GPUs (currently pretty much mainstream) any time soon. – Angew is no longer proud of SO Mar 20 '17 at 15:15
  • I'm going to go out on a limb, and interpret your manager's comments as: "Don't use trigonometry, figure out a way of doing it differently" rather than "Our compiler vendor's implementation of math.h is shockingly bad". – Caleth Mar 20 '17 at 15:31
  • I do not know if it is still valid, but the main constraint could be the memory pipeline. You would need to get the argument from the graphics card memory to the main memory to the processor cache to a processor register and the result the reverse way. -- For the trig functions look-up tables with linear or quadratic interpolation may be sufficient and faster than polynomial approximations of the same precision. – Lutz Lehmann Mar 20 '17 at 18:01
  • @LutzL That could be it! I'd forgotten about look-up tables as well, I'll look into that. – ELRG Mar 21 '17 at 08:23
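
Two of the suggestions in the comments above, Newton-Raphson for the square root and a look-up table with linear interpolation for sin, can be sketched in plain C++ as follows. The names `newton_sqrt`, `SinTable` and `kTableSize`, the iteration count and the table size are all assumptions made for illustration. The table is filled with `std::sin` on the host; how it would be placed in graphics card memory depends on the GPU framework and is not shown.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper: Newton-Raphson iteration x <- (x + a/x) / 2.
// Assumption: non-positive input returns 0, and a is of moderate size
// (the crude initial guess needs more iterations for very large a).
inline float newton_sqrt(float a, int iterations = 8) {
    if (a <= 0.0f) return 0.0f;
    float x = a > 1.0f ? a : 1.0f;  // initial guess, always >= sqrt(a)
    for (int i = 0; i < iterations; ++i) {
        x = 0.5f * (x + a / x);
    }
    return x;
}

constexpr std::size_t kTableSize = 256;  // hypothetical resolution
constexpr float kTwoPi = 6.28318530717959f;

// Hypothetical look-up table: one period of sin sampled at
// kTableSize + 1 points, evaluated with linear interpolation.
struct SinTable {
    std::array<float, kTableSize + 1> values{};

    SinTable() {
        for (std::size_t i = 0; i <= kTableSize; ++i) {
            values[i] = std::sin(kTwoPi * static_cast<float>(i) / kTableSize);
        }
    }

    float operator()(float x) const {
        float t = x / kTwoPi;
        t -= std::floor(t);                      // fractional part of the period
        const float pos = t * kTableSize;
        std::size_t i = static_cast<std::size_t>(pos);
        if (i >= kTableSize) i = kTableSize - 1; // guard against rounding up to 1.0
        const float frac = pos - static_cast<float>(i);
        // Linear interpolation between neighbouring samples.
        return values[i] + frac * (values[i + 1] - values[i]);
    }
};
```

With 256 intervals the interpolation error stays below roughly 1e-4. Doubling the table size cuts that by about a factor of four, so the trade-off is memory traffic against accuracy, which is the concern raised in the comment about the memory pipeline.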
