
I was wondering how to use __cos(x) (and, respectively, __sin(x)) in kernel code with CUDA. I read in the CUDA manual that such a device function exists; however, when I use it, the compiler says that I cannot call a host function from the device.

However, I found that there are two sister functions, cosf(x) and __cosf(x), the latter of which runs on the special function unit (SFU) and is much faster overall than the original cosf(x) function. The compiler does not complain about the __cosf(x) function.

Is there a library I'm missing? Am I mistaken about this trig function?

Bart
harmonickey

1 Answer


As the SFU only supports certain single-precision operations, there are no double-precision __cos() and __sin() device functions. There are single-precision __cosf() and __sinf() device functions, as well as other functions detailed in table C-4 of the CUDA 4.2 Programming Manual.
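To make the distinction concrete, here is a minimal sketch that evaluates cosine with the single-precision intrinsic __cosf(), the standard single-precision cosf(), and the standard double-precision cos() in one kernel. The kernel name, the host helper `max_fast_cos_error`, and the use of managed memory are assumptions made for illustration, not part of the original question.

```cuda
#include <cassert>
#include <cmath>
#include <cuda_runtime.h>

// Evaluates cosine three ways for each input element.
__global__ void cosine_variants(const float *x, float *fast, float *std_f,
                                double *std_d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        fast[i]  = __cosf(x[i]);        // single-precision intrinsic, reduced accuracy
        std_f[i] = cosf(x[i]);          // standard single-precision function
        std_d[i] = cos((double)x[i]);   // standard double precision; there is no __cos()
    }
}

// Hypothetical host helper: returns max |__cosf(x) - cos(x)| over n samples in [0, 1).
double max_fast_cos_error(int n)
{
    float *x, *fast, *std_f;
    double *std_d;
    cudaError_t err = cudaMallocManaged(&x, n * sizeof(float));
    assert(err == cudaSuccess);
    err = cudaMallocManaged(&fast, n * sizeof(float));
    assert(err == cudaSuccess);
    err = cudaMallocManaged(&std_f, n * sizeof(float));
    assert(err == cudaSuccess);
    err = cudaMallocManaged(&std_d, n * sizeof(double));
    assert(err == cudaSuccess);

    for (int i = 0; i < n; ++i) x[i] = (float)i / (float)n;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    cosine_variants<<<blocks, threads>>>(x, fast, std_f, std_d, n);
    err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);

    double max_err = 0.0;
    for (int i = 0; i < n; ++i) {
        double e = fabs((double)fast[i] - std_d[i]);
        if (e > max_err) max_err = e;
    }

    cudaFree(x); cudaFree(fast); cudaFree(std_f); cudaFree(std_d);
    return max_err;
}
```

Calling `max_fast_cos_error(1024)` from a host `main()` gives a rough idea of how far the intrinsic deviates from the double-precision result on a small argument range; the deviation grows for arguments of large magnitude.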

I assume you are looking for faster alternatives to the double-precision versions of the standard math functions sin() and cos()? If the sine and cosine of the same argument are both needed, sincos() should be used for a significant performance boost. If the argument of sine or cosine is multiplied by π, you would want to use sinpi(), cospi(), or sincospi() instead, for even more performance. For example, sincospi() is very useful when implementing the Box-Muller algorithm for generating normally distributed random numbers. Also, check out the CUDA 5.0 preview for the best possible performance (note that the preview is of alpha-release quality).
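As an illustration of the Box-Muller use case, the sketch below turns a pair of uniform variates into a pair of normal variates using sincospi(), which requires CUDA 5.0 or later. The names `box_muller`, `box_muller_kernel`, and `box_muller_host` are hypothetical, and the uniform inputs are assumed to come from some other generator, with u1 in (0, 1].

```cuda
#include <cassert>
#include <cmath>
#include <cuda_runtime.h>

// Box-Muller transform: maps uniforms u1 in (0, 1] and u2 in [0, 1)
// to two independent standard normal variates z0 and z1.
__device__ void box_muller(double u1, double u2, double *z0, double *z1)
{
    double r = sqrt(-2.0 * log(u1));
    double s, c;
    sincospi(2.0 * u2, &s, &c);   // s = sin(2*pi*u2), c = cos(2*pi*u2)
    *z0 = r * c;
    *z1 = r * s;
}

__global__ void box_muller_kernel(const double *u1, const double *u2,
                                  double *z0, double *z1, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) box_muller(u1[i], u2[i], &z0[i], &z1[i]);
}

// Hypothetical host helper: transforms a single pair on the device.
void box_muller_host(double u1, double u2, double *z0, double *z1)
{
    double *buf;   // layout: [u1, u2, z0, z1]
    cudaError_t err = cudaMallocManaged(&buf, 4 * sizeof(double));
    assert(err == cudaSuccess);
    buf[0] = u1;
    buf[1] = u2;
    box_muller_kernel<<<1, 1>>>(&buf[0], &buf[1], &buf[2], &buf[3], 1);
    err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);
    *z0 = buf[2];
    *z1 = buf[3];
    cudaFree(buf);
}
```

Passing 2*u2 to sincospi() avoids the explicit multiplication by π that a plain sincos() call would need.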

njuffa
  • Does this sincos() function compute the value by sin AND cos separately? Or does it do sin(cos(x)) or cos(sin(x))? Or something different? – harmonickey Jul 19 '12 at 20:51
  • sincos(x) returns both sin(x) and cos(x) simultaneously. The combined computation is significantly faster than computing sin() and cos() separately. Similarly, sincospi(x) [added in CUDA 5.0] computes sin(π*x) and cos(π*x) faster than separate calls to sinpi() and cospi(). It is also faster than manually computing the results via sincos(). The function signature is sincos(double arg, double *sine_of_arg, double *cos_of_arg). You can also do a `man sincos` on any Linux system. There are corresponding single-precision versions, sincosf() and sincospif(), of course. – njuffa Jul 19 '12 at 22:08
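
The signature given in the comment can be exercised with a short kernel like the one below. It is only a sketch: the kernel name and the host helper `sincos_identity_residual` are hypothetical, and the helper simply checks that the two results satisfy sin² + cos² ≈ 1.

```cuda
#include <cassert>
#include <cmath>
#include <cuda_runtime.h>

// One sincos() call writes both results through the output pointers.
__global__ void sincos_kernel(const double *x, double *s, double *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) sincos(x[i], &s[i], &c[i]);   // s[i] = sin(x[i]), c[i] = cos(x[i])
}

// Hypothetical host helper: returns |sin(x)^2 + cos(x)^2 - 1| computed on the device.
double sincos_identity_residual(double x)
{
    double *buf;   // layout: [x, sin(x), cos(x)]
    cudaError_t err = cudaMallocManaged(&buf, 3 * sizeof(double));
    assert(err == cudaSuccess);
    buf[0] = x;
    sincos_kernel<<<1, 1>>>(&buf[0], &buf[1], &buf[2], 1);
    err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);
    double residual = fabs(buf[1] * buf[1] + buf[2] * buf[2] - 1.0);
    cudaFree(buf);
    return residual;
}
```

The single-precision sincosf() takes a float argument and float pointers and is used in the same way.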