10

My game needs to move objects by a certain angle. To do this I get the direction vector of the angle via sin and cos. Unfortunately sin and cos are my bottleneck. I'm sure I do not need this much precision. Is there an alternative to the C sin & cos and to a look-up table that is decently precise but very fast?

I had found this:

float Skeleton::fastSin( float x )
{
    const float B = 4.0f/pi;
    const float C = -4.0f/(pi*pi);

    float y = B * x + C * x * abs(x);

    const float P = 0.225f;

    return P * (y * abs(y) - y) + y; 
}

Unfortunately, this does not seem to work. I get significantly different behavior when I use this sin rather than C sin.

Thanks

CD Smith
jmasterx
  • Where does the deviation from the standard sine function start? – James McLeod May 23 '11 at 01:26
  • If the C sin and cos functions are genuinely too slow for your game, how often per frame do you need to call them? Are you sure you aren't calculating the sine and cosine of the same angle repeatedly? –  May 23 '11 at 04:15

8 Answers

8

A lookup table is the standard solution. You could also use two lookup tables, one for whole degrees and one for tenths of a degree, and combine them with sin(A + B) = sin(A)cos(B) + cos(A)sin(B).
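A sketch of that two-table idea (the struct name and table layout are my own illustrative choices, not from the answer):

```cpp
#include <cmath>

// Two-table sine: one table per whole degree, one per tenth of a degree,
// combined via the identity sin(A + B) = sin(A)cos(B) + cos(A)sin(B).
struct SinTables {
    float sinDeg[360], cosDeg[360];   // whole degrees 0..359
    float sinTenth[10], cosTenth[10]; // tenths of a degree 0.0..0.9

    SinTables() {
        const double d2r = 3.14159265358979323846 / 180.0;
        for (int i = 0; i < 360; ++i) {
            sinDeg[i] = (float)std::sin(i * d2r);
            cosDeg[i] = (float)std::cos(i * d2r);
        }
        for (int i = 0; i < 10; ++i) {
            sinTenth[i] = (float)std::sin(i * 0.1 * d2r);
            cosTenth[i] = (float)std::cos(i * 0.1 * d2r);
        }
    }

    // deg is assumed to be already wrapped into [0, 360)
    float sin(float deg) const {
        int a = (int)deg;                 // whole-degree part
        int b = (int)((deg - a) * 10.0f); // tenth-of-degree part
        return sinDeg[a] * cosTenth[b] + cosDeg[a] * sinTenth[b];
    }
};
```

Since the angle is truncated to the nearest tenth of a degree below it, the error is bounded by about 0.1° in radians, i.e. roughly 2e-3 at worst.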

rerun
  • How could I do this in an object oriented way? Would a singleton be a good idea here? – jmasterx May 23 '11 at 01:00
  • Almost never a good idea. Static methods on MathUtils would be fine. – duffymo May 23 '11 at 01:05
  • The idea of a Singleton being object oriented is a joke. More importantly, what on earth would the object be in this case? – Puppy May 23 '11 at 01:22
  • I also wonder how a global variable (Singleton) can be OO. Anyway, if something as simple as calling a sin/cos function is not fast enough, I wonder if you really want something OO... – devoured elysium May 23 '11 at 01:54
  • There is nothing oo about sin – rerun May 23 '11 at 02:05
  • As I was told recently, a lookup table can be slower than a simple calculation, depending how many layers of the memory hierarchy (cache etc) are involved. That fastsin function may well outpace grabbing a page in from RAM. Other than being integer, the table lookup code wouldn't be much less complex, so probably doesn't justify that memory access. –  May 23 '11 at 04:10
  • You never know without actual trying and profiling the code in question ;) – Kromster Apr 09 '12 at 10:17
7

For your fastSin(), you should check wherever you found it to see what input range it's valid on. The angle units your game uses could be too big or too small, and scaling them to fit within that function's expected range could make it work correctly.

EDIT:

Someone else mentioned getting it into the desired range by subtracting PI, but apparently there's a function called fmod for doing modulus division on floats/doubles, so this should do it:

#include <iostream>
#include <cmath>

float fastSin( float x ){
    // restrict x to [-M_PI, M_PI); note that fmod keeps the sign of its
    // first argument, so negative inputs need one extra period added back
    x = fmod(x + M_PI, M_PI * 2);
    if (x < 0)
        x += M_PI * 2;
    x -= M_PI;

    const float B = 4.0f/M_PI;
    const float C = -4.0f/(M_PI*M_PI);

    float y = B * x + C * x * std::abs(x);

    const float P = 0.225f;

    return P * (y * std::abs(y) - y) + y; 
}

int main() {
    std::cout << fastSin(100.0) << '\n' << std::sin(100.0) << std::endl;
}

I have no idea how expensive fmod is though, so I'm going to try a quick benchmark next.

Benchmark Results

I compiled this with -O2 and ran the result with the Unix time program:

#include <iostream>
#include <cmath>

const int REPETITIONS = 100000000; // any large number; adjust to taste

int main() {
    float a = 0;
    for(int i = 0; i < REPETITIONS; i++) {
        a += sin(i); // or fastSin(i);
    }
    std::cout << a << std::endl;
}

The result is that sin is about 1.8x slower (if fastSin takes 5 seconds, sin takes 9). The accuracy also seemed to be pretty good.

If you chose to go this route, make sure to compile with optimization on (-O2 in gcc).

Brendan Long
  • Indeed, running this little program on my calculator, I found it gives decent enough results between 0 and 3.2, but much beyond that it goes very wrong very fast. :) – sarnold May 23 '11 at 00:59
  • That fastSin function is only valid for +/-pi. I use a version of it in fixed point (16 bit integers) and get about 10 bits of precision over a full circle. It has perfect symmetry and hits 0 and 1 exactly, which is important to my app. – phkahler May 23 '11 at 14:25
  • The constant `0.225f` is not exact, so I don't think `fastSin` can hit anything exactly... – R.. GitHub STOP HELPING ICE May 24 '11 at 01:37
  • @R, plug it in and compile it yourself. phkahler's comment is correct. – Brendan Long May 24 '11 at 02:01
  • sinf() works around 1.6x faster than fastSin. Here is the code I used to benchmark: `float a = 0.0f; for(int j = 0; j < 10; j++) { DWORD start = timeGetTime(); for(int i = 0; i < 10000000; i++) { a = /* sinf(a); */ fastSin(a); a += 0.107f; } cout<` – hevi Oct 17 '12 at 14:16
  • You can also calculate the cosine of an angle with the sine of an angle using this rule: cos^2(x) + sin^2(x) = 1, so that cos(x) = sqrt(1-sin^2(x)) and sin(x) = sqrt(1-cos^2(x)) – user274595 May 06 '16 at 23:32
6

I know this is already an old topic, but for people who have the same question, here is a tip.

A lot of the time in 2D and 3D rotation, all vectors are rotated by the same fixed angle. Instead of calling cos() or sin() in every iteration of the loop, compute cos(angle) and sin(angle) once, before the loop, and store them in variables. You can then use those variables inside the loop, so each function only has to be called once.
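For example (a minimal sketch; `Vec2` and `rotateAll` are illustrative names, not from the answer):

```cpp
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Rotate every vector by the same fixed angle: sin/cos are computed once,
// before the loop, instead of once per element.
void rotateAll(std::vector<Vec2>& pts, float angle) {
    const float s = std::sin(angle);
    const float c = std::cos(angle);
    for (Vec2& p : pts) {
        float nx = p.x * c - p.y * s;  // standard 2D rotation
        p.y      = p.x * s + p.y * c;
        p.x      = nx;
    }
}
```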

JMRC
  • Not every compiler will optimize this. Some image manipulation functions I used took over ten seconds in debug mode and only a fraction of a second in release mode. It was very inconvenient when using that function a lot. Try not to rely too much on a compiler optimizing your code when you can easily do it yourself. – JMRC Oct 05 '15 at 15:56
  • Are you kidding? So you are complaining because a compiler does not make optimizations in debug mode? So what? Learn about strength reduction and common subexpressions... Seriously, you will not use a debug build in production, will you? – Dredok Oct 05 '15 at 21:51
  • No, but I'd rather spend one hour of debugging instead of 10, or 1 day of debugging instead of a week. Especially if rewriting the code takes less than a minute. There's nothing wrong with keeping your code clean. It's good practice. – JMRC Oct 06 '15 at 11:59
4

If you rephrase the return in fastSin as

return (1-P) * y + P * (y * abs(y))

And rewrite y as (for x>0 )

y = 4 * x * (pi-x) / (pi * pi)

you can see that y is a parabolic first-order approximation to sin(x) chosen so that it passes through (0,0), (pi/2,1) and (pi,0), and is symmetrical about x=pi/2. Thus we can only expect our function to be a good approximation from 0 to pi. If we want values outside that range we can use the 2-pi periodicity of sin(x) and that sin(x+pi) = -sin(x).

The y*abs(y) is a "correction term" which also passes through those three points. (I'm not sure why y*abs(y) is used rather than just y*y since y is positive in the 0-pi range).

This form of overall approximation function guarantees that a linear blend of the two functions y and y*y, (1-P)*y + P * y*y will also pass through (0,0), (pi/2,1) and (pi,0).

We might expect y to be a decent approximation to sin(x), but the hope is that by picking a good value for P we get a better approximation.

One question is "How was P chosen?". Personally, I'd choose the P that produces the least RMS error over the [0, pi/2] interval. (I'm not sure that's how this P was chosen, though.)

E(P) = ∫ over [0, π/2] of ( sin(x) - ((1-P) y + P y^2) )^2 dx

Minimizing this with respect to P gives

dE/dP = ∫ over [0, π/2] of 2 (y^2 - y) ( (1-P) y + P y^2 - sin(x) ) dx = 0

This can be rearranged and solved for P:

P = ∫ (y^2 - y)(sin(x) - y) dx / ∫ (y^2 - y)^2 dx    (both integrals over [0, π/2])

Wolfram alpha evaluates the initial integral to be the quadratic

E = (16 π^5 p^2 - (96 π^5 + 100800 π^2 - 967680)p + 651 π^5 - 20160 π^2)/(1260 π^4)

which has a minimum of

min(E) = -11612160/π^9 + 2419200/π^7 - 126000/π^5 - 2304/π^4 + 224/π^2 + (169 π)/420 
       ≈ 5.582129689596371e-07

at

p = 3 + 30240/π^5 - 3150/π^3 
  ≈ 0.2248391013559825

Which is pretty close to the specified P=0.225.
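As a quick numerical sanity check on that result (the brute-force grid search below is my own sketch, with arbitrary grid resolution and bounds, not part of the original derivation), minimizing the integrated squared error over [0, pi/2] lands at essentially the same P:

```cpp
#include <cmath>

const double PI = 3.14159265358979323846;

// Integrated squared error of (1-P)*y + P*y^2 against sin(x) over [0, pi/2],
// where y = 4*x*(pi - x)/pi^2 (midpoint rule, 10000 slices).
double errorFor(double P) {
    const int N = 10000;
    const double dx = (PI / 2.0) / N;
    double E = 0.0;
    for (int i = 0; i < N; ++i) {
        double x = (i + 0.5) * dx;
        double y = 4.0 * x * (PI - x) / (PI * PI);
        double f = (1.0 - P) * y + P * y * y;
        double e = std::sin(x) - f;
        E += e * e * dx;
    }
    return E;
}

// Brute-force scan for the minimizing P.
double findBestP() {
    double bestP = 0.0, bestE = 1e9;
    for (double P = 0.20; P <= 0.25; P += 1e-4) {
        double E = errorFor(P);
        if (E < bestE) { bestE = E; bestP = P; }
    }
    return bestP;
}
```

findBestP() should come out near 0.2248, matching the closed form above.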

You can raise the accuracy of the approximation by adding an additional correction term, giving a form something like return (1-a-b)*y + a*y*abs(y) + b*y*y*abs(y). I would find a and b in the same way as above, this time giving a system of two linear equations in a and b to solve, rather than a single equation in p. I'm not going to do the derivation as it is tedious and the conversion to latex images is painful... ;)

NOTE: When answering another question I thought of another valid choice for P. The problem is that using reflection to extend the curve into (-pi,0) leaves a kink in the curve at x=0. However, I suspect we can choose P such that the kink becomes smooth. To do this take the left and right derivatives at x=0 and ensure they are equal. This gives an equation for P.

Michael Anderson
  • Implemented this for the improved version over the 0-pi range, with comparison to the existing fast_sin approximation. code here: http://gist.github.com/988459 , live version here: http://ideone.com/oWdkW . Looks like it gets two more decimal places of accuracy at the cost of one additional mult and add. – Michael Anderson May 24 '11 at 10:09
3

You can precompute a table S of 256 values, from sin(0) to sin(2 * pi). Then, to evaluate sin(x), map x back into [0, 2 * pi), pick the two adjacent table entries S[a] and S[b] whose angles bracket x, and linearly interpolate between them. That should give a fair approximation.

  • memory-saving trick : you actually only need to store [0, pi / 2], and can use the symmetries of sin(x)
  • enhancement trick : linear interpolation can be a problem because of its non-smooth derivative; the human eye is good at spotting such glitches in animation and graphics. Use cubic interpolation instead.
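A sketch of the table-plus-linear-interpolation approach (the names and the wrapping code are my own choices; the memory-saving symmetry trick is omitted for clarity):

```cpp
#include <cmath>

const int TABLE_SIZE = 256;
const float TWO_PI = 6.283185307f;
float S[TABLE_SIZE];

// Fill S with sine values over one full period.
void initTable() {
    for (int i = 0; i < TABLE_SIZE; ++i)
        S[i] = std::sin(i * TWO_PI / TABLE_SIZE);
}

float lutSin(float x) {
    x = std::fmod(x, TWO_PI);
    if (x < 0) x += TWO_PI;               // map x into [0, 2*pi)
    float t = x * TABLE_SIZE / TWO_PI;    // position in table units
    int a = (int)t;
    int b = (a + 1) % TABLE_SIZE;         // neighbour, wrapping at the end
    float frac = t - a;
    return S[a] + (S[b] - S[a]) * frac;   // linear interpolation
}
```

With 256 entries the interpolation error stays well below 1e-3 across the whole period.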
Monkey
  • But the sine and cosine functions are infinitely differentiable, so there's no such thing as a non-smooth derivative with these functions. – duffymo May 23 '11 at 13:13
  • @duffymo - I'm guessing it means that the piecewise-linear-interpolated approximation of the sine has a non-smooth derivative, with sudden changes at the points where one line segment ends and the next starts. One way to get a smooth interpolation is to use a cubic rather than linear interpolation, derived so that the end-points have the correct derivatives. The principle is similar to smooth Bezier points in vector graphics, but the math is probably simpler (I seem to remember Bezier math being a pain). –  May 23 '11 at 18:21
2

What about

x*(0.0174532925199433 - 8.650935142277599e-7*x^2)

for degrees, and

x*(1 - 0.162716259904269*x^2)

for radians, on [-45, 45] and [-pi/4, pi/4] respectively?
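Wrapped up as functions (the coefficients come from the answer; the function names are my own, and on the stated ranges the error stays below roughly 1e-3):

```cpp
#include <cmath>

// Cubic sine approximations using the coefficients given above.
float sinDegApprox(float x) { // x in degrees, intended for [-45, 45]
    return x * (0.0174532925199433f - 8.650935142277599e-7f * x * x);
}

float sinRadApprox(float x) { // x in radians, intended for [-pi/4, pi/4]
    return x * (1.0f - 0.162716259904269f * x * x);
}
```

Outside those ranges the cubic diverges from the sine quickly, so wrap and reflect the angle first.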

Jay D
aaa
1

You can use this approximation.

This solution uses a quadratic curve:

http://www.starming.com/index.php?action=plugin&v=wave&ajax=iframe&iframe=fullviewonepost&mid=56&tid=4825

Jefferson Rondan
1

This (i.e. the fastSin function) approximates the sine function using a parabola. I suspect it's only good for values between -π and +π. Fortunately, you can keep adding or subtracting 2π until you get into this range. (Edited to specify what is approximating the sine function using a parabola.)

James McLeod
  • Utterly incorrect. The sine function is not a parabola, especially over the range you cite. Plot the sine and a parabola over that range and you'll see. – duffymo May 23 '11 at 01:14
  • Hah! I meant that the quoted function is a parabola. Two parabolas, actually. – James McLeod May 23 '11 at 01:16
  • Note that the function is not a parabola, it *uses* a parabola - which is why its range is limited to the first half-period in either direction from 0. – James McLeod May 23 '11 at 01:19
  • And whether it is a good idea or not depends on whether it provides sufficient accuracy and speed for the given application - it is difficult for anyone but the OP to determine this. – James McLeod May 23 '11 at 01:20
  • @James - but surely perfection is essential in all things. Floating point is just a story for terrifying young geeks - works best when combined with the "incompleteness theorem" and "quantum uncertainty" myths and a mild overdose of caffeine and sugar. For those few geeklets who still have dry underwear, finish your story with "chaos theory". –  May 23 '11 at 04:02
  • @Steve314 - if I thought perfection were essential in all things, I doubt I could be an effective software developer... – James McLeod May 23 '11 at 04:06
  • @James - and by implication, I'm an ineffective developer, I suppose. How dare you! May all your functions have no closed form! –  May 23 '11 at 04:24
  • Now that is just about the geekiest curse I have ever had hurled at me. I approve utterly. – James McLeod May 23 '11 at 12:21
  • If you look at the Taylor series expansion of the sine function, you'll see that it's odd, and only odd powers of x appear. So why would you recommend using a parabola to interpolate, which is a second-order function? – duffymo May 23 '11 at 13:15
  • I have made no recommendation, sir. I have pointed out that the suggested function (fastSin) uses a parabola, and hence it likely has only a limited range. – James McLeod May 23 '11 at 13:33
  • @duffymo Taylor series is not necessarily the best approximation over a range of values. For that you need something more sophisticated. And for `sin(x)` over the range 0-pi/2 the parabola `4*x*(pi-x)/(pi*pi)` does a very good job. – Michael Anderson May 24 '11 at 00:33
  • Actually the function specified seems to be a specially chosen quartic (y is quadratic, so y^2 is quartic). It's chosen to preserve certain properties of sin, in particular passing through (0,0), (pi/2,1), (pi,0), and to reduce approximation error at the same time. This means it should give good values over [0,pi]. The use of abs(x) and abs(y) extends the region of validity to [-pi,pi] using the symmetry of sin. I suspect though that differentiability of the curve may fail at x=0. (Unless P was specifically chosen to maintain that property, rather than reduce global error). – Michael Anderson May 24 '11 at 01:02
  • Unfortunately, `fastSin(x)` does not approximate `x` for small values of `x`... That's rather bad.. – R.. GitHub STOP HELPING ICE May 24 '11 at 01:36
  • R - It depends on what you're trying to do. The person asking the question obvious is more concerned with speed than accuracy. – Brendan Long May 26 '11 at 17:54