Yet another fast trigonometry

Question

float sinx(float x)
{
    static const float a[] = {-.1666666664,.0083333315,-.0001984090,.0000027526,-.0000000239};
    float xsq = x*x;
    float temp = x*(1 + a[0]*xsq + a[1]*xsq*xsq + a[2]* xsq*xsq*xsq+a[3]*xsq*xsq*xsq*xsq+ a[4]*xsq*xsq*xsq*xsq*xsq);
    return temp;
}

How are those constants calculated? How to calculate cos and tan using this method? Can I extend this to get more precision? I guess I need to add more constants?

Plot of the error of the "fast" sine described above against a Taylor polynomial of equal degree.

Note that the taylor approximation actually gets worse for |x|>1 when increasing the degree (number of constants). This can be remedied by using that sin(x) has a period of 2π — margnus1, Dec 07 '12 at 22:03
OK then I am fine with those constants. I just plotted error graph and precision was just fine. Now I want to figure out if `tan` and `cos` can be calculated the same way. In the formula that I have to calculate on 8 bit slow MCU I have `tan` and `cos`. Division is very slow so prefer to avoid it. — Pablo, Dec 07 '12 at 22:06
@Pablo You don't need division, you can convert it to a multiplication by pre-computing the terms, like in the code above where a division by 6 is replaced by a multiplication by 0.16666... — Andrei, Dec 07 '12 at 22:10
It's true if you know the divisor. I was talking about `tan=sin/cos` where you don't know divisor. Unless I missed your point. — Pablo, Dec 07 '12 at 22:13
@Pablo The expansion for `tan` is x+x^3/3+(2 x^5)/15+..., you don't need to calculate it as `sin/cos`. — Andrei, Dec 07 '12 at 22:14
@Andrei agree, thanks to Wolfram Alpha providing terms and thanks to you. — Pablo, Dec 07 '12 at 22:18
See also [Fast inacccurate sin function without lookup](http://stackoverflow.com/questions/7163516/fast-inaccurate-sin-function-without-lookup/) — Jonathan Leffler, Dec 08 '12 at 00:14
[Numerical Recipes](http://nr.com) is a decent starting point to learn the theory behind these kind of algorithms; it has some decent info and some good references. You may want to take a look at the Cephes library, kept at the [Netlib archive](http://netlib.org): http://netlib.org/cephes/ The whole Netlib archive is an underused treasure trove of numerical goodness. — sfstewman, Dec 08 '12 at 01:24

Pascal Cuoq · Answer 1 · 2014-03-05T17:44:31.150

8

Nearly all answers at the time of this writing refer to the Taylor expansion of function sin, but if the author of the function were serious, he would not use Taylor coefficients. Taylor coefficients tend to produce a polynomial approximation that's better than necessary near zero and increasingly bad away from zero. The goal is usually to obtain an approximation that is uniformly good on a range such as -π/2…π/2. For a polynomial approximation, this can be obtained by applying the Remez algorithm. A down-to-earth explanation is this post.

The polynomial coefficients obtained by that method are close to the Taylor coefficients, since both polynomials are trying to approximate the same function, but the polynomial may be more precise for the same number of operations, or involve fewer operations for the same (uniform) quality of approximation.

I cannot tell just by looking at them if the coefficients in your question are exactly Taylor coefficients or the slightly different coefficients obtained by the Remez algorithm, but it is probably what should have been used even if it wasn't.

Lastly, whoever wrote (1 + a[0]*xsq + a[1]*xsq*xsq + a[2]* xsq*xsq*xsq+a[3]*xsq*xsq*xsq*xsq+ a[4]*xsq*xsq*xsq*xsq*xsq) needs to read up about better polynomial evaluation schemes, for instance Horner's:

1 + xsq*(a[0]+ xsq*(a[1] + xsq*(a[2] + xsq*(a[3] + xsq*a[4])))) uses N multiplications instead of N²/2.

edited Mar 05 '14 at 17:44

answered Dec 07 '12 at 22:34

Pascal Cuoq

79,187
7
161
281

It was implemented and improved at ENS Lyon (nudging the coefficients a little bit more so that floating-point errors tend to improve, rather than worsen, the result): http://perso.ens-lyon.fr/florent.de.dinechin/recherche/publis/2008-RNC.pdf They want to release it, but they are afraid it's too difficult to use. :) – Pascal Cuoq Dec 07 '12 at 22:43
Could you introduce some correction into original function with Horner's rule? – Pablo Dec 07 '12 at 23:13
@Pablo I have updated my question with the full Horner evaluation of a 4th-degree polynomial in `xsq`. This should compute almost the same as the code in your question, more efficiently. – Pascal Cuoq Dec 07 '12 at 23:20
@Pablo A good single-precision implementation for `sinf()` is here. Additional coefficients would not help because the precision is limited by the single-precision format of the result. https://github.com/JuliaLang/openlibm/blob/master/src/k_sinf.c – Pascal Cuoq Dec 07 '12 at 23:29
Looks very interesting and I am going to benchmark this on my MCU. Actually I need only tan and cos, because I'm using Rhumb formula for geo distance. I tried using LUT, then with suitable precision table was too big to fit into flash memory. Original sin takes too much cycles. It's been days I am looking for suitable solution. The worst thing is that my float is 4 bytes. I may need to convert whatever solution I find into fixed point. Total nightmare. – Pablo Dec 07 '12 at 23:36
I've checked the `k_sinf` and it's pretty close to what I want, need to try converting to fixed point. However, it has some error, which is on the edge of acceptance. Could you please let me know if it's possible to increase accuracy by just digit or two? Thanks for your time. – Pablo Dec 08 '12 at 00:37
@technosaurus I do not see what that library has to do with the question or the answer. – Pascal Cuoq Dec 14 '18 at 09:12
If you search the code for sin or cos it uses this method and describes it in the comments along with links to a page describing remez and sources for a utility to generate the constants. – technosaurus Dec 14 '18 at 14:39
Ah yes, you mean the implementation at https://github.com/vurtun/nuklear/blob/cd1624e3aa33a771c45afa182934ef2ef79cdec5/src/nuklear_math.c#L53 . The linked implementation of the Remez algorithm, lolremez, is what I use too. Although if the author of the line https://github.com/vurtun/nuklear/blob/cd1624e3aa33a771c45afa182934ef2ef79cdec5/src/nuklear_math.c#L56 had read the lolremez tutorial up to section 4, they might have considered pinning that coefficient to 0 (for instance): https://github.com/samhocevar/lolremez/wiki/Tutorial-4-of-5%3A-fixing-lower-order-parameters – Pascal Cuoq Dec 14 '18 at 15:15

Esailija · Accepted Answer · 2012-12-07T22:25:56.087

3

They are -1/6, 1/120, -1/5040 .. and so on.

Or rather: -1/3!, 1/5!, -1/7!, 1/9!... etc

Look at the taylor series for sin x in here:

enter image description here

It has cos x right below it:

enter image description here

For cos x, as seen from the picture above, the constants are -1/2!, 1/4!, -1/6!, 1/8!...

tan x is slightly different:

enter image description here

So to adjust this for cosx:

float cosx(float x)
{
    static const float a[] = {-.5, .0416666667,-.0013888889,.0000248016,-.0000002756};
    float xsq = x*x;
    float temp = (1 + a[0]*xsq + a[1]*xsq*xsq + a[2]* xsq*xsq*xsq+a[3]*xsq*xsq*xsq*xsq+ a[4]*xsq*xsq*xsq*xsq*xsq);
    return temp;
}

edited Dec 07 '12 at 22:25

answered Dec 07 '12 at 22:00

Esailija

138,174
23
272
326

is it `a[0]*xsq` or `a[0]*x*x*x`, etc. for `cos`? – Pablo Dec 07 '12 at 23:09
@Pablo it is for cos exactly as given, removing x* from the sin version on that part is enough. The sin is ahead by 1 power so that's why it's multiplied.. x(x² + x⁴ ..) is same as x³ + x⁵ ... – Esailija Dec 08 '12 at 00:27

Joseph Quinsey · Answer 3 · 2012-12-08T19:47:42.650

The coefficients are identical to those given in the Handbook of Mathematical Functions, ed. Abramowitz and Stegan (1964), page 76, and are attributed to Carlson and Goldstein, Rational approximations of functions, Los Alamos Scientific Laboratory (1955).

The first can be found in http://www.jonsson.eu/resources/hmf/pdfwrite_600dpi/hmf_600dpi_page_76.pdf.

And the second at http://www.osti.gov/bridge/servlets/purl/4374577-0deJO9/4374577.pdf. Page 37 gives:

enter image description here

Regarding your third question, "Can I extend this to get more precision?", http://lol.zoy.org/wiki/doc/maths/remez has a downloadable C++ implementation of the Remez algorithm; it provides (unchecked by me) the coefficients for the 6th-order polynomial for sin:

 error: 3.9e-14
 9.99999999999624e-1
-1.66666666660981e-1
 8.33333330841468e-3
-1.98412650240363e-4
 2.75568408741356e-6
-2.50266363478673e-8
 1.53659375573646e-10

Or course, you would need to change from float to double to realize any improvement. And this may also answer your second question, regarding cos and tan.

Also, I see in the comments that a fixed-point answer is required in the end. I implemented a 32-bit fixed-point version in 8031-assembler about 26 years ago; I'll try digging it up to see whether it has anything useful in it.

Update: If you are stuck with 32-bit doubles, then the only way I can see for you to increase the accuracy by a "digit or two" is to forget floating-point and use fixed-point. Surprisingly, google doesn't seem to turn up anything. The following code provides proof-of-concept, run on a standard Linux machine:

#include <stdio.h>
#include <math.h>
#include <stdint.h>

// multiply two 32-bit fixed-point fractions (no rounding)
#define MUL32(a, b) ((uint64_t)(a) * (b) >> 32)

// sin32:  Fixed-point sin calculation for first octant, coefficients from
//         Handbook for Computing Elementary Functions, by Lyusternik et al, p. 89.
// input:  0 to 0xFFFFFFFF, giving fraction of octant 0 to PI/8, relative to 2**32
// output: 0 to 0.7071, relative to 2**32

static uint32_t sin32(uint32_t x) { // x in 1st octant, = radians/PI*8*2**32
   uint32_t y, x2 = MUL32(x, x);    // x2 = x * x
   y = 0x000259EB;                  // a7 = 0.000 035 877 1
   y = 0x00A32D1E  - MUL32(x2, y);  // a5 = 0.002 489 871 8
   y = 0x14ABBA77  - MUL32(x2, y);  // a3 = 0.080 745 367 2
   y = 0xC90FDA73u - MUL32(x2, y);  // a1 = 0.785 398 152 4
   return MUL32(x, y);
}

int main(void) {
   int i;
   for (i = 0; i < 45; i += 2) { // 0 to 44 degrees
      const double two32 = 1LL << 32;
      const double radians = i * M_PI / 180;
      const uint32_t octant = i / 45. * two32; // fraction of 1st octant
      printf("%2d  %+.10f  %+.10f  %+.10f    %+.0f\n", i,  
         sin(radians) - sin32(octant) / two32,
         sin(radians) - sinf(radians),
         sin(radians) - (float)sin(radians),
         sin(radians) * two32 - sin32(octant));
   }
   return 0;
}

The coefficients are from the Handbook for Computing Elementary Functions, by Lyusternik et al, p. 89, here:

enter image description here

The only reason I choose this particular function is that it has one less term than your original series.

The results are:

 0  +0.0000000000  +0.0000000000  +0.0000000000    +0
 2  +0.0000000007  +0.0000000003  +0.0000000012    +3
 4  +0.0000000010  +0.0000000005  +0.0000000031    +4
 6  +0.0000000012  -0.0000000029  -0.0000000011    +5
 8  +0.0000000014  +0.0000000011  -0.0000000044    +6
10  +0.0000000014  +0.0000000050  -0.0000000009    +6
12  +0.0000000011  -0.0000000057  +0.0000000057    +5
14  +0.0000000006  -0.0000000018  -0.0000000061    +3
16  -0.0000000000  +0.0000000021  -0.0000000026    -0
18  -0.0000000005  -0.0000000083  -0.0000000082    -2
20  -0.0000000009  +0.0000000095  -0.0000000107    -4
22  -0.0000000010  -0.0000000007  +0.0000000139    -4
24  -0.0000000009  -0.0000000106  +0.0000000010    -4
26  -0.0000000005  +0.0000000065  -0.0000000049    -2
28  -0.0000000001  -0.0000000032  -0.0000000110    -0
30  +0.0000000005  -0.0000000126  -0.0000000000    +2
32  +0.0000000010  +0.0000000037  -0.0000000025    +4
34  +0.0000000015  +0.0000000193  +0.0000000076    +7
36  +0.0000000013  -0.0000000141  +0.0000000083    +6
38  +0.0000000007  +0.0000000011  -0.0000000266    +3
40  -0.0000000005  +0.0000000156  -0.0000000256    -2
42  -0.0000000009  -0.0000000152  -0.0000000170    -4
44  -0.0000000005  -0.0000000011  -0.0000000282    -2

Thus we see that this fixed-point calculation is about ten times more accurate than sinf() or (float)sin(), and is correct to 29 bits. Using rounding rather than truncation in MUL32() made only a marginal improvement.

Since you appear to have the book, could you say what properties are attached to this choice of coefficients by the authors? I thought the -.1666666664 (instead of -.1666666666 or -.1666666667) might be a reflection of the IEEE 754 floating-point nature of the coefficients, but if they were picked in 1955, surely not, so this may be a uniform polynomial approximation after all. — Pascal Cuoq, Dec 07 '12 at 23:16
@PascalCuoq: Looking at the paper, it does look like a uniform polynomial approximation, but Carlson and Goldstein describe their method as "essentially an empirical one" (page 12). — Joseph Quinsey, Dec 07 '12 at 23:55
Yes, the “it is […] hoped that the method may eventually be put on a firm mathematical basis” got a smile out of me. Wikipedia dates Remez algorithm as 1934. There are similarities but it isn't clear whether Carlson and Goldstein were aware of it. — Pascal Cuoq, Dec 07 '12 at 23:59
@JosephQuinsey Unfortunately double and float has the same size - 4 byte — Pablo, Dec 08 '12 at 08:46

Matteo Italia · Answer 4 · 2012-12-07T22:11:15.490

1

That function is calculating the value of sin using its Taylor expansion:

sin(x) Taylor expansion

and those constants are the various -1/3!, 1/5! and so on (see e.g. here for Taylor series of other functions).

Now, the Taylor expansion for sin(x) is exact for every x if you specified every term of the series, but AFAIK there are faster and more precise methods to determine the trigonometric functions in software.

Also, many processors provide such functions implemented directly in the processor (e.g. on x86 there are ready-made opcodes for them), so often there's no need to bother with this stuff.

edited Dec 07 '12 at 22:11

answered Dec 07 '12 at 22:04

Matteo Italia

123,740
17
206
299

I've tried CORDIC and get worse precision results. Using 8 bit AVR without trigonometric functions and slow division. – Pablo Dec 07 '12 at 22:07
From what I read, often CORDIC is used as a first step, and then a short Taylor series is used to polish the results. See e.g. [here](http://stackoverflow.com/questions/345085/how-do-trigonometric-functions-work/345117#345117). – Matteo Italia Dec 07 '12 at 22:10
Yea, I also read that. This is something to try... might be too much cycles for CORDIC+Taylor. – Pablo Dec 07 '12 at 22:15

Hamlet Hakobyan · Answer 5 · 2012-12-08T06:23:58.213

0

cos(x) = sqrt(1 - sin^2(x))
tan(x) = sin(x)/cos(x)

Sin(x) = x -x^3/3! + x^5/5! + (-1)^k*x^(2k+1)/(2k+1)! , k = 1, 2, ...

this is infinite function

edited Dec 08 '12 at 06:23

answered Dec 07 '12 at 21:55

Hamlet Hakobyan

32,965
6
52
68

I was more interested in constants. – Pablo Dec 07 '12 at 21:57

Yet another fast trigonometry

5 Answers5