19

Many implementations of the library go all the way down to the FPATAN instruction for all of the arc functions. How is FPATAN implemented? Assuming that we have 1 sign bit, M mantissa bits, and N exponent bits, what is the algorithm to get the arctangent of such a number? There should be such an algorithm, since the FPU does it.

hippietrail
Plamen Dragiyski

3 Answers

34

Implementations of the FPATAN instruction in x86 processors are usually proprietary. To compute arctan, or other (inverse) trigonometric functions, common algorithms follow a three-step process:

  1. argument reduction for mapping the full input domain to a narrow interval
  2. computation of the core approximation on the narrow interval (primary approximation interval)
  3. expansion of the intermediate result based on the argument reduction to produce the final result

The argument reduction is usually based on well-known trigonometric identities that can be looked up in various standard references such as MathWorld. For the computation of arctan, commonly used identities are

  • arctan(-x) = -arctan(x)
  • arctan(1/x) = 0.5 * pi - arctan(x)  [x > 0]
  • arctan(x) = arctan(c) + arctan((x - c) / (1 + x*c))

Note that the last identity lends itself to the construction of a table of values arctan(i/2^n), i = 1...2^n, which allows the use of an arbitrarily narrow primary approximation interval at the expense of additional table storage. This is a classical programming trade-off between space and time.
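
To make this concrete, here is a minimal sketch (toy code of my own, not a production routine) of the table-based reduction with n = 3: with c the nearest table node i/8, the reduced argument t = (x - c) / (1 + x*c) satisfies |t| <= 1/16, so a very short polynomial suffices. For brevity the table is filled with calls to the libm atan(); a real implementation would embed precomputed constants instead.

#include <math.h>

double atan_by_table (double x)   /* assumes 0 <= x <= 1 */
{
    /* arctan(i/8), i = 0..8; filled lazily here for brevity,
       precomputed offline in a real implementation */
    static double tbl[9];
    static int filled = 0;
    if (!filled) {
        for (int i = 0; i <= 8; i++) tbl[i] = atan (i / 8.0);
        filled = 1;
    }
    int i = (int)(8.0 * x + 0.5);         /* nearest table node c = i/8 */
    double c = i / 8.0;
    double t = (x - c) / (1.0 + x * c);   /* reduced argument, |t| <= 1/16 */
    double t2 = t * t;
    /* degree-5 Taylor polynomial: ample on [-1/16, 1/16] for a sketch */
    return tbl[i] + t * (1.0 - t2 * (1.0 / 3.0 - t2 * (1.0 / 5.0)));
}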

The approximation on the core interval is typically a minimax polynomial approximation of sufficient degree. Rational approximations are usually not competitive on modern hardware due to the high cost of floating-point division; they also suffer from additional numerical error, caused by the computation of two polynomials plus the error contributed by the division.

The coefficients for minimax polynomial approximations are usually computed using the Remez algorithm. Tools like Maple and Mathematica have built-in facilities to compute such approximations. The accuracy of polynomial approximations can be improved by making sure all coefficients are exactly representable machine numbers. The only tool I am aware of that has a built-in facility for this is Sollya, which offers an fpminimax() function.

The evaluation of polynomials usually utilizes Horner's scheme, which is efficient and accurate, or a mixture of Estrin's scheme and Horner's. Estrin's scheme allows one to make excellent use of the instruction-level parallelism provided by superscalar processors, with a minor impact on overall instruction count and an often (but not always) benign impact on accuracy.

The use of FMA (fused multiply-add) enhances the accuracy and performance of either evaluation scheme, due to the reduced number of rounding steps and the protection it offers against subtractive cancellation. FMA is found on many processors, including GPUs and recent ARM, x86, and Power CPUs. In standard C and standard C++, the FMA operation is exposed as the fma() standard library function; however, it needs to be emulated on platforms that do not offer hardware support, which makes it slow on those platforms.
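
To illustrate the difference between the two evaluation schemes (toy code of my own; the coefficient array c[] is a hypothetical input), here is a degree-7 polynomial evaluated with Horner's scheme and with a simple Estrin variant, both built on fma():

#include <math.h>

/* Horner: minimal operation count, one fma per coefficient, but serial */
double horner7 (const double c[8], double x)
{
    double p = c[7];
    for (int i = 6; i >= 0; i--) p = fma (p, x, c[i]);
    return p;
}

/* Estrin: the four pairwise terms are independent, so a superscalar
   processor can compute them in parallel */
double estrin7 (const double c[8], double x)
{
    double x2 = x * x, x4 = x2 * x2;
    double p01 = fma (c[1], x, c[0]);
    double p23 = fma (c[3], x, c[2]);
    double p45 = fma (c[5], x, c[4]);
    double p67 = fma (c[7], x, c[6]);
    double p03 = fma (p23, x2, p01);   /* c0 + c1*x + c2*x^2 + c3*x^3 */
    double p47 = fma (p67, x2, p45);   /* c4 + c5*x + c6*x^2 + c7*x^3 */
    return fma (p47, x4, p03);
}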

From a programming viewpoint, one would like to avoid the risk of conversion errors when translating the floating-point constants needed for the approximation and argument reduction from textual to machine representation. ASCII-to-floating-point conversion routines are notorious for containing tricky bugs. One mechanism offered by standard C (not standard C++, as best I know, where it is available only as a proprietary extension) is to specify floating-point constants as hexadecimal literals that directly express the underlying bit pattern, avoiding complicated conversions altogether.
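
For example (a snippet of my own), the double closest to pi/2 can be written as a C99 hexadecimal literal, fixing the bit pattern at compile time with no decimal conversion involved:

#include <stdio.h>

int main (void)
{
    /* the significand bits are specified directly in the literal, so no
       decimal-to-binary conversion step can introduce an error */
    double pio2 = 0x1.921fb54442d18p+0;   /* nearest double to pi/2 */
    printf ("%.17g = %a\n", pio2, pio2);  /* %a prints the hex form back */
    return 0;
}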

Below is C code to compute a double-precision arctan() that demonstrates many of the design principles and techniques mentioned above. This quickly constructed code lacks the sophistication of the implementations pointed to in other answers, but it provides a maximum observed error of 1.65 ulps, which may be sufficient in various contexts. I created a custom minimax approximation with a simple implementation of the Remez algorithm that used 1024-bit floating-point arithmetic for all intermediate steps. I would expect the use of Sollya or similar tools to result in numerically superior approximations. To balance performance and accuracy, the evaluation of the core approximation combines both second-order and ordinary Horner schemes.

#include <math.h>   /* fabs, fma, copysign */

/* Compute arctangent. Maximum observed error: 1.64991 ulps */
double my_atan (double x)
{
    double a, z, p, r, q, s, t;
    /* argument reduction: 
       arctan (-x) = -arctan(x); 
       arctan (1/x) = 1/2 * pi - arctan (x), when x > 0
    */
    z = fabs (x);
    a = (z > 1.0) ? (1.0 / z) : z;
    s = a * a;
    q = s * s;
    /* core approximation: approximate atan(x) on [0,1] */
    p =            -2.0258553044340116e-5;  // -0x1.53e1d2a258e3ap-16
    t =             2.2302240345710764e-4;  //  0x1.d3b63dbb6167ap-13
    p = fma (p, q, -1.1640717779912220e-3); // -0x1.312788ddde71dp-10
    t = fma (t, q,  3.8559749383656407e-3); //  0x1.f9690c824aaf1p-9
    p = fma (p, q, -9.1845592187222193e-3); // -0x1.2cf5aabc7dbd2p-7
    t = fma (t, q,  1.6978035834594660e-2); //  0x1.162b0b2a3bcdcp-6
    p = fma (p, q, -2.5826796814492296e-2); // -0x1.a7256feb6f841p-6
    t = fma (t, q,  3.4067811082715810e-2); //  0x1.171560ce4a4ecp-5
    p = fma (p, q, -4.0926382420509999e-2); // -0x1.4f44d841450e8p-5
    t = fma (t, q,  4.6739496199158334e-2); //  0x1.7ee3d3f36bbc6p-5
    p = fma (p, q, -5.2392330054601366e-2); // -0x1.ad32ae04a9fd8p-5
    t = fma (t, q,  5.8773077721790683e-2); //  0x1.e17813d669537p-5
    p = fma (p, q, -6.6658603633512892e-2); // -0x1.11089ca9a5be4p-4
    t = fma (t, q,  7.6922129305867892e-2); //  0x1.3b12b2db5173cp-4
    p = fma (p, s, t);
    p = fma (p, s, -9.0909012354005267e-2); // -0x1.745d022f8dc5fp-4
    p = fma (p, s,  1.1111110678749421e-1); //  0x1.c71c709dfe925p-4
    p = fma (p, s, -1.4285714271334810e-1); // -0x1.2492491fa1742p-3
    p = fma (p, s,  1.9999999999755005e-1); //  0x1.99999999840cdp-3
    p = fma (p, s, -3.3333333333331838e-1); // -0x1.5555555555448p-2
    p = fma (p * s, a, a);
    /* back substitution in accordance with argument reduction */
    /* double-precision factorization of PI/2 courtesy of Tor Myklebust */
    r = (z > 1.0) ? fma (0.93282184640716537, 1.6839188885261840, -p) : p;
    return copysign (r, x);
}
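
For completeness, a quick (unscientific) spot check of my_atan() against the C library's atan(), assuming the code above is compiled into the same program:

#include <stdio.h>
#include <math.h>

double my_atan (double x);   /* the implementation above */

int main (void)
{
    const double xs[] = { -10.0, -1.0, -0.5, 0.0, 0.25, 1.0, 3.0, 1e10 };
    for (int i = 0; i < (int)(sizeof xs / sizeof xs[0]); i++) {
        printf ("my_atan(%-6g) = %.17g   libm atan = %.17g\n",
                xs[i], my_atan (xs[i]), atan (xs[i]));
    }
    return 0;
}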
njuffa
  • Out of curiosity, are there any cases where using radians for trigonometric calculations allows much better precision than would be achievable using an integer number of subdivisions? Certainly, modulus reduction would be easier and more precise using angles measured in degrees, quadrants, or whole circles. I know why radians are useful in calculus, but having the number of angular units for a full circle not be precisely representable seems rather icky. – supercat Apr 16 '14 at 19:57
  • 1
    Some platforms offer `sinpi()` and `cospi()` functions which accept arguments that are multiples of pi, which makes argument reduction easy. Otherwise, accurate argument reduction for sin, cos, tan is hard and essentially requires multi-precision intermediate computation regardless whether radians or degrees are used. The canonical reference is: Mary H. Payne and Robert N. Hanek, Radian Reduction for Trigonometric Functions, ACM SIGNUM Newsletter, vol. 18, no. 1, Jan. 1983, pp. 19 - 24 – njuffa Apr 16 '14 at 22:06
  • 1
    The companion paper for degree argument reduction is: Mary H. Payne and Robert N. Hanek, Degree reduction for trigonometric functions, ACM SIGNUM Newsletter, vol. 18. no. 2, Apr. 1983, pp. 18 - 19 – njuffa Apr 16 '14 at 22:08
  • Why would multi-precision reduction be required in the degrees case? To be sure, it's easier in the multiple-of-pi case, but `fmod(x, 360.0)` is specified to be absolutely precise for all values of x, is it not? Incidentally, I'm not sure how useful hyper-precise argument reduction is when using radians; if one is trying to compute sin(2πx) using `Math.Sin(x*2.0*Math.Pi)`, the result would be more accurate if argument reduction is performed modulo `2.0*Math.Pi` than if it's performed modulo 2π. – supercat Apr 16 '14 at 22:51
  • I may have gotten confused about the degree reduction (in a bit of a hurry today). I have never had the need to implement it and consequently have not thought about it. What you say seems to make sense: if a native-precision IEEE remainder operation is available, that should be all that is needed for an accurate reduction. IMHO, the best solution for computation of terms like sin(2πx) is to offer a `sinpi()` function so programmers can write `sinpi(2*x)` and get a result that is as consistent with mathematical behavior as possible. Use of machine PI introduces phase error. – njuffa Apr 16 '14 at 23:13
  • The techniques I'm familiar with for computing trig functions entail starting out by converting an angle to a power-of-two fraction of a circle; do you know any techniques that don't? If not, do you have any idea why `sinpi` etc. functions shouldn't universally available? Having the programmer scale a value up by a factor of 2pi so the processor can scale it down by a factor of pi seems crazy. – supercat Apr 17 '14 at 14:32
  • We are getting way off topic (Stackoverflow isn't designed for discussions). Standards typically codify existing usage. IEEE-754 mentions `sinpi` etc as recommended functionality, some C/C++ toolchains offer it as an extension, and GPU programming environments like CUDA and OpenCL include it. So if programmers keep using and requesting it, I would expect it to be a standard library function a couple decades down the line. – njuffa Apr 17 '14 at 16:41
  • 1
    Disagree with "accurate argument reduction for sin, cos, tan is hard ... regardless ... or _degrees_ are used." `reduced_degrees = fmod(raw_degrees, 360.0)` is a direct, easy range reduction. Ref http://stackoverflow.com/questions/20928253/is-fmod-exact-when-y-is-an-integer – chux - Reinstate Monica Mar 02 '17 at 23:21
  • 1
    @chux I agree that trig function argument reduction by degree is easy. Unfortunately there is no way to correct a comment (other than during the grace period) when one mis-spoke. I would suggest `remquo (angle,90.0)` instead of `fmod()`, though. – njuffa Mar 03 '17 at 00:50
  • 1
    Agree that `remquo()` is even better - though I think that is a C99 add-on, I used `remquo()` successfully [here](http://stackoverflow.com/a/31525208/2410359) for an improved `sind()` – chux - Reinstate Monica Mar 03 '17 at 01:12
9

Trigonometric functions do have pretty ugly implementations that are hacky and do lots of bit fiddling. I think it will be pretty hard to find someone here who is able to explain an algorithm that is actually used.

Here is an atan2 implementation: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/ieee754/dbl-64/e_atan2.c;h=a287ca6656b210c77367eec3c46d72f18476d61d;hb=HEAD

Edit: Actually I found this one: http://www.netlib.org/fdlibm/e_atan2.c which is a lot easier to follow, but probably slower because of that (?).

The FPU does all of this in its own circuitry, so the CPU doesn't have to do all this work.

typ1232
  • Thanks a lot. The first link also includes mpatan.h and mpatan.c, where there is an implementation of atan – exactly what I was looking for. – Plamen Dragiyski Apr 13 '14 at 21:08
  • 1
    not all FPUs do this in hardware. There may be some architectures that don't have trigonometric instructions. SSE doesn't support trigonometric instructions either, so MSVC 2013 must implement a software version when vectorizing code – phuclv Apr 16 '14 at 02:56
  • 3
    The FPATAN instruction in x86 CPUs is typically implemented via microcode, that is, a small program stored in an internal ROM inside the processor. While such programs may use specialized operations not available in the visible ISA, there is usually no special circuitry involved. – njuffa Apr 16 '14 at 06:11
  • The [second implementation of `atan2`](http://www.netlib.org/fdlibm/e_atan2.c) is a lot shorter because it uses `atan`. – lrineau Apr 16 '14 at 10:44
8

Summary: It's hard. Also, Eric Postpischil and Stephen Canon, who sometimes hang around SO, are very good at it.

The usual approach for many special functions is as follows:

  • Handle NaNs, infinities, and signed zeros as special cases.
  • If the number is so large that the result rounds to pi/2 (M_PI_2), return M_PI_2. Call this threshold M.
  • If there's any sort of argument-reduction identity, use it to bring the argument into a nicer range. (This can be tricky: For sin and cos, this means you pick off a multiple of the exact value of 2pi so that you land in the correct range.)
  • Break up [0,M) into finitely many intervals. Use a Chebyshev approximation to arctan of fairly high order on each interval. (This is done offline and it's usually the source of all the magic numbers you see in these implementations. Also, one can tighten up the Chebyshev approximation slightly using Remez's exchange algorithm, but I'm not aware of any cases where this helps a lot.)
  • Figure out which interval the argument is in (using ifs and stuff or just a trick with table indexing), and evaluate the Chebyshev series on that interval.

A few properties are particularly desirable here:

  • The arctan implementation should be monotonic; that is, if x < y, then arctan(x) <= arctan(y).
  • The arctan implementation should always return an answer within 1 ulp of the right answer. Note that this is a relative error bound.

It is not completely straightforward to evaluate a Chebyshev series so that these two properties hold. Tricks where two doubles are used to represent different parts of a single value are common here. Then there's probably some casework to show that the implementation is monotonic. Also, near zero, use a Taylor approximation to arctan instead of a Chebyshev approximation---you're after a relative error bound there, and evaluating the series using Horner's rule ought to work.
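
For instance, a common form of the two-double trick is storing pi/2 as a head plus a tail (the constants below are the standard fdlibm split of pi/2), so that subtracting a small quantity from pi/2 loses less accuracy; a minimal sketch:

static const double pio2_hi = 0x1.921fb54442d18p+0;   /* leading bits of pi/2 */
static const double pio2_lo = 0x1.1a62633145c07p-54;  /* the next 53 bits     */

/* computes pi/2 - y with the low-order correction folded in */
double pio2_minus (double y)
{
    return pio2_hi - (y - pio2_lo);
}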

If you're looking for an atan implementation to read, fdlibm's seems less nasty than the one currently in glibc. The argument reduction appears to be based on the trig identity tan(a+b) = (tan(a) + tan(b)) / (1 - tan(a) tan(b)), using 0.5, 1, or 1.5 for tan(a) as appropriate.
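
Here is a sketch of that reduction in C (my paraphrase of the scheme, not the fdlibm code; the breakpoints follow fdlibm's 7/16, 11/16, 19/16, 39/16, the final polynomial is a deliberately crude Taylor stub, and M_PI_2 comes from POSIX math.h):

#include <math.h>

double atan_reduced (double x)   /* assumes x >= 0 */
{
    static const double atan_c[4] = {
        0.0,                  /* i = 0: no reduction  */
        0.46364760900080611,  /* arctan(0.5)          */
        0.78539816339744831,  /* arctan(1.0) = pi/4   */
        0.98279372324732907   /* arctan(1.5)          */
    };
    double c, t, t2;
    int i;
    if (isnan (x)) return x;                      /* other special cases omitted */
    if      (x < 7.0 / 16.0)  { i = 0; c = 0.0; }
    else if (x < 11.0 / 16.0) { i = 1; c = 0.5; }
    else if (x < 19.0 / 16.0) { i = 2; c = 1.0; }
    else if (x < 39.0 / 16.0) { i = 3; c = 1.5; }
    else return M_PI_2 - atan_reduced (1.0 / x);  /* 1/x < 16/39 < 7/16 */
    t = (i == 0) ? x : (x - c) / (1.0 + x * c);   /* tan(a+b) identity  */
    t2 = t * t;
    /* crude low-order Taylor stub standing in for the real polynomial */
    return atan_c[i] + t * (1.0 - t2 * (1.0 / 3.0 - t2 * (1.0 / 5.0)));
}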

tmyklebu
  • 2
    Since we are on the subject, and I should perhaps ask this in another question, a good reason to use a Padé approximant instead of a polynomial one is when the function to approximate, such as arctangent, tends towards a finite limit in +/-inf. Obviously, a polynomial approximation of degree more than 1 is never going to be any good there. Now the question I have is, since we are doing argument reduction anyway and the approximation is only ever used on, say [0 … 0.5], then the above reason (the only one I have ever heard given) should not matter so much, should it? – Pascal Cuoq Apr 14 '14 at 06:30
  • 1
    @PascalCuoq: I'd expect a Chebyshev approximation of degree k and a Pade-Chebyshev approximation of total degree (numerator degree + denominator degree) k to be roughly equally good at approximating a well-behaved function on a compact interval. In the absence of such an argument-reduction scheme, I'd guess you'd need to get the difference of the degrees right. (I've only ever had to write low-quality implementations of special functions, so there might be subtler reasons to use a rational approximation instead of a polynomial approximation in some cases---I don't know.) – tmyklebu Apr 14 '14 at 20:31
  • 3
    Rational approximations are rarely competitive. Floating-point division is much more expensive than FADD, FMUL, or FMA. Also, you have to deal with error from two polynomials plus the error from the division. In most cases you would want either straight polynomials, or table plus polynomial. In terms of polynomials, you would want coefficients optimized for the target precision, e.g. approximations provided by Sollya's `fpminimax()` function. If FMA is available, it will help keep the evaluation error small. Estrin's scheme can help with performance on superscalar architectures. – njuffa Apr 14 '14 at 22:10