
I would like to transfer a double-precision approximation of a given function into a single-precision C implementation (the target device provides a single-precision ALU only).

It's not too complicated to generate a high-precision (e.g., max error 1e-13) approximation using double precision. I have used Maple's `minimax` function, and I have also found some implementations that use double precision as examples.

But as soon as it comes to transferring this approximation to a single-precision method, I face a loss of accuracy when I simply convert the coefficients to float. I'm aiming for a single-precision approximation that is accurate to about +/-5 ulp; simply converting the coefficients to float doesn't seem to do the job. I have already learned to split up constants like pi/2 into a rounded part and an error part (see the sketch below), and I suspect there is a similar trick for transferring the coefficients (the core computation of these approximations is usually a polynomial, which I want to focus on in this question) that I don't know yet.
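
For clarity, this is the kind of constant splitting I mean, as a minimal sketch (the two values are pi/2 rounded to float and the rounded residual; I understand production code often chooses a high part with trailing zero bits so the multiply by `k` is exact):

```c
#define PIO2_HI  1.57079637e+0f   /* pi/2 rounded to nearest float */
#define PIO2_LO -4.37113883e-8f   /* (pi/2 - PIO2_HI) rounded to float */

/* subtract k*pi/2 in two steps so the large part cancels first and the
   residual restores some of the lost bits */
static float reduce(float x, int k)
{
    return (x - (float)k * PIO2_HI) - (float)k * PIO2_LO;
}
```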

I am grateful for any hint or paper concerning this transfer towards an implementation. I have already studied some papers about floating-point precision, but over the last two weeks I have not made much progress.

Thanks in advance!

Dexter S
  • I don't know about a loss of *accuracy*, but surely you do face a loss of (relative) *precision* if you change to a lower-precision representation. If you want the precision of `double`, then why in the world would you ever consider going anywhere near `float`? There is pretty much nothing but storage size to recommend the latter over the former, and storage-size reduction necessarily involves range and/or precision reduction. – John Bollinger Sep 18 '20 at 14:30
  • A couple of links for you: [Error Propagation](https://floating-point-gui.de/errors/propagation/) (from [The Floating-Point Guide](https://floating-point-gui.de/)), and [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) (very technical). – Ian Abbott Sep 18 '20 at 14:40
  • Can you quantify your "giant loss of accuracy"? Is it 1E-6, or 1E-3, or what? Since a `float` only supports about 7 digits accuracy, expecting much better than 1E-6 is probably not sensible (but the difference between 1E-13 and 1E-6 is "giant"). The double-precision calculations may do operations to ensure accuracy that are counter-productive in single-precision. It's a complex topic. – Jonathan Leffler Sep 18 '20 at 14:55
  • @IanAbbott thanks for the links, I actually already studied two of them. @JohnBollinger, @JonathanLeffler, I'll edit my post to explain my goal more precisely. – Dexter S Sep 18 '20 at 15:16
  • Are you sure that the using single-precision will provide enough benefit to warrant the effort you've taken so far? Would wrapping the double-precision code in a cast (or a function that does that) not provide you with what you need? If this is in the inner loop of an emulation/simulation that will run for years and using single-precision might save you time on each iteration, maybe it is worth the effort. But I'm a little dubious that the benefit outweighs the cost. – Jonathan Leffler Sep 18 '20 at 15:22
  • @JonathanLeffler: I think that it's possible due to another [post](https://stackoverflow.com/questions/63918873/approximating-cosine-on-0-pi-using-only-single-precision-floating-point), where I asked for an approximation of cosine in single precision. Trying to cast double-precision coefficients did not work very well. But the marked answer by njuffa provided an approximation with the accuracy I want. Instead of asking for each specific function, I want to understand how he did that... – Dexter S Sep 18 '20 at 15:45
  • @DexterS, it is possible to compute approximations to transcendental functions such as the cosine to arbitrary precision, if you write appropriate code and you're willing to wait long enough. But you cannot represent arbitrary-precision results in a `float` (nor in a `double`). For IEEE-754 single-precision binary format, 5 ulps is a relative error of about 3E-7. The best you could conceivably do with that format is 0.5 ulp, for 3E-8. – John Bollinger Sep 18 '20 at 16:12
  • I emphasize that that's *relative* error. For function values close enough to zero, such relative errors can correspond to *absolute* errors in the range you describe, but I'm not sure that would actually achieve anything useful for you, and it's unclear to me whether that's what you're after, anyway. – John Bollinger Sep 18 '20 at 16:18
  • @DexterS You mention Maple but have you tried the free [Sollya tool](http://sollya.gforge.inria.fr/), specifically its `fpminimax` command? When mathematically correct coefficients for a minimax approximation are rounded to a finite precision floating-point format, the minimax property of an approximation can be destroyed. This is a well-known issue that `fpminimax` tries to address. – njuffa Sep 18 '20 at 19:20
  • @njuffa so your trick to calculate the coefficients was simply to use Sollya's `fpminimax` with finite precision? – Dexter S Sep 18 '20 at 19:59
  • @DexterS No. I use my own tools. But I know from past experience that Sollya's `fpminimax` often gives very good results. – njuffa Sep 18 '20 at 20:38
  • Show your code, or at least pseudo-code? How are you calculating the polynomial? What accuracy/error are you getting? – Eric Postpischil Sep 19 '20 at 00:04
  • @JohnBollinger: Implementations of sine/cosine do not suffer absolute error problems after a correct reduction is done, because, for x near 0, sin x is near x, which is quite easy to approximate. – Eric Postpischil Sep 19 '20 at 00:05
  • I didn't say otherwise, @EricPostpischil. I said that *`float` (or `double`) representations* approximating the results of such functions have a minimum absolute error that is itself a function of the true value. That minimum is nonzero where the true result is not exactly zero, and, for `float`, it is larger than the OP seems to want for a lot of target values. – John Bollinger Sep 19 '20 at 15:31
  • @JohnBollinger: Your meaning is not clear. Please give an example. – Eric Postpischil Sep 19 '20 at 17:25

1 Answer


A common way to generate polynomial minimax approximations is to use the Remez exchange algorithm, published by the Russian mathematician Evgeny Remez in 1934. This is a numerical procedure that often involves ill-conditioned systems of equations. As a consequence, it is usually implemented with the help of an arbitrary-precision library. For example, in the implementation of the Remez algorithm that I use, I configured the library for 1024-bit precision.

For reasonably well-behaved functions, variants of the Remez algorithm can find approximations very close to the mathematical minimax polynomial. The problem, as noted in the question, is what happens when one moves the generated coefficients of the polynomial into a finite-precision floating-point computation. One often finds the minimax property of the approximation impaired, sometimes significantly so. There are two sources of error at play. First, the generated coefficients cannot be represented exactly in the finite-precision floating-point format. Second, the evaluation of the polynomial uses finite-precision operations instead of mathematical operations with infinite precision.
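
The first source is easy to observe directly. A tiny demo (the coefficient value is an arbitrary example, not taken from any particular approximation):

```c
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* arbitrary example coefficient; any value not exactly representable
       in binary32 will do */
    double c  = 0.0416666666666666644;      /* ~ 1/24 */
    float  cf = (float)c;                   /* round to nearest float */
    printf("relative error from rounding: %.3e (bound 2^-24 ~ 5.96e-8)\n",
           fabs((double)cf - c) / fabs(c));
    return 0;
}
```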

The first problem is the easier one to address. As one can see from some quick experiments, simply rounding the coefficients to the finite-precision format doesn't accomplish the desired near-minimax result. By using a finite-precision format, we basically move from an N-dimensional continuous space to an N-dimensional discrete lattice, and to do this properly, we need to find the closest lattice points. This is a solvable but hard problem, which is usually made tractable through the use of heuristics (a toy neighborhood search along these lines is sketched after the references below). Relevant literature:

N. Brisebarre, J.-M. Muller, and A. Tisserand, "Computing machine-efficient polynomial approximations", ACM Transactions on Mathematical Software, Vol. 32, No. 2, June 2006, pp. 236-256 (online)

Nicolas Brisebarre and Sylvain Chevillard, "Efficient polynomial L∞-approximations", In 18th IEEE Symposium on Computer Arithmetic, June 2007, pp. 169-176 (online)

Florent de Dinechin and Christoph Lauter, "Optimizing polynomials for floating-point implementation", ArXiv preprint 2008 (online)
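
To make the lattice idea concrete, here is a toy version of such a neighborhood search: for each coefficient it tries the correctly rounded float and its two float neighbors, and keeps the combination with the smallest maximum error on a sample grid. The coefficients are arbitrary demo values, not a real minimax result, and a real tool would search a much larger neighborhood against a high-precision reference:

```c
#include <stdio.h>
#include <math.h>

#define N       3        /* degree-2 polynomial, 3 coefficients */
#define SAMPLES 10000

static const double c_dbl[N] = { 0.1, -0.3, 0.7 };  /* demo values only */

static float eval(const float c[N], float x)   /* Horner, single precision */
{
    float r = c[N - 1];
    for (int i = N - 2; i >= 0; i--) r = r * x + c[i];
    return r;
}

static double ref(double x)                    /* reference, double precision */
{
    double r = c_dbl[N - 1];
    for (int i = N - 2; i >= 0; i--) r = r * x + c_dbl[i];
    return r;
}

static double max_abs_err(const float c[N])    /* max error on grid in [0,1] */
{
    double worst = 0.0;
    for (int i = 0; i <= SAMPLES; i++) {
        double x = (double)i / SAMPLES;
        double e = fabs((double)eval(c, (float)x) - ref(x));
        if (e > worst) worst = e;
    }
    return worst;
}

int main(void)
{
    float best[N], cand[N];
    for (int i = 0; i < N; i++) best[i] = (float)c_dbl[i];
    double best_err = max_abs_err(best);

    /* exhaustive search over the 3^N = 27 neighbor combinations */
    for (int mask = 0; mask < 27; mask++) {
        int m = mask;
        for (int i = 0; i < N; i++, m /= 3) {
            float r = (float)c_dbl[i];
            cand[i] = (m % 3 == 0) ? r
                    : (m % 3 == 1) ? nextafterf(r, -INFINITY)
                                   : nextafterf(r,  INFINITY);
        }
        double e = max_abs_err(cand);
        if (e < best_err) {
            best_err = e;
            for (int i = 0; i < N; i++) best[i] = cand[i];
        }
    }
    printf("best max abs error on [0,1]: %.3e\n", best_err);
    return 0;
}
```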

The Sollya tool uses these techniques from the literature for its `fpminimax` command. It is worth checking out in addition to Maple's and Mathematica's facilities for generating minimax polynomial approximations, as in my experience it often produces superior approximations.

The second problem, how to account for evaluation with finite-precision floating-point computation and how to adjust the coefficients of a polynomial approximation accordingly, is still a subject of research (a minimal evaluation sketch follows the references). Some initial results:

Tor Myklebust, "Computing accurate Horner form approximations to special functions in finite precision arithmetic", ArXiv manuscript 2015 (online)

Denis Arzelier, Florent Bréhard, Mioara Joldes, "Exchange algorithm for evaluation and approximation error-optimized polynomials", In 26th IEEE Symposium on Computer Arithmetic, June 2019, pp. 30-37 (online)

Note that the first publication came about due to a question I asked on Stack Overflow.
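
On the evaluation side, one common low-cost measure (not a substitute for the coefficient adjustment these papers analyze) is to evaluate the polynomial with fused multiply-adds, so each Horner step incurs a single rounding instead of two. A minimal sketch with placeholder coefficients:

```c
#include <math.h>

/* Horner evaluation of c3*x^3 + c2*x^2 + c1*x + c0 using fmaf(); on
   hardware with an FMA unit this maps to one instruction per step.
   The coefficients are placeholders for whatever the
   coefficient-optimization step produced. */
static float poly_eval(float x)
{
    const float c3 = -1.984126983e-4f;  /* placeholder values */
    const float c2 =  8.333333333e-3f;
    const float c1 = -1.666666667e-1f;
    const float c0 =  1.0f;
    float r = c3;
    r = fmaf(r, x, c2);   /* r = r*x + c2, rounded once */
    r = fmaf(r, x, c1);
    r = fmaf(r, x, c0);
    return r;
}
```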

For my own use, I employ a heuristic search to find approximations optimized for both the representational error in the coefficients and the evaluation error in the polynomial evaluation. It can be loosely described as a form of simulated annealing. I have also looked into the use of genetic programming, but the preliminary results did not look promising, so I stopped pursuing that approach.

njuffa
  • Thanks for your answer and your detailed description. I will definitely check these papers within the next days. I assume that "your own" tools aren't published or freely available... I will test Sollya's `fpminimax` to generate an approximation and evaluate the results. – Dexter S Sep 18 '20 at 21:13
  • @DexterS My own closed-source proprietary code for generating polynomial and rational approximations has been evolving in many small steps since the mid 1980s. The software quality is horrible from a modern software engineering perspective and so is the "user interface". The results are often quite competitive, though :-) – njuffa Sep 18 '20 at 21:24
  • So if you were in my situation and did not have your own tool: what would you recommend for getting fairly accurate approximations of several functions? Would you use Sollya? – Dexter S Sep 20 '20 at 14:48
  • @DexterS I don't know your situation. In my experience, Sollya provides an excellent starting point if one needs to create polynomial approximations. If you anticipate working on this subject matter for years to come, you may ultimately want to create your own software. – njuffa Sep 20 '20 at 17:30
  • I actually didn't manage to install and use Sollya. I am not experienced with compilers and installing packages, and there is practically no detailed manual for the installation. I might ask a new question aiming to approximate the sine function... Thank you for your help! – Dexter S Sep 22 '20 at 16:44