In general, in a sequence of floating-point operations that also involves contracting operations such as square root or cube root, it is advantageous from an accuracy perspective to perform the contracting operations last. For example, sqrt(1.0/x)
is more accurate than 1.0/sqrt(x)
, sqrt(a*b)
is more accurate than sqrt(a)*sqrt(b)
, and cbrt(a*b*c)
is more accurate than cbrt(a)*cbrt(b)*cbrt(c)
.
As a consequence, unless there is a danger of overflowing or underflowing the chosen floating-point format, such as IEEE-754 binary64
(e.g. double
in C/C++), in intermediate computation, method [2] should be chosen. Additional aspect relevant to accuracy: if n-th root is computed by exponentiation, such as pow()
in C/C++, additional error will be introduced with every computed root, as explained in case of cube root in my answer to this question. Finally, the computation of the n-th root will be slower than a multiplication, so doing only multiplies with a final root computation at the end will also be a superior approach performancewise.
Very accurate results can be achieved with method [2] by using a compensated product (akin to the compensated addition provided by Kahan summation). See the following paper for details:
Stef Graillat, "Accurate Floating-Point Product and Exponentiation", IEEE Transactions on Computers, Vol. 58, No. 7, July 2009, pp. 994-1000 (online)
This compensated product can be computed particularly efficient on systems that provide the FMA (fused multiply-add) operation in hardware. This is the case for all the common modern processor architectures, both CPUs and GPUs. C/C++ provide convenient access to this via the standard math functions fma()
, fmaf()
.
Update: Asker clarified in comment that the risk of underflow is imminent since there are on the order of 108 factors in [10-6, 10-1]. One possible workaround mentioned by @Yves Daoust in a comment is to separate the factors into mantissa and exponent and accumulate them separately. Whether this is practical will depend on the floating-point environment. While C and C++ provide the standard function frexp()
for performing this splitting, this function may not be very fast.