I'm adding (summing) an array of floats in Perl, and I was trying to speed it up. When I tried, I started getting weird results.

#!/usr/bin/perl

my $total = 0;
my $sum = 0;

# Compute $sum (adds from index 0 forward)
my @y = @{$$self{"closing"}}[-$periods..-1];
my @x = map {${$_}{$what}} @y;
# map { $sum += $_ } @x;
$sum += $_ for @x;

# Compute $total (adds from index -1 backward)

for (my $i = -1; $i >= -$periods; $i--) {
    $total += ${${$$self{"closing"}}[$i]}{$what};
}

if($total != $sum) {
    printf("SMA($what, $periods) total ($total) != sum ($sum) %g\n",
       ($total - $sum));
}

# Example output:
#    SMA(close, 20) total (941.03) != sum (941.03) -2.27374e-13

I seem to get different answers when I compute $sum and $total.

The only thing I can think of is that one method adds forward through the array, and the other backward.

Would this cause them to overflow differently? I would expect so, but it never occurred to me that I would get different answers. Notice that the difference is small (-2.27374e-13).

Is this what's going on, or is my code busted?

This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi

Erik Bennett
  • `map { $sum += $_ } @x` is better written as `$sum += $_ for @x`. You shouldn't use `map` as a loop, it's there to transform one list into another. – Dave Cross Jun 19 '19 at 13:46
  • Fixed. It didn't change the math. – Erik Bennett Jun 19 '19 at 13:55
  • Floating-point arithmetic is not associative. Among other aspects, if you accumulate large values first, there is not enough precision to add small values accurately. But if you add small values first, they can be summed accurately, with large values added later. – Eric Postpischil Jun 19 '19 at 13:58
  • "Floating-point arithmetic is not associative." That was the answer I was looking for. It makes sense the way you've described it. In this particular case, the floats come in on a time interval, but I'll remember about the small values first in the future. – Erik Bennett Jun 19 '19 at 14:01
  • @ErikBennett: I didn't claim that it would fix your problem. It was just a bit of "best practices" advice. – Dave Cross Jun 19 '19 at 14:08
  • @DaveCross And I appreciate it! It's what I meant to do. The `map` thing was sloppy. – Erik Bennett Jun 19 '19 at 14:10
  • Very close question: https://stackoverflow.com/q/33974176/1289675 – Netch Jun 28 '19 at 15:16

1 Answer

As Eric mentioned in the comments, floating-point arithmetic is not associative, so the order in which you perform the operations affects the result.
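
A minimal sketch of the forward-versus-backward effect from the question, assuming Perl's numbers are IEEE-754 doubles (the default on x86_64 builds). The helper names `sum_forward` and `sum_backward` and the sample values are made up for illustration and are not from the question:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Add from index 0 forward, like the $sum loop in the question.
sub sum_forward {
    my @values = @_;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum;
}

# Add from index -1 backward, like the $total loop in the question.
sub sum_backward {
    my @values = @_;
    my $count  = @values;
    my $total  = 0;
    for (my $i = -1; $i >= -$count; $i--) {
        $total += $values[$i];
    }
    return $total;
}

# Hypothetical sample data, chosen only because it shows the effect.
my @values = (0.1, 0.2, 0.3);

printf "forward    = %.17g\n", sum_forward(@values);
printf "backward   = %.17g\n", sum_backward(@values);
printf "difference = %g\n",    sum_forward(@values) - sum_backward(@values);
```

Both loops add the same three numbers, but each intermediate result is rounded to the nearest double, and the two orders round at different points.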

While "add smaller values first" is good advice, it is important to emphasize that you can have differences even with just regular "small" values. Here's one example:

  x  =  1.004028
  y  = 3.0039678
  z  =  4.000855

If these are taken to be IEEE-754 single-precision floats (i.e., 32-bit binary format), then we get:

  x + (y+z) = 8.008851
  (x+y) + z = 8.00885

The infinitely precise result is 8.0088508, so neither is very good! The error is not insignificant for scientific computations, and it accumulates as more operations are performed.
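
Perl has no native single-precision type, so the following is only a sketch of how the example above might be reproduced: it rounds every operand and every intermediate result to single precision with `pack`/`unpack`. It assumes the `f` pack format is IEEE-754 binary32 on your platform; `to_single` and `add_single` are hypothetical helper names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Round a Perl number to the nearest native single-precision float.
# Assumption: pack's 'f' format is IEEE-754 binary32 on this platform.
sub to_single {
    my ($value) = @_;
    my ($single) = unpack('f', pack('f', $value));
    return $single;
}

# Add two numbers as if both operands and the result were single precision.
sub add_single {
    my ($lhs, $rhs) = @_;
    return to_single(to_single($lhs) + to_single($rhs));
}

my ($x, $y, $z) = (1.004028, 3.0039678, 4.000855);

my $right_first = add_single($x, add_single($y, $z));   # x + (y+z)
my $left_first  = add_single(add_single($x, $y), $z);   # (x+y) + z

printf "x + (y+z) = %.7g\n", $right_first;
printf "(x+y) + z = %.7g\n", $left_first;
```

The two groupings differ only in where the intermediate rounding happens, which is enough to change the last digit.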

This is a rich field with many numerical algorithms for preserving precision. Which one you pick depends on your problem domain, your particular needs, and the resources you have available, but one of the best-known is Kahan's summation algorithm; see https://en.wikipedia.org/wiki/Kahan_summation_algorithm. You can easily adapt it to your problem for (hopefully) better results.
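
As a rough illustration, here is one way Kahan summation might look in Perl. This is a sketch following the textbook form of the algorithm, not a tuned implementation; `kahan_sum` and `naive_sum` are hypothetical names, and the sample data is made up to show the effect.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compensated (Kahan) summation: carry a running estimate of the
# low-order bits lost by each addition and feed it back into the next one.
sub kahan_sum {
    my @values       = @_;
    my $sum          = 0.0;
    my $compensation = 0.0;
    for my $value (@values) {
        my $y = $value - $compensation;     # apply the correction so far
        my $t = $sum + $y;                  # low-order bits of $y may be lost here
        $compensation = ($t - $sum) - $y;   # recover (negated) what was lost
        $sum = $t;
    }
    return $sum;
}

# Plain left-to-right sum, for comparison.
sub naive_sum {
    my $sum = 0.0;
    $sum += $_ for @_;
    return $sum;
}

# Hypothetical data: two addends too small to register against 1.0 one at a time.
my @values = (1.0, 1e-16, 1e-16);

printf "naive = %.17g\n", naive_sum(@values);
printf "kahan = %.17g\n", kahan_sum(@values);
```

In the question's code, the list passed to `kahan_sum` would be the same `@x` that feeds the `$sum` loop. Kahan summation reduces the accumulated error but does not guarantee bit-identical results for every ordering, so comparing two sums with `!=` can still report a difference.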

alias