5

I understand that it is not possible to represent all numbers to arbitrary precision with a finite number of bits, and that naive comparison of floating-point numbers is inadvisable. But I would expect that if I was adding many numbers together, the ** order ** in which I add them does not matter.

To test this prediction, I create a vector of random numbers and calculate their sum, then sort the vector and calculate the sum again. Very often, the two sums don't match! Is this a problem with my code (included below), a shortcoming of floating-point arithmetic in general, or an issue that might be resolved by switching compilers, etc.?

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <random>
#include <vector>

double check_sum_depends_on_order(int seed)
{
    // fill a vector with random numbers
    std::vector<long double> v;
    std::uniform_real_distribution<long double> unif(-1.,1.);
    std::mt19937 rng(seed);
    for (size_t i = 0; i < 1000; ++i)
    {
        v.push_back(unif(rng));
    }

    // copy this vector and then shuffle it
    std::vector<long double> v2 = v;
    std::sort(v2.begin(), v2.end());


    // tot is running total for vector v, unsorted
    // tot2 is running total for vector v2, sorted
    long double tot = 0.0, tot2 = 0.0;
    for (size_t i = 0; i < v.size(); ++i)
    {
        tot += v[i];
        tot2 += v2[i];
    }

    // display result
    // you can comment this if you do not want verbose output
    printf("v tot\t= %.64Lf\n", tot);
    printf("v2 tot\t= %.64Lf\n", tot2);
    printf("Do the sums match (0/1)? %d\n\n", tot==tot2);

    // return 1.0 if the sums match, and 0.0 if they do not match
    return double(tot==tot2);
}

int main()
{
    // number of trials
    size_t N = 1000;

    // running total of number of matches
    double match = 0.;
    for (size_t i = 0; i < N; ++i)
    {
        // seed for random number generation
        int seed = time(NULL)*i;
        match += check_sum_depends_on_order(seed);
    }

    printf("%f percent of random samples have matching sums after sorting.", match/double(N)*100.);
    return 0;
}
GnomeSort
  • 275
  • 3
  • 12
  • "But I would expect that if I was adding many numbers together, the ** order ** in which I add them does not matter." - I don't see why you would think that. –  Feb 23 '18 at 23:56
  • 1
    As accuracy is *potentially* lost, every time you add two numbers together I would not expect these to be exactly the same value. Accuracy is lost in differing amounts - depending on the values being added. Different order would imply different losses in accuracy in different places along the way. – Galik Feb 23 '18 at 23:56
  • Here is a simpler example: https://ideone.com/J9SSdT – Robᵩ Feb 23 '18 at 23:58
  • *But I would expect that if I was adding many numbers together, the order in which I add them does not matter.* -- Now you know why it is stated time and again not to compare floating point values for equality. There are other scenarios other than addition that will surprise you when you see the values are not equal. – PaulMcKenzie Feb 24 '18 at 00:00
  • 1
    I don't think that's a duplicate but it does provide one *potential* reason why this doesn't work the way OP expected. – Galik Feb 24 '18 at 00:03
  • Sigh. This was marked as a duplicate of a question that asked if floating-point addition is commutative. But this question asks about associativity. So I reopened it. Folks, if you cannot tell the difference, please do not vote on floating-point questions. I reviewed a few potential originals, and none of them answer the question well, but I am open to candidates for originals. – Eric Postpischil Feb 24 '18 at 00:40
  • somehow related to https://stackoverflow.com/questions/47296419/result-of-the-sum-of-random-ordered-ieee-754-double-precision-floats/47346807#47346807 if not duplicate – aka.nice Feb 25 '18 at 09:41

1 Answers1

19

Suppose you have a decimal floating-point type with three digits of precision. Not very realistic, but it makes for a simpler example.

Suppose you have three variables, a, b and c. Suppose a is 1000, b and c are both 14.

a + b would be 1014, rounded down to 1010. (a + b) + c would then be 1024, rounded down to 1020.

b + c would be 28. a + (b + c) would then be 1028, rounded up to 1030.