0

I'm running a numerical simulation and get divergent results on different platforms because of machine precision issues. As a simple example (that I don't think actually fails, but could):

import numpy as np
np.random.seed(seed=42)
vals = np.random.rand(int(1e5))
threshold = 0.5
good_vals = np.where(vals > threshold)

Even though I've seeded the random number generator, there can be values very near the threshold that might end up evaluated above the threshold on one system and not another because of differing machine precisions. Is there a standard way to deal with this?

I.P. Freeley
  • 885
  • 1
  • 9
  • 19
  • Use explicit dtypes instead if relying on system-dependent defaults? – juanpa.arrivillaga Oct 03 '19 at 20:50
  • It sets to float64 on both systems, so I think I would need to define a new float-like dtype to force identical calculations. – I.P. Freeley Oct 03 '19 at 20:58
  • Sorry, what do you mean? I don't see how your premise leads to your conclusion. – juanpa.arrivillaga Oct 03 '19 at 20:58
  • They are already the same dtype. I don't think explicitly setting the dtype will change the behavior? Unless there's a dtype that I don't know about that forces the same precision cross-platform. – I.P. Freeley Oct 03 '19 at 21:04
  • Are you sure it is just numerical precision? random is not guaranteed to be reproducible (see https://stackoverflow.com/questions/8786084/reproducibility-of-python-pseudo-random-numbers-across-systems-and-versions) By the way what are your two platforms ? Linux/win/mac? Do you also change python version? – Demi-Lune Oct 04 '19 at 06:37
  • It's at least partially machine precision. The sims generate ~1 Gb of output, and there are differences at the 1e-13 level in lots of columns. I've been using my posted solution to keep things from diverging for longer and longer, but simulations still eventually diverge. Looks like I've got python 3.6.2 on my mac and 3.7.3 on linux. I think I've also got different versions of astropy, so I'll try to make sure I've got all versions matching next. – I.P. Freeley Oct 04 '19 at 16:52

1 Answers1

0

One solution is to force the comparison to be done on rounded integers. I can make a new class

class int_rounded(object):
    def __init__(self, inval, scale=1e5):
        self.initial = inval
        self.value = np.round(inval * scale).astype(int)
        self.scale = scale
    def __gt__(self, other):
        return self.value > other.value
    def __lt__(self, other):
        return self.value < other.value

Then I can do the comparison

good_vals = np.where(int_rounded(vals) > int_rounded(threshold))

And this seems to force results to be identical cross-platform, but I suspect there are some failure modes I haven't hit.

I.P. Freeley
  • 885
  • 1
  • 9
  • 19
  • One obvious problem is that if one is comparing floats that are much larger than the maximum size of ints. So there should probably be some more checking, but since this works and no one has come up with anything better I'm accepting it. – I.P. Freeley Oct 09 '19 at 18:29