I am comparing numbers from 2 dictionaries (total comparisons ~ 1M). Here is a code snippet:

for i in dict1:
    val1 = dict1[i]
    val2 = dict2[i]

    # Percentage difference relative to val1 (skipped when val1 is zero)
    if val1 != 0.0:
        perctg_diff = (val1 - val2) / val1 * 100
        if perctg_diff > 3.0:
            dict3[i] = (val1, val2, perctg_diff)
    # Percentage difference relative to val2 (skipped when val2 is zero)
    if val2 != 0.0:
        perctg_diff = (val2 - val1) / val2 * 100
        if perctg_diff > 3.0:
            dict3[i] = (val1, val2, perctg_diff)

I compute the percentage difference and store the pair in dict3 when it exceeds 3%. After running the script, I found that some of the entries in dict3 are:

(1052712, (2.88541545330242e-33, 2.3194405728563e-27, 99.9998755986471))
(1052713, (8.1367737331018e-34, 7.83224080670401e-31, 99.8961118033279))
(1052715, (1.79168848952333e-33, 6.71766997709614e-31, 99.733287211841))
(1052717, (1.03397638198887e-25, 4.49948480152819e-26, 56.4836791255002))
(1400879, (0.0, 1.39114642689358e-36, 100.0))
(1290291, (0.0, 1.89369462623834e-20, 100.0))

What is an effective and efficient way to get rid of the numerical roundoff and skip the comparison when the numbers are this small?

(Using python 2.7 with numpy)

bn4365
  • I don't understand how this is related to numpy? – roganjosh Jun 01 '18 at 17:24
  • Maybe [`round`](https://docs.python.org/3/library/functions.html#round) the numbers? – zvone Jun 01 '18 at 17:25
  • I don't fully understand the question, but maybe the `decimal` module in Python or `numpy.nextafter()` can help. – Nandish Patel Jun 01 '18 at 17:29
  • I found this, maybe it can help you: https://stackoverflow.com/questions/10555659/python-arithmetic-with-small-numbers – Nandish Patel Jun 01 '18 at 17:30
  • What do you mean by "small"? These percentage differences are correct, there is no numerical roundoff. (The difference in the exponents is 2 or more, so it is more than 100 times smaller, so the percentage difference is greater than 99%.) – Artyer Jun 01 '18 at 17:36

1 Answer

`numpy.isclose` is close (pun intended) to what you want. It evaluates the formula:

absolute(a - b) <= (atol + rtol * absolute(b))

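So you could use it to filter your data, with atol set to the smallest absolute difference you still want to count as a difference and rtol set to 0.03 (3%). Note that the relative part of the test is measured against b, so the check is not symmetric in a and b.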
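As a rough sketch of how that filter might look for two dictionaries: the helper name `flag_differences`, the default `atol=1e-12`, and the sample values for keys 7 and 8 are assumptions made for illustration, so pick an `atol` that matches the scale of your own data. The reported percentage is taken relative to the larger magnitude of the pair, which is one possible choice and not the only one.

```python
import numpy as np


def flag_differences(dict1, dict2, rtol=0.03, atol=1e-12):
    """Return {key: (val1, val2, pct_diff)} for pairs outside the tolerances.

    Hypothetical helper; atol=1e-12 is an assumed threshold, not a recommendation.
    """
    keys = [k for k in dict1 if k in dict2]
    a = np.array([dict1[k] for k in keys], dtype=float)
    b = np.array([dict2[k] for k in keys], dtype=float)

    # np.isclose tests |a - b| <= atol + rtol * |b| element-wise, so pairs
    # whose absolute difference is below atol are treated as equal.
    differs = ~np.isclose(a, b, rtol=rtol, atol=atol)

    flagged = {}
    for idx in np.flatnonzero(differs):
        v1, v2 = float(a[idx]), float(b[idx])
        # The larger magnitude is nonzero whenever the pair was flagged.
        pct = abs(v1 - v2) / max(abs(v1), abs(v2)) * 100
        flagged[keys[idx]] = (v1, v2, pct)
    return flagged


# Three pairs taken from the question plus two made-up pairs (keys 7 and 8).
dict1 = {1052712: 2.88541545330242e-33, 1052717: 1.03397638198887e-25,
         1290291: 0.0, 7: 100.0, 8: 100.0}
dict2 = {1052712: 2.3194405728563e-27, 1052717: 4.49948480152819e-26,
         1290291: 1.89369462623834e-20, 7: 104.0, 8: 101.0}

flagged = flag_differences(dict1, dict2)
print(flagged)
```

With these inputs only key 7 should be flagged: the tiny pairs fall under `atol`, and the pair for key 8 differs by less than 3% of `b`. Doing the comparison on arrays in one call also avoids a Python-level branch per key, which may matter at around 1M comparisons.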

Paul Panzer