I am writing a Python 2 module that emulates a certain library. The results may be float, int, long, unicode, str, tuple, list, and custom objects. Lists may not contain lists, but they may contain tuples. Tuples may not contain lists or tuples. Otherwise, lists and tuples may contain any of the other types listed above.

(Actually, the module should not return long or str, but if it does, they should be caught and reported as different when compared to int and unicode, respectively.)

I am writing a test program that checks the results against known answers from the library my module tries to emulate. The obvious approach is to compare both the values and the types, but some corner cases produce -0.0 (which must be distinguished from 0.0) or NaN (Not a Number, a value a float can take), and neither compares the way I need.

However:

>>> a = float('nan')
>>> b = float('nan')
>>> a == b
False
>>> c = float('-0.0')
>>> c
-0.0
>>> d = 1.0 - 1.0
>>> c == d
True

The is operator doesn't help a bit:

>>> a is b
False
>>> d is 0.0
False

repr helps:

>>> repr(a) == repr(b)
True
>>> repr(c) == repr(d)
False
>>> repr(d) == repr(0.0)
True

But only to a point, since it doesn't help with objects:

>>> class e:
...   pass
... 
>>> f = e()
>>> g = e()
>>> f.x = float('nan')
>>> g.x = float('nan')
>>> f == g
False
>>> repr(f) == repr(g)
False

This works though:

>>> repr(f.__dict__) == repr(g.__dict__)
True

But it fails with tuples and lists:

>>> h = [float('nan'), f]
>>> i = [float('nan'), g]
>>> h == i
False
>>> repr(h) == repr(i)
False
>>> repr(h.__dict__) == repr(i.__dict__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute '__dict__'

It seems I'm close, so I need to know:

  1. Is there a simpler way to check for actual equality that doesn't have the burden of converting to string?
  2. If not, how would I go about comparing lists or tuples containing objects?

Edit: To be clear, what I'm after is a full comparison function. My test function looks roughly like this:

>>> def test(expression, expected):
...   actual = eval(expression)
...   if not reallyequal(actual, expected):
...     report_error(expression, actual, expected)

My question is what reallyequal() should look like.

Edit 2: I've found the Python standard module unittest but unfortunately none of the checks covers this use case, so it seems that if I intend to use it, I should use something like self.assertTrue(reallyequal(actual, expected)).

I'm actually surprised that it's so hard to make unit tests including expected NaNs and minus zeros nested within the results. I'm still using the repr solution which is a half-solution, but I'm open to other ideas.
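
A minimal sketch of the unittest wrapper mentioned in Edit 2, using the repr-based comparison as a stand-in for reallyequal(). The names ReallyEqualTestCase, assertReallyEqual and CornerCases are hypothetical; unittest itself provides no such assertion.

```python
import unittest


def reallyequal(actual, expected):
    # Stand-in comparison: repr() keeps the sign of zero and prints every
    # NaN created by float('nan') as 'nan'.
    return repr(actual) == repr(expected)


class ReallyEqualTestCase(unittest.TestCase):
    def assertReallyEqual(self, actual, expected):
        # Hypothetical helper, not part of unittest: fail with both values shown.
        if not reallyequal(actual, expected):
            self.fail("%r != %r" % (actual, expected))


class CornerCases(ReallyEqualTestCase):
    def test_nan(self):
        self.assertReallyEqual(float('nan'), float('nan'))

    def test_minus_zero(self):
        self.assertReallyEqual(float('-0.0'), -0.0)
        self.assertFalse(reallyequal(0.0, -0.0))

    def test_nested(self):
        self.assertReallyEqual([float('nan'), (1, -0.0)],
                               [float('nan'), (1, -0.0)])
```

Calling unittest.main() at the bottom of the test file would run these cases. Custom objects still need a repr that reflects their attributes for this stand-in to see inside them.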

Pedro Gimeno
  • To cherry-pick an example, `NaN` is *guaranteed* to not compare equal to any other value, *including `NaN`* - use `math.isnan`. Custom objects can implement `__eq__` however they like (or not at all, as in your example `e`). Comparing `repr`esentations of `dict`ionaries seems odd - why not compare them directly? – jonrsharpe Jun 17 '14 at 16:04
  • You have several issues here, but one is answered by [How to check for NaN in python?](http://stackoverflow.com/q/944700) – Martijn Pieters Jun 17 '14 at 16:07
  • I know I can use math.isnan to know if it's a NaN, but what I'm after is a complete comparison function that tells me if two values are equal, for test purposes, and not individual values. I'll see if I can edit my question to leave that clear. – Pedro Gimeno Jun 17 '14 at 16:14
  • @PedroGimeno: That's just a recursive function; I thought your real problem was figuring out how to detect NaN and negative zero. The latter can only be tested for with strings, really: `if result == 0: return result == expected and str(result) == str(expected)` only returns `True` if both values are 0 and have the same sign. – Martijn Pieters Jun 17 '14 at 16:42
  • @jonrsharpe: {'x':float('nan')}=={'x':float('nan')} returns False, that's why. – Pedro Gimeno Jun 17 '14 at 17:38
  • @PedroGimeno if your classes have `NaN` attributes, that should be dealt with in their `__eq__` method, not your code – jonrsharpe Jun 17 '14 at 17:44
  • Modifying my classes' `__eq__` method for the sake of the test program interferes with usage of == within the module under test. NaN should compare different to NaN within the module, and equal within the test program. – Pedro Gimeno Jun 17 '14 at 18:18

2 Answers

Here is one implementation:

import math

def really_equal(actual, expected, tolerance=0.0001):
    """Compare actual and expected for 'actual' equality."""

    # 1. Both same type? (isinstance also accepts subclasses of type(expected))
    if not isinstance(actual, type(expected)):
        return False

    # 2. Deal with floats (edge cases, tolerance)
    if isinstance(actual, float):
        if actual == 0.0:
            # str() keeps the sign, so '0.0' and '-0.0' differ
            return str(actual) == str(expected)
        elif math.isnan(actual):
            return math.isnan(expected)
        return abs(actual - expected) < tolerance

    # 3. Deal with tuples and lists (item-by-item, recursively)
    if isinstance(actual, (tuple, list)):
        # zip() stops at the shorter sequence, so check the lengths first
        return len(actual) == len(expected) and all(
            really_equal(i1, i2, tolerance) for i1, i2 in zip(actual, expected))

    # 4. Fall back to 'classic' equality
    return actual == expected

A few of your edge cases from "classic" equality:

>>> float('nan') == float('nan')
False
>>> really_equal(float('nan'), float('nan'))
True

>>> 0.0 == -0.0
True
>>> really_equal(0.0, -0.0)
False

>>> "foo" == u"foo"
True
>>> really_equal("foo", u"foo")
False

>>> 1L == 1
True
>>> really_equal(1L, 1)
False

Classes should implement their own __eq__ "magic method" to determine whether or not two instances are equal - they will fall through to # 4 and be compared there:

>>> class Test(object):

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        return self.val == other.val


>>> a = Test(1)
>>> b = Test(1)
>>> really_equal(a, b)
True
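
Where adding __eq__ to the classes under test is not an option, a stricter variant can compare instance attributes itself. The following is only a sketch and not part of the implementation above: strictly_equal is a hypothetical name, it assumes custom objects keep their state in __dict__ (no __slots__), it requires the classes to match exactly, and it compares floats exactly, using math.copysign instead of strings to tell -0.0 from 0.0.

```python
import math


def strictly_equal(actual, expected):
    """Exact comparison: same class, same value, NaN == NaN, -0.0 != 0.0."""
    # Exact class match, so int/long, str/unicode and subclasses all differ.
    if actual.__class__ is not expected.__class__:
        return False

    if isinstance(actual, float):
        if math.isnan(actual) or math.isnan(expected):
            return math.isnan(actual) and math.isnan(expected)
        # copysign() exposes the sign bit, which == ignores for zeros.
        return (actual == expected and
                math.copysign(1.0, actual) == math.copysign(1.0, expected))

    if isinstance(actual, (list, tuple)):
        # Check lengths first because zip() stops at the shorter sequence.
        return (len(actual) == len(expected) and
                all(strictly_equal(a, e) for a, e in zip(actual, expected)))

    if hasattr(actual, '__dict__'):
        # Assumed: instance state lives in __dict__; compare it by attribute name.
        da, de = vars(actual), vars(expected)
        return (set(da) == set(de) and
                all(strictly_equal(da[k], de[k]) for k in da))

    # Everything else (int, long, str, unicode, ...) uses ordinary equality.
    return actual == expected
```

Because attributes are looked up by name, the result does not depend on dictionary ordering, and the classes under test keep their normal == behaviour.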
jonrsharpe
  • Two bits of pickiness. First, an instance of a derived class should not compare equal to an instance of the parent class, so in my case it needs to compare types. Second, in most cases the floats should be compared for equality, not within a tolerance (e.g. 1e-300 should not equal 0.0). The solution to the latter is to use `<=` instead of `<` and have a default tolerance of 0.0. Besides, as indicated in a comment above, overriding `__eq__` interferes with normal usage of `==`. – Pedro Gimeno Jun 17 '14 at 18:31
  • @PedroGimeno way to look the gift horse in the mouth. You want a spec met, go to RentACoder. – jonrsharpe Jun 17 '14 at 18:43
  • Not really; your answer doesn't do better than comparing the repr() and is far more complex in comparison, so I don't think it addresses either of my two questions. I was pointing out where it fails to work like `repr()`, which was closer to what I needed. – Pedro Gimeno Jun 17 '14 at 22:20
  • I alleviated *"the burden of converting to string"*, and dealt with lists and tuples, per your requests. You will probably have problems with floats; there is a reason a tolerance is used. It is still not clear what you're trying to do, but without the information to solve the bigger question I can only do so much. – jonrsharpe Jun 17 '14 at 22:30

From the answers and comments it seems clear that the answer to my first question (is there a simpler way than using repr()?) is no, there is no simpler way. So I've researched more on how to accomplish this as simply as possible and I've come up with this solution which answers my second question.

repr() works for the most part, but fails on objects of custom classes. Since the default repr() of a custom object is of little use anyway, I have overridden the __repr__ method of each base class like this:

class MyClass:
    def __repr__(self):
        return self.__class__.__name__ + "(" \
            + repr(sorted(self.__dict__.items(), key=lambda t: t[0])) + ")"

Now I can use repr() on any of the values and get a string that reflects the type and contents of the value, which my test program can compare.

def reallyequal(actual, expected):
    return repr(actual) == repr(expected)

(which I will actually embed in the test function due to its simplicity).

Here it is in action:

>>> reallyequal(-0.0, 0.0)
False
>>> reallyequal(float('nan'),float('nan'))
True
>>> f = MyClass()
>>> f.x = float('nan')
>>> g = MyClass()
>>> g.x = float('nan')
>>> reallyequal(f, g)
True
>>> h = [f,3]
>>> i = [g,4]
>>> reallyequal(h, i)
False
>>> i[1] = 3
>>> reallyequal(h, i)
True
>>> g.x = 1
>>> reallyequal(h, i)
False
>>> f.x = 1L
>>> reallyequal(h, i)
False
>>> f.x = 1
>>> reallyequal(h, i)
True

Edit: Edited to incorporate commenter's suggestions re repr results with __dict__.

Pedro Gimeno
  • If you're going down this route, checking they "look the same", at least write a [real `__repr__`](https://docs.python.org/2/reference/datamodel.html#object.__repr__), i.e. *"a valid Python expression that could be used to recreate an object with the same value"*. – jonrsharpe Jun 17 '14 at 22:24
  • @jonrsharpe Edited to do so. Given an appropriate environment (a constructor accepting a dict) it would reconstruct the object. The idea is similar to `repr(float('nan'))` returning `'nan'`, which is an expression that gives an error or a wrong value without the appropriate environment (a global called `nan` and defined to be NaN which does not exist by default), so I guess it qualifies. – Pedro Gimeno Jun 17 '14 at 22:45
  • Typically you would make the `__repr__` work with *the actual `__init__`*, rather than some imagined magic constructor; `float('nan')` is apparently an exception. – jonrsharpe Jun 17 '14 at 22:59
  • The exception was considered a bug - see [here](http://bugs.python.org/issue1732212). Writing a better repr will avoid issues with dictionary ordering (or lack thereof). – jonrsharpe Jun 17 '14 at 23:15
  • @jonrsharpe Yes, you have a point. [repr of dictionaries is not unique.](http://stackoverflow.com/questions/1604281/do-dictionaries-in-python-have-a-single-repr-value) – Pedro Gimeno Jun 18 '14 at 01:46