21

I want to assert that two Python dictionaries are equal, meaning that they have the same keys and that each key maps to an equal value; order is not important. A simple way would be assert A == B, but this does not work if the values of the dictionaries are numpy arrays. How can I write a function that checks in general whether two dictionaries are equal?

>>> import numpy as np
>>> A = {1: np.identity(5)}
>>> B = {1: np.identity(5) + np.ones([5,5])}
>>> A == B
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

EDIT: I am aware that numpy arrays should be checked for equality with .all(). What I am looking for is a general way to check for this, without having to test isinstance(value, np.ndarray). Is this possible?

Related topics without numpy arrays:

physicalattraction

4 Answers

26

You can use numpy.testing.assert_equal, which accepts dictionaries and raises an AssertionError at the first pair of values that differ:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.testing.assert_equal.html
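As a minimal sketch of how this could be applied to dictionaries like the ones in the question: assert_equal is called directly on the two dicts, and wrapped in try/except when a boolean is wanted instead of an exception. The helper name dicts_equal and the extra "label" and "shape" entries are made up for illustration and are not part of NumPy or the question.

```python
import numpy as np

# Dicts mixing array values with plain Python values (illustrative data).
A = {1: np.identity(5), "label": "identity", "shape": (5, 5)}
B = {1: np.identity(5) + np.ones([5, 5]), "label": "identity", "shape": (5, 5)}

# Passes silently: same keys, and every value compares equal.
np.testing.assert_equal(A, {1: np.identity(5), "label": "identity", "shape": (5, 5)})


def dicts_equal(first, second):
    """Hypothetical helper: turn the assertion into a True/False answer."""
    try:
        np.testing.assert_equal(first, second)
    except AssertionError:
        return False
    return True


print(dicts_equal(A, A))  # True
print(dicts_equal(A, B))  # False
```

No isinstance check on the values is needed here, because assert_equal dispatches on the value types itself.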

vitiral
  • This does not return a bool value. Rather, it throws an exception if the objects are not equal. It can be used to construct a __cmp__ function, but by itself it is not one. – Guilherme de Lazari Apr 13 '18 at 14:14
  • @GuilhermedeLazari at some point you are splitting hairs here. Just make your cmp function using a try/except block. It's pretty much written itself. – eric Jan 28 '19 at 18:16
  • 3
    @GuilhermedeLazari The original question said "I want to assert that two Python dictionaries are equal" – vitiral Jan 29 '19 at 19:22
  • This answer only works if you *know upfront* that the values are numpy arrays. The question is to find a generic way without checking for instance type of the values first. – physicalattraction Sep 05 '19 at 19:08
  • It's been a while since I've used it, but the docs say "Given two objects (scalars, lists, tuples, dictionaries or numpy arrays), check that all elements of these objects are equal. An exception is raised at the first conflicting values." Seems like it should work for other types, but if you find that's not true it might be a bug. – vitiral Sep 09 '19 at 16:49
10

I'm going to answer the half-question hidden in your question's title and first half, because frankly, this is a much more common problem and the existing answers don't address it very well. That question is: "How do I compare two dicts of numpy arrays for equality?"

The first part of the problem is checking the dicts "from afar": see that their keys are the same. If all the keys are the same, the second part is comparing each corresponding value.

Now the subtle issue is that a lot of numpy arrays are not integer-valued, and floating-point arithmetic is inexact. So unless you have integer-valued (or other non-float) arrays, you will probably want to check that the values are almost the same, i.e. equal up to a small tolerance. In that case you wouldn't use np.array_equal (which checks exact numerical equality), but rather np.allclose (which applies a finite tolerance to the relative and absolute error between two arrays).

The first one and a half parts of the problem are straightforward: check that the keys of the dicts agree, then use a generator expression to compare every pair of values (with all wrapped around it to verify that every pair matches):

import numpy as np

# some dummy data

# these are equal exactly
dct1 = {'a': np.array([2, 3, 4])}
dct2 = {'a': np.array([2, 3, 4])}

# these are equal _roughly_
dct3 = {'b': np.array([42.0, 0.2])}
dct4 = {'b': np.array([42.0, 3*0.1 - 0.1])}  # still 0.2, right?

def compare_exact(first, second):
    """Return whether two dicts of arrays are exactly equal"""
    if first.keys() != second.keys():
        return False
    return all(np.array_equal(first[key], second[key]) for key in first)

def compare_approximate(first, second):
    """Return whether two dicts of arrays are roughly equal"""
    if first.keys() != second.keys():
        return False
    return all(np.allclose(first[key], second[key]) for key in first)

# let's try them:
print(compare_exact(dct1, dct2))  # True
print(compare_exact(dct3, dct4))  # False
print(compare_approximate(dct3, dct4))  # True

As you can see in the above example, the integer arrays compare fine exactly, and depending on what you're doing (or if you're lucky) it could even work for floats. But if your floats are the result of any kind of arithmetic (linear transformations for instance?) you should definitely use an approximate check. For a complete description of the latter option please see the docs of numpy.allclose (and its elementwise friend, numpy.isclose), with special regard to the rtol and atol keyword arguments.
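To make the role of rtol and atol concrete, here is a small sketch of a variant of compare_approximate that passes both tolerances through to np.allclose. The function name compare_with_tolerance and the sample data are assumptions for illustration; the defaults shown (rtol=1e-5, atol=1e-8) are np.allclose's own. The sketch also compares shapes first, since np.allclose broadcasts its inputs and raises ValueError when the shapes are incompatible.

```python
import numpy as np


def compare_with_tolerance(first, second, rtol=1e-5, atol=1e-8):
    """Hypothetical variant of compare_approximate with explicit tolerances."""
    if first.keys() != second.keys():
        return False
    for key in first:
        a, b = np.asarray(first[key]), np.asarray(second[key])
        # Check shapes first: np.allclose would otherwise broadcast,
        # or raise ValueError for incompatible shapes.
        if a.shape != b.shape:
            return False
        # Elementwise test: |a - b| <= atol + rtol * |b|
        if not np.allclose(a, b, rtol=rtol, atol=atol):
            return False
    return True


x = {"b": np.array([42.0, 0.2])}
y = {"b": np.array([42.0, 0.2 + 1e-7])}  # differs by 1e-7 in one entry

loose = compare_with_tolerance(x, y)                        # default tolerances
strict = compare_with_tolerance(x, y, rtol=0.0, atol=1e-9)  # much tighter
print(loose, strict)  # True False
```

Which tolerances are appropriate depends on the scale of the data and on how much rounding error the preceding computation can accumulate.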

0

You can separate the keys and values of both dicts, then compare keys against keys and values against values. Here's the solution:

import numpy as np

def dic_to_keys_values(dic):
    keys, values = list(dic.keys()), list(dic.values())
    return keys, values

def numpy_assert_almost_dict_values(dict1, dict2):
    keys1, values1 = dic_to_keys_values(dict1)
    keys2, values2 = dic_to_keys_values(dict2)
    np.testing.assert_equal(keys1, keys2)
    np.testing.assert_almost_equal(values1, values2)

dict1 = {"b": np.array([1, 2, 0.2])}
dict2 = {"b": np.array([1, 2, 3 * 0.1 - 0.1])}  # almost 0.2, but not equal
dict3 = {"b": np.array([999, 888, 444])} # completely different

numpy_assert_almost_dict_values(dict1, dict2) # no exception because almost equal
# numpy_assert_almost_dict_values(dict1, dict3) # exception because not equal

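(Note: the above checks for exactly equal keys and almost equal values. Because the keys are compared as lists, both dicts must also have their keys in the same insertion order.)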
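Since the question treats key order as unimportant, a variant that compares the key sets and then checks the values one key at a time may be more robust. The following is only a sketch under that assumption; the name assert_dicts_almost_equal and the sample dicts are hypothetical, while the decimal parameter is the one np.testing.assert_almost_equal already provides.

```python
import numpy as np


def assert_dicts_almost_equal(dict1, dict2, decimal=7):
    """Hypothetical order-insensitive variant of the function above."""
    # Comparing key sets ignores insertion order.
    assert set(dict1) == set(dict2), "dicts have different keys"
    for key in dict1:
        # Compare one pair of values at a time, so arrays of different
        # shapes stored under different keys are never stacked together.
        np.testing.assert_almost_equal(dict1[key], dict2[key], decimal=decimal)


d1 = {"a": np.array([1, 2, 0.2]), "b": np.zeros((2, 2))}
d2 = {"b": np.zeros((2, 2)), "a": np.array([1, 2, 3 * 0.1 - 0.1])}  # other order

assert_dicts_almost_equal(d1, d2)  # no exception: same keys, almost equal values
```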

rezan21
-3

Consider this code

>>> import numpy as np
>>> np.identity(5)
array([[ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.]])
>>> np.identity(5)+np.ones([5,5])
array([[ 2.,  1.,  1.,  1.,  1.],
       [ 1.,  2.,  1.,  1.,  1.],
       [ 1.,  1.,  2.,  1.,  1.],
       [ 1.,  1.,  1.,  2.,  1.],
       [ 1.,  1.,  1.,  1.,  2.]])
>>> np.identity(5) == np.identity(5)+np.ones([5,5])
array([[False, False, False, False, False],
       [False, False, False, False, False],
       [False, False, False, False, False],
       [False, False, False, False, False],
       [False, False, False, False, False]], dtype=bool)
>>> 

Note that the result of the comparison is an array, not a boolean value. Dict comparison compares values using each value's own comparison method (__eq__, or __cmp__ in Python 2), which means that when the values are arrays, the dict comparison gets an array result whose truth value is ambiguous. What you want to do is use numpy.all to collapse the array result into a scalar boolean result:

>>> np.all(np.identity(5) == np.identity(5)+np.ones([5,5]))
False
>>> np.all(np.identity(5) == np.identity(5))
True
>>> 

You would need to write your own function to compare these dictionaries, testing value types to see if they are arrays and then comparing using numpy.all, otherwise using ==. Of course, you can always get fancy and start subclassing dict and overloading its comparison method (__eq__, or __cmp__ in Python 2) if you want to.
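A minimal sketch of such a function might look like this. The names values_equal and dicts_equal are made up for illustration, and np.array_equal is used in place of numpy.all(a == b) so that arrays of different shapes simply compare as unequal.

```python
import numpy as np


def values_equal(a, b):
    """Hypothetical helper: compare two values, treating arrays specially."""
    if isinstance(a, np.ndarray) or isinstance(b, np.ndarray):
        # array_equal returns a single bool, and False on shape mismatch
        return bool(np.array_equal(a, b))
    return bool(a == b)


def dicts_equal(first, second):
    """Hypothetical helper: same keys, and equal values under each key."""
    if first.keys() != second.keys():
        return False
    return all(values_equal(first[key], second[key]) for key in first)


A = {1: np.identity(5), "name": "identity"}
B = {1: np.identity(5) + np.ones([5, 5]), "name": "identity"}

print(dicts_equal(A, {1: np.identity(5), "name": "identity"}))  # True
print(dicts_equal(A, B))                                        # False
```

This still relies on an explicit type check, so any other type whose == returns a non-scalar would need its own branch.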

sirlark
  • I was not very clear about that, but I was hoping for a general way without explicitly checking for type. Today it's a numpy array, tomorrow it's a type I have never heard of today yet. – physicalattraction Oct 17 '14 at 09:50
  • I don't think there's a way around it, I'm afraid. If your (or numpy's or someone else's) types override __cmp__ to return a non-scalar, standard python comparisons won't handle it. – sirlark Oct 17 '14 at 10:20
  • You don't need to write your own function, because numpy has you covered. Please see vitiral's answer. – EL_DON Apr 12 '19 at 20:04