12

I want to take two dictionaries and print a diff of them. This diff should include the differences in keys AND values. I've created this little snippet to achieve the results using built-in code in the unittest module. However, it's a nasty hack since I have to subclass unittest.TestCase and provide a runtest() method for it to work. In addition, this code will cause the application to error out since it will raise an AssertError when there are differences. All I really want is to print the diff.

import unittest
class tmp(unittest.TestCase):
    def __init__(self):
         # Show full diff of objects (dicts could be HUGE and output truncated)
        self.maxDiff = None
    def runTest():
        pass
_ = tmp()
_.assertDictEqual(d1, d2)

I was hoping to use the difflib module, but it looks to only work for strings. Is there some way to work around this and still use difflib?

mgilson
  • 300,191
  • 65
  • 633
  • 696
durden2.0
  • 9,222
  • 9
  • 44
  • 57
  • possible duplicate of http://stackoverflow.com/questions/1165352/fast-comparison-between-two-python-dictionary – Mark Reed Oct 18 '12 at 14:32
  • @MarkReed -- This is different. That asked for the differences in *keys*, this asks for the difference in *keys* and *values* (I assume the OP needs *key-value* pairs) -- e.g. `{1:2, 2:3}` is different from `{1:3,2:2}`, but that's not actually explicitly stated... – mgilson Oct 18 '12 at 14:36
  • @mgilson - I didn't put in a close request or mark this as a duplicate, but if you look at the accepted answer on that page, it includes value comparison, not just keyset comparison. – Mark Reed Oct 18 '12 at 14:37
  • I don't necessarily mind this solution. However, is there a way to clean it up a bit? Two things I would want to change: 1. Catch/suppress AssertError (easy to do but seems weird) 2. Use `assertDictEqual` without having to subclass `TestCase`. – durden2.0 Oct 18 '12 at 14:59

8 Answers8

10

Adapted from the cpython source:

https://github.com/python/cpython/blob/01fd68752e2d2d0a5f90ae8944ca35df0a5ddeaa/Lib/unittest/case.py#L1091

import difflib
import pprint

def compare_dicts(d1, d2):
    return ('\n' + '\n'.join(difflib.ndiff(
                   pprint.pformat(d1).splitlines(),
                   pprint.pformat(d2).splitlines())))
user2733517
  • 187
  • 1
  • 9
4

You can use difflib, but the use unittest method seems more appropriate to me. But if you wanted to use difflib. Let's say say the following are the two dicts.

In [50]: dict1
Out[50]: {1: True, 2: False}

In [51]: dict2
Out[51]: {1: False, 2: True}

You may need to convert them to strings (or list of strings) and then go about using difflib as a normal business.

In [43]: a = '\n'.join(['%s:%s' % (key, value) for (key, value) in sorted(dict1.items())])
In [44]: b = '\n'.join(['%s:%s' % (key, value) for (key, value) in sorted(dict2.items())])
In [45]: print a
1:True
2:False
In [46]: print b
1:False
2:True
In [47]: for diffs in difflib.unified_diff(a.splitlines(), b.splitlines(), fromfile='dict1', tofile='dict2'):
    print diffs

THe output would be:

--- dict1

+++ dict2

@@ -1,2 +1,2 @@

-1:True
-2:False
+1:False
+2:True
Senthil Kumaran
  • 54,681
  • 14
  • 94
  • 131
2

You can use .items() along with sets to do something like this:

>>> d = dict((i,i) for i in range(10))
>>> d2 = dict((i,i) for i in range(1,11))
>>>
>>> set(d.items()) - set(d2.items())
set([(0, 0)])
>>>
>>> set(d2.items()) - set(d.items())
set([(10, 10)])
>>>
>>> set(d2.items()) ^ set(d.items())  #symmetric difference
set([(0, 0), (10, 10)])
>>> set(d2.items()).symmetric_difference(d.items())  #only need to actually create 1 set
set([(0, 0), (10, 10)])
mgilson
  • 300,191
  • 65
  • 633
  • 696
1

I found a library (not very well documented) called datadiff which gives out the diffs of hashable data structures in python. you can install it with pip or easy_install. Give it a try!

softwareplay
  • 1,379
  • 4
  • 28
  • 64
0

See Python recipe to create difference (as dictionary) of two dictionaries. Could you describe what the output should looks like (please attach an example)?

Chris Medrela
  • 586
  • 3
  • 8
  • I'm not really set on a printing format. I guess something that looks like standard unidiff would be nice. However, the printing of assertDiffEqual() is not bad. Here are some examples:AssertionError: {'a': 'a'} != {'a': 'abc'} - {'a': 'a'} + {'a': 'abc'} ? ++ AssertionError: {'a': 1} != {'a': 2} - {'a': 1} ? ^ + {'a': 2} ? ^ – durden2.0 Oct 18 '12 at 14:52
0

using @mgilson's solution and taking it a step further for the OP's request to work with unittest module.

def test_dict_diff(self):
    dict_diff = list(set(self.dict_A.items()).symmetric_difference(set(self.dict_B.items()))))
    fail_message = "too many differences:\nThe differences:\n" +
                   "%s" % "\n".join(dict_diff)
    self.assertTrue((len(dict_diff) < self.maxDiff), fail_message)
Community
  • 1
  • 1
Inbar Rose
  • 41,843
  • 24
  • 85
  • 131
  • This is pretty neat, but I think using the built-in `assertDictEqual()` is a better solution if the code is already using the `unittest` module. – durden2.0 Oct 19 '12 at 13:36
  • `assertDictEqual` is in 2.7+ i use 2.6 ... and so i assume everyone does. :P but yes, that would be better to use. – Inbar Rose Oct 21 '12 at 07:54
0

Check out https://github.com/inveniosoftware/dictdiffer

print list(diff(
    {2014: [
        dict(month=6, category=None, sum=672.00),
        dict(month=6, category=1, sum=-8954.00),
        dict(month=7, category=None, sum=7475.17),
        dict(month=7, category=1, sum=-11745.00),
        dict(month=8, category=None, sum=-12140.00),
        dict(month=8, category=1, sum=-11812.00),
        dict(month=9, category=None, sum=-31719.41),
        dict(month=9, category=1, sum=-11663.00),
    ]},

    {2014: [
       dict(month=6, category=None, sum=672.00),
       dict(month=6, category=1, sum=-8954.00),
       dict(month=7, category=None, sum=7475.17),
       dict(month=7, category=1, sum=-11745.00),
       dict(month=8, category=None, sum=-12141.00),
       dict(month=8, category=1, sum=-11812.00),
       dict(month=9, category=None, sum=-31719.41),
       dict(month=9, category=1, sum=-11663.00),
    ]}))

gives this output which I think is pretty great:

[('change', ['2014', 4, 'sum'], (-12140.0, -12141.0))]

i.e. it gives what happened: a value "changed", the path "['2014', 4, 'sum']" and that it changed from -12140.0 to -12141.0.

boxed
  • 3,895
  • 2
  • 24
  • 26
0

This function returns a string and a flattened dictionary with the differences between the dictionaries

from collections.abc import MutableMapping
import pandas as pd
def get_dict_value_differences(current_dict, past_dict):
    """
    find the added keys and different values between the dictionaries
    :param current_dict:
    :param past_dict:
    :return: flattened dictionary of changed values
    """
    current_flat_dict = flatten_dict(current_dict)
    past_flat_dict = flatten_dict(past_dict)

    flat_diff_dict = dict()
    for key, value in current_flat_dict.items():
        if key in current_flat_dict.keys() and key in past_flat_dict.keys():
            if current_flat_dict[key] != past_flat_dict[key]:
                flat_diff_dict[key] = current_flat_dict[key]
        elif key in current_flat_dict.keys():
            flat_diff_dict[key] = current_flat_dict[key]

    diff_str = str(*[str(k) + ':' + str(v) for k, v in flat_diff_dict.items()])
    return flat_diff_dict, diff_str


def flatten_dict(d: MutableMapping, sep: str= '.') -> MutableMapping:
    [flat_dict] = pd.json_normalize(d, sep=sep).to_dict(orient='records')
    return flat_dict
lahmania
  • 361
  • 2
  • 5