4

I need to efficiently maintain a diff dictionary that tells me the difference between a primary dictionary at some point in time and now.

I need to have the full path to what is changed, not just the value that changed.

for example:

primary_dict = {'a': 5, 'b':{'c': 9}, 'd':10}

and the difference will be

diff_dict = {'a':6, 'b':{'c':8}}

to say that currently

primary_dict = {'a':6, 'b':{'c':8}, 'd':10}

I can have the diff created when values are added to the dict

I have looked online but only found comparisons between two dictionaries. That would be inefficient, because the dictionary I need to diff is massive, and saving it twice and diffing it all recursively seems like too much work for the problem at hand.

EDIT: Changed the question to be more on point. As was pointed out to me, the question that reflects my need is: How do I get changes to a dictionary over time without creating a new variable? Thank you to @CristiFati and @Vishnudev for the corrections.

Eden K
  • What if some new keys appear, or some are deleted? – CristiFati Jun 16 '19 at 10:42
  • @RoadRunner: "*I need to have the **full path** to what is changed, not just the value that changed.*". – CristiFati Jun 16 '19 at 10:49
  • @CristiFati Ah my bad, didn't read fully. Too tired. – RoadRunner Jun 16 '19 at 10:50
  • @CristiFati sorry I didn't mention it, if keys appear I need to know about it. If keys are deleted it doesn't matter all too much – Eden K Jun 16 '19 at 10:59
  • @EdenK: What if they are removed? or what if they change type? – CristiFati Jun 16 '19 at 11:00
  • @CristiFati as far as changing types I guess it wouldn't change anything that doesn't depend on their types(a method that acts on the data). when you remove items, is there a good way to represent it in the same data type(namely dictionary)? – Eden K Jun 16 '19 at 11:09
  • @EdenK Could you re-phrase the exact question in a paragraph as I seem confused now due to changes and I think many people will be. – Vishnudev Krishnadas Jun 16 '19 at 11:20
  • What I need is a way to have only 2 dictionaries, for example the before_change dictionary and the diff dictionary. I do not need the after_change dictionary whatsoever. All I want is the diff between before_change when I started logging it and the current before_change. I don't want to save before_change and compare it to after_change because that is very inefficient. I hope it is clearer now, thank you! – Eden K Jun 16 '19 at 11:24
  • Basically you want to record all the changes that occurred in a dictionary in time. That's a totally different question. – CristiFati Jun 16 '19 at 11:25
  • @EdenK What you asked is not what you are explaining now. Why is there an after_change dict in the question if you don't need it? Please be clear with your question. The actual question should have been `How do I get changes to a dictionary over time without creating a new variable` right? – Vishnudev Krishnadas Jun 16 '19 at 11:31
  • Please give at least a partial detailing of operations that can be applied to the original dictionary that you expect to track the resultant changes of. – גלעד ברקן Jun 17 '19 at 03:18

6 Answers

1

Use the dictdiffer library

>>> from dictdiffer import diff
>>> before_change = {'a': 5, 'b':{'c': 9}, 'd':10}
>>> after_change = {'a': 6, 'b':{'c':8, 'd': 10}, 'd':10}
>>> list(diff(before_change, after_change))
[('change', 'a', (5, 6)), ('change', 'b.c', (9, 8)), ('add', 'b', [('d', 10)])]

For deleted keys,

>>> before_change = {'a': 5, 'b':{'c': 9}, 'd':10}
>>> after_change = {'a': 6, 'b':{'d': 10}, 'd':10}
>>> list(diff(before_change, after_change))
[('change', 'a', (5, 6)), ('add', 'b', [('d', 10)]), ('remove', 'b', [('c', 9)])]
Vishnudev Krishnadas
  • Thank you, I have two points in regard to your answer: 1. I need to have a primary dictionary that will hold all of the information and a second dictionary that just tells me what changed, not 2 whole primary dictionaries that are different a little(in my case they will differ in 1 element) 2. I need to be aware of the time complexity of the algorithm to determine if it is not too cpu consuming. – Eden K Jun 16 '19 at 11:15
  • In the output dictionary you are expecting to get, you'll never know what got added and what got deleted @EdenK – Vishnudev Krishnadas Jun 16 '19 at 11:27
0

An object-oriented approach may offer a solution.

The dictionary would be edited only through a public method, which also logs what has changed.
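The idea above could be sketched roughly as follows. This is only an illustration, not an established recipe: the class name `TrackedDict` and its methods `set_path` and `reset` are hypothetical, and it assumes every write goes through `set_path` and that intermediate values along a path are dictionaries.

```python
class TrackedDict:
    """Hypothetical wrapper: every write goes through set_path(), which logs it."""

    def __init__(self, data):
        self.data = data  # the primary dictionary, modified in place
        self.diff = {}    # nested dictionary holding only what changed

    def set_path(self, path, value):
        """Set data[path[0]][path[1]]... = value and mirror the write in diff."""
        target, log = self.data, self.diff
        for key in path[:-1]:
            # Walk (or create) the nested dictionaries along the path.
            target = target.setdefault(key, {})
            log = log.setdefault(key, {})
        target[path[-1]] = value
        log[path[-1]] = value

    def reset(self):
        """Forget the recorded changes and start tracking from now."""
        self.diff = {}


tracked = TrackedDict({'a': 5, 'b': {'c': 9}, 'd': 10})
tracked.set_path(['a'], 6)
tracked.set_path(['b', 'c'], 8)
print(tracked.diff)  # {'a': 6, 'b': {'c': 8}}
print(tracked.data)  # {'a': 6, 'b': {'c': 8}, 'd': 10}
```

Each write costs time proportional to the depth of the path, not to the size of the primary dictionary, and no second copy of the data is kept. Deletions are not recorded in this sketch.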

pkrulz
0

Complex solution (based on set operations and recursive function) for extended, complex input dictionaries:

As the title said the condition of interest is "when inserting a value".

# extended input samples
before_change = {'a': 5, 'b': {'c': 9}, 'd': 10, 'g': {'h': {'k': 100}}}
after_change = {'a': 6, 'b': {'c': 9, 'f': 1}, 'd': 10, 'g': {'h': {'k': 300}}, 'e': 11}


def dict_diff(before, after):
    """Compute difference between dictionaries.
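       Fetches path to the innermost changed item"""

    # Keys present only in `after`; a symmetric difference (^) would also
    # include deleted keys and raise KeyError on after[k].
    new_keys = set(after) - set(before)
    diff = {k: after[k] for k in new_keys} # detecting new keys/items beforehand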

    for k in set(before) & set(after):  # process intersected keys
        if before[k] != after[k]:
            if type(before[k]) is dict and type(after[k]) is dict:
                inner_diff = dict_diff(before[k], after[k])
                if inner_diff: diff[k] = inner_diff
            else:
                diff[k] = after[k]
    return diff

print(dict_diff(before_change, after_change))

The output:

{'e': 11, 'b': {'f': 1}, 'g': {'h': {'k': 300}}, 'a': 6}
RomanPerekhrest
  • Thank you but this isn't what I need, I need to specifically NOT have two dictionaries to diff. I need the primary dictionary with all of the information and the diff dictionary that just tells me what changed since I started to keep track. – Eden K Jun 16 '19 at 11:12
  • @EdenK, what you posted in your question as desired result is `{'a':6, 'b':{'c':8}}`. It's obviously comprehensible as the difference between dictionaries – RomanPerekhrest Jun 16 '19 at 11:15
  • Thank you for the comment, I have now edited my question to make it clear what I need – Eden K Jun 16 '19 at 11:21
0

A similar question already exists: [SO]: How to get the difference between two dictionaries in Python?.
You could use a recursive approach.

code.py:

#!/usr/bin/env python3

import sys


def dummy_dict_diff(initial, final):
    ret = dict()
    for final_key, final_val in final.items():
        if final_key not in initial:  # Handle new keys
            ret[final_key] = final_val
            continue
        initial_val = initial[final_key]
        if final_val != initial_val:
            if type(final_val) != type(initial_val):
                ret[final_key] = final_val
            elif isinstance(final_val, (dict,)):
                ret[final_key] = dummy_dict_diff(initial_val, final_val)
            elif isinstance(final_val, (list, tuple)):
                ret[final_key] = final_val  # This would also require sequence diffs
            else:
                ret[final_key] = final_val
    deleted_keys = [item for item in initial if item not in final]  # Deleted keys
    for deleted_key in deleted_keys:
        ret[deleted_key] = initial[deleted_key]
    return ret


def main():
    before = {
        "a": 5,
        "b": {
            "c": 9
        },
        "d": 10
    }

    after = {
        "a": 6,
        "b": {
            "c": 8
        },
        "d": 10
    }

    print("Difference: {:}".format(dummy_dict_diff(before, after)))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
    print("\nDone.")

Notes:

  • There are some cases that are currently not handled. Handling them would require more code (and, as a side effect, would also run slower):
    1. Sequence (e.g. list, tuple) values. Currently the final value is stored as-is (default case)

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q056617962]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32

Difference: {'a': 6, 'b': {'c': 8}}

Done.
CristiFati
0

You could use a dict comprehension:

actual_dict = {'a': 5, 'b':{'c': 9}, 'd':10}
diff_dict = {'a':6, 'b':{'c':8}}

primary_dict = {x: actual_dict[x] if x not in diff_dict.keys() else diff_dict[x] for x in actual_dict.keys()}

print(primary_dict)

Output:

{'a': 6, 'b': {'c': 8}, 'd': 10}
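
Note that the comprehension above replaces each top-level value wholesale, so nested keys that are missing from the diff would be lost. A recursive merge is one possible way around that. The helper name `apply_diff` is hypothetical, and the sketch assumes the diff only contains added or changed values, not deletions:

```python
def apply_diff(primary, diff):
    """Recursively merge diff into primary (in place) and return primary."""
    for key, value in diff.items():
        if isinstance(value, dict) and isinstance(primary.get(key), dict):
            # Both sides are dictionaries: descend instead of overwriting.
            apply_diff(primary[key], value)
        else:
            primary[key] = value
    return primary


actual_dict = {'a': 5, 'b': {'c': 9, 'e': 1}, 'd': 10}
diff_dict = {'a': 6, 'b': {'c': 8}}
print(apply_diff(actual_dict, diff_dict))
# {'a': 6, 'b': {'c': 8, 'e': 1}, 'd': 10}
```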

Sebastien D
0

You can use the following Python code to get the difference between two dictionaries (top-level keys only):

before_change = {'a': 5, 'b':{'c': 9}, 'd':10}
after_change = {'a': 6, 'b':{'c':8}, 'd':10}

# Compare by key rather than by position, so the result does not depend
# on both dictionaries having the same keys in the same order.
for key in list(after_change):
    if key in before_change and after_change[key] == before_change[key]:
        after_change.pop(key)  # unchanged entry: drop it from the result

print(after_change)

The output of this code is:

{'a': 6, 'b': {'c': 8}}