
I've come across a peculiar issue while working with the deepdiff library in Python. I'm trying to discern the differences between two dictionaries to ultimately generate a delta. Here's the background:

I have two dictionaries:

d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

To compare them, I've used the following code:

from deepdiff import DeepDiff, Delta

deep_diff_result = DeepDiff(
    d1,
    d2,
    exclude_regex_paths=[r"(?=root.*\['id'\])"],
    ignore_order=True,
    report_repetition=True,
)

The output I received was:

{'repetition_change': {"root['a'][0]": {'old_repeat': 3, 'new_repeat': 4, 'old_indexes': [0, 1, 2], 'new_indexes': [0, 1, 2, 3], 'value': {'id': 1}}}}

This output aligns with my expectations. Since I specified that the 'id' fields should not be compared, deepdiff identifies the additional element as a repetition rather than as a unique entry.
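To illustrate my understanding of why a repetition is reported, here is a minimal sketch in plain Python. It is not how deepdiff works internally; it only mimics the effect of excluding 'id' by dropping that key before counting, and the helper names strip_key and count_items are my own, made up for this example.

```python
from collections import Counter

d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}


def strip_key(items, key):
    """Return copies of the dicts in `items` with `key` removed (hypothetical helper)."""
    return [{k: v for k, v in item.items() if k != key} for item in items]


def count_items(items):
    """Count identical dicts; dicts are unhashable, so key on a sorted tuple of their items."""
    return Counter(tuple(sorted(item.items())) for item in items)


# With 'id' dropped, every element becomes an empty dict, so all elements look identical.
old_counts = count_items(strip_key(d1['a'], 'id'))
new_counts = count_items(strip_key(d2['a'], 'id'))

print(old_counts)  # Counter({(): 3})
print(new_counts)  # Counter({(): 4})
```

The counts of 3 and 4 line up with old_repeat and new_repeat in the output above, which is why I read the extra element as being treated as a fourth repetition of the same item.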

Finally, I applied the delta to d2:

result = d2 + Delta(deep_diff_result)

The resulting dictionary was:

{'a': [{'id': 1}, {'id': 1}, {'id': 1}, {'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

Problem Statement:

I expected the only difference between d1 and d2 to be the {'id': 4} entry. Instead, the result contains four copies of {'id': 1}. Why does this happen?
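To make that expectation concrete, here is a small plain-Python check of what I consider the difference to be. It does not use deepdiff, and unlike my DeepDiff call it does compare the 'id' values, so it is only a sketch of the result I had in mind; the variable name added is my own.

```python
d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}

# Elements of d2['a'] that have no equal element in d1['a'].
added = [item for item in d2['a'] if item not in d1['a']]

print(added)  # [{'id': 4}]
```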

What I've tried:

I've searched Stack Overflow for similar issues, but couldn't find any that matched my exact situation. I've reviewed open and closed issues on the deepdiff GitHub repository, but did not identify any scenarios similar to mine.

Related Research: I've looked through https://zepworks.com/deepdiff/current/delta.html, but it didn't provide clarity for this particular case.

Could someone shed light on this behavior, and possibly guide me to achieve the desired result? Thanks in advance!

Kfir Cohen
