External libraries are allowed, but I'd prefer a standard-library solution.
Example input:
data.json
with content:
{
    "name": "John",
    "age": 30,
    "address": {
        "street": "123 Main St",
        "city": "New York",
        "street": "321 Wall St"
    },
    "contacts": [
        {
            "type": "email",
            "value": "john@example.com"
        },
        {
            "type": "phone",
            "value": "555-1234"
        },
        {
            "type": "email",
            "value": "johndoe@example.com"
        }
    ],
    "age": 35
}
Example expected output:
Duplicate keys found:
age (30, 35)
address -> street ("123 Main St", "321 Wall St")
Using json.load/loads as-is won't work: they return a standard Python dict, which silently drops duplicate keys. So I think I need a way to "stream" the JSON as it loads, in some depth-first-search / visitor-pattern fashion.
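To show what I mean, here is a minimal demonstration of the silent drop:

```python
import json

# json.loads keeps only the LAST value for each repeated key,
# so the duplicate "age" disappears without any warning.
doc = '{"age": 30, "age": 35}'
print(json.loads(doc))  # {'age': 35}
```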
I've also tried something similar to what was suggested here: https://stackoverflow.com/a/14902564/8878330 (quoted below)
def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        else:
            d[k] = v
    return d
The only change I made was that instead of raising, I append the duplicate key to a list so I can print all duplicate keys at the end.
The problem is that I don't see a simple way to get the "full path" of each duplicate key.
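The kind of approach I imagine might work, as a rough sketch: have `object_pairs_hook` return the raw key/value pairs wrapped in a marker class so objects stay distinguishable from arrays, then walk the result depth-first while threading the path along. `RawObject` and `find_duplicates` are hypothetical names I made up:

```python
import json

class RawObject:
    """Marker wrapper so JSON objects (kept as lists of pairs)
    can be told apart from JSON arrays after decoding."""
    def __init__(self, pairs):
        self.pairs = pairs

def find_duplicates(node, path=""):
    """Depth-first walk yielding (path, values) for each repeated key.
    Values are reported as decoded; nested duplicates inside a
    duplicated object are still found by the recursion."""
    if isinstance(node, RawObject):
        seen = {}
        for k, v in node.pairs:
            seen.setdefault(k, []).append(v)
            yield from find_duplicates(v, f"{path} -> {k}" if path else k)
        for k, vals in seen.items():
            if len(vals) > 1:
                yield (f"{path} -> {k}" if path else k, vals)
    elif isinstance(node, list):  # a JSON array
        for i, v in enumerate(node):
            yield from find_duplicates(v, f"{path}[{i}]")

doc = '''{"age": 30,
          "address": {"street": "123 Main St", "street": "321 Wall St"},
          "age": 35}'''
raw = json.loads(doc, object_pairs_hook=RawObject)
for p, vals in find_duplicates(raw):
    print(p, vals)
# address -> street ['123 Main St', '321 Wall St']
# age [30, 35]
```

Is something along these lines reasonable, or is there a simpler way to recover the path?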