JSON allows duplicate keys in objects. By default the Python JSON parser silently keeps only the last value for a duplicated key, but it has a mechanism for overriding this behaviour, which I need to use to detect and report such issues. Basically, the code will be something like load(file_pointer, object_pairs_hook=report_duplicates).
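For context, object_pairs_hook receives each decoded object as a list of (key, value) pairs, so report_duplicates could be a minimal sketch along these lines (the name and exact behaviour are my placeholder):

def report_duplicates(pairs):
    # Placeholder hook: json.load calls this with the (key, value) pairs
    # of every decoded object; complain on the first repeated key.
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result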
To test this behaviour I need to create a JSON object with duplicate keys. Is there some simple way to do that in Python 3.8 without hand-coding the JSON string? For example, could I convert [("x", 1), ("x", 2)] into the JSON string {"x": 1, "x": 2}? (The order doesn't matter.) The reason is that I need the JSON object to have a specific structure, so the test will be noisy if I have to type the whole thing into a string.
Since this is specifically about duplicate keys, I can't use solutions like dumps(dict(…)).
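To see why: a dict collapses the duplicates before the encoder ever runs, keeping only the last value:

>>> from json import dumps
>>> dumps(dict([("x", 1), ("x", 2)]))
'{"x": 2}'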
So far I have this:
from json import JSONEncoder
from typing import Any, Iterator

class KeyValueList(list):
    pass

class KeyValueListEncoder(JSONEncoder):
    def iterencode(self, obj: Any, _one_shot: bool = False) -> Iterator[str]:
        if isinstance(obj, KeyValueList):
            # Emit the (key, value) pairs as a JSON object verbatim,
            # without deduplicating keys.
            yield "{"
            max_index = len(obj) - 1
            for index, (key, value) in enumerate(obj):
                yield JSONEncoder.encode(self, key)
                yield self.key_separator
                yield JSONEncoder.encode(self, value)
                if index != max_index:
                    yield self.item_separator
            yield "}"
        else:
            # Fall back to the standard encoding for everything else.
            for chunk in JSONEncoder.iterencode(self, obj, _one_shot=_one_shot):
                yield chunk
This works if the top-level object being serialized is a KeyValueList, but if the KeyValueList is nested in another data structure it's serialized as a list:
>>> dumps(KeyValueList((("x", 1), ("x", 2))), cls=KeyValueListEncoder)
'{"x": 1, "x": 2}'
>>> dumps({"foo": KeyValueList((("x", 1), ("x", 2)))}, cls=KeyValueListEncoder)
'{"foo": [["x", 1], ["x", 2]]}'
(I can't override default as per the documentation, because default is only called when a type is not already serializable.)
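To illustrate that last point with a minimal demonstration (DefaultEncoder is just a throwaway name): because KeyValueList subclasses list, the base encoder already knows how to serialize it, so default is never consulted:

class DefaultEncoder(JSONEncoder):
    def default(self, obj: Any) -> Any:
        # Never reached for KeyValueList: list subclasses are already
        # serializable, and default() only fires for types that aren't.
        if isinstance(obj, KeyValueList):
            return {"unreachable": True}
        return super().default(obj)

>>> dumps(KeyValueList((("x", 1), ("x", 2))), cls=DefaultEncoder)
'[["x", 1], ["x", 2]]'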