>>> raw_post_data = request.raw_post_data
>>> print raw_post_data
{"group":{"groupId":"2", "groupName":"GroupName"}, "members":{"1":{"firstName":"fName","lastName":"LName","address":"address"},"1": {"firstName":"f_Name","lastName":"L_Name","address":"_address"}}}
>>> create_request = json.loads(raw_post_data)
>>> print create_request
{u'group': {u'groupName': u'GroupName', u'groupId': u'2'}, u'members': {u'1': {u'lastName': u'L_Name', u'firstName': u'f_Name', u'address': u'_address'}}}

As you can see, the member with key '1' is overwritten when I use json.loads().

Is there any way to catch this as an exception in Python, saying that duplicate keys were found in the request from the client?

Anuj Acharya
    related: [SimpleJson handling of same named entities](http://stackoverflow.com/questions/7825261/simplejson-handling-of-same-named-entities) – jfs Feb 15 '13 at 20:00

3 Answers


RFC 4627 for the application/json media type recommends unique keys, but it doesn't explicitly forbid duplicates:

> The names within an object SHOULD be unique.

From RFC 2119:

> SHOULD. This word, or the adjective "RECOMMENDED", mean that there
> may exist valid reasons in particular circumstances to ignore a
> particular item, but the full implications must be understood and
> carefully weighed before choosing a different course.
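By default, Python's `json` module follows the permissive reading: it accepts duplicate names and silently keeps the last value for each key, which is exactly what the question observed. A minimal demonstration:

```python
import json

# The stdlib parser does not reject duplicate names; the last
# occurrence of "x" silently wins.
print(json.loads('{"x": 1, "x": 2}'))  # {'x': 2}
```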

import json

def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        else:
            d[k] = v
    return d

json.loads(raw_post_data, object_pairs_hook=dict_raise_on_duplicates)
# -> ValueError: duplicate key: u'1'
jfs
  • Yes, that's what I was looking for, thanks. However, the json library should provide something similar out of the box. – Anuj Acharya Feb 15 '13 at 22:11
  • @AnujAcharya: The problem is that there are good use cases for a plain dict, a "multidict", a "multi-only-on-dups-dict", a "raise-on-dups-dict" (with ValueError, or KeyError?), and possibly others. And you want the exact same thing in `json.loads` and `json.load`, and `csv.DictReader`, and `yaml.load`, and so on. (See the current discussion on python-ideas about `csv`.) You don't want to write all possible behaviors for all possible load functions. And `object_pairs_hook` seems like exactly the right way to decouple it. – abarnert Feb 15 '13 at 23:07
  • I really do not understand what `ordered_pairs` and `raw_post_data` are. Could you please explain your parameters? Nearly every web page I see has copied and pasted your solution with no explanation. As I am new to Python, I need more info. – limonik Feb 15 '19 at 13:10
  • @limonik see the [`json.loads` docs](https://docs.python.org/3.7/library/json.html). `raw_post_data` is the JSON text from the question. `ordered_pairs` is a (key, value) iterable corresponding to a JSON object, which is usually parsed into a Python `dict`. – jfs Feb 15 '19 at 20:14

One alternative I wrote, based on the solutions posted by other users of this question, is to convert the duplicates into an array:

def array_on_duplicate_keys(ordered_pairs):
    """Convert duplicate keys to arrays."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            if type(d[k]) is list:
                d[k].append(v)
            else:
                d[k] = [d[k], v]
        else:
            d[k] = v
    return d

And then:

data = json.loads('{"x": 1, "x": 2}', object_pairs_hook=array_on_duplicate_keys)

gives you the output:

{'x': [1, 2]}

Later on, one can easily check how many duplicates an entry has by using:

if type(data['x']) is list:
    print('Non-unique entry in dict at x, found', len(data['x']), 'repetitions.')
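For a payload shaped like the one in the question, this hook keeps both colliding `members` entries as a list instead of dropping one. A self-contained sketch (the hook is repeated so the snippet runs on its own, and the payload is trimmed to the relevant keys):

```python
import json

def array_on_duplicate_keys(ordered_pairs):
    """Convert duplicate keys to arrays (same hook as above)."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            if type(d[k]) is list:
                d[k].append(v)
            else:
                d[k] = [d[k], v]
        else:
            d[k] = v
    return d

raw = '{"members": {"1": {"firstName": "fName"}, "1": {"firstName": "f_Name"}}}'
data = json.loads(raw, object_pairs_hook=array_on_duplicate_keys)
print(data['members']['1'])
# [{'firstName': 'fName'}, {'firstName': 'f_Name'}]
```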
ferdymercury

Alternatively, if you want to catch all the duplicate keys (per level), you can use a `collections.Counter`:

from collections import Counter

class KeyWatcher(dict):

    def __init__(self, *args):
        duplicates = [d for d, i in Counter([pair[0] for pair in args[0]]).items() if i > 1]
        if duplicates:
            raise KeyError("Can't add duplicate keys {} to a json message".format(duplicates))
        self.update(args[0])

json.loads(raw_post_data, object_pairs_hook=KeyWatcher)
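The same Counter-based check can also be written as a plain hook function instead of a dict subclass (a minimal sketch; the function name is illustrative). Because `object_pairs_hook` runs once per object, duplicates are caught at any nesting level:

```python
import json
from collections import Counter

def reject_duplicates(ordered_pairs):
    """Raise if any key occurs more than once in this object."""
    counts = Counter(k for k, _ in ordered_pairs)
    duplicates = [k for k, n in counts.items() if n > 1]
    if duplicates:
        raise KeyError("Can't add duplicate keys {} to a json message".format(duplicates))
    return dict(ordered_pairs)

try:
    # The duplicate "k" sits one level down; the hook still sees it.
    json.loads('{"a": {"k": 1, "k": 2}}', object_pairs_hook=reject_duplicates)
except KeyError as e:
    print('caught:', e)
```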
autholykos
  • 836
  • 8
  • 14
  • Your counter is counting the number of occurrences, so to find keys that appear more than once (i.e. duplicates) the condition in the list comprehension should be `if i > 1`, not `if i > 0`. – Michael Currie Nov 22 '15 at 21:02
  • Actually, even after that correction it still does not appear to work as advertised. J.F. Sebastian's code worked, though. I recommend using it, even if there appears to be some elegance to this approach in that it uses list comprehensions instead of loops. – Michael Currie Nov 22 '15 at 21:12
  • A quick fix would be `self.update(args[0])`, i.e. without the asterisk. The KeyWatcher is called with just one argument, thus `*args` is not helpful at all. – VPfB Jan 19 '17 at 08:17