
I have a JSON object that is returned from a POST call:

r = requests.post("url", json=data)

I was calling r.json() to get the JSON object, but as I understand it, that creates a dict, which is unordered. I need to preserve the key order.

I saw the solution described here: Items in JSON object are out of order using "json.dumps"?

My challenge is that my starting point is a response object. How do I convert it to JSON with the order preserved?

Adding some more details:

My API call returns an object of form:

[{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
}
]

I have a table with three columns: key01, key02 and keyN.

I need to post this JSON object, after some minor manipulations, to a piece of software, while maintaining that specific order of key01, key02 and keyN.

But as soon as I call response.json(), the order changes. I have tried the OrderedDict approach mentioned in the two other threads, but so far my object looks like this:

b"OrderedDict([('key01','value01'),('key02','value02'),('keyN','valueN')])

How do I get a json which looks like this instead: {"key01":"value01","key02":"value02","keyN":"valueN"}

jwodder
Anubis05
  • Possible duplicate of [Items in JSON object are out of order using "json.dumps"?](https://stackoverflow.com/questions/10844064/items-in-json-object-are-out-of-order-using-json-dumps) – Bailey Parker Jul 18 '18 at 23:23
  • In particular, on <3.7, make `data` an `OrderedDict` – Bailey Parker Jul 18 '18 at 23:23
  • @BaileyParker The OP is asking for the return json to be ordered (accessed by `r.json()`), they're not asking about `data`. – Taku Jul 18 '18 at 23:27
  • Indeed, I misread. I'd contend that relying on the order of JSON keys from a server (especially one you don't control, but even one you do) is pretty fragile. I'd recommend against it. If you want to iterate over the `dict` in a particular order, why not make a list of the keys `keys = ['first', 'second', ...]` and then `for k in keys: r.json()[k]`? – Bailey Parker Jul 18 '18 at 23:30
  • Hi @BaileyParker thanks for your response. I have an internal process which takes a JSON object and maps it to a database table. The order of rows in the table must match the exact sequence in the JSON. Unfortunately I cannot change the intermediate logic (it's a vendor product). The sequence of elements returned from the API is the same as in the table. The response.json() method is changing it. – Anubis05 Jul 18 '18 at 23:42

2 Answers


The requests documentation doesn't cover this in much detail, but reading the source code of the .json() method shows that it is defined as follows:

def json(self, **kwargs):
    r"""Returns the json-encoded content of a response, if any.
    :param \*\*kwargs: Optional arguments that ``json.loads`` takes.
    :raises ValueError: If the response body does not contain valid json.
    """

    if not self.encoding and self.content and len(self.content) > 3:
        # No encoding set. JSON RFC 4627 section 3 states we should expect
        # UTF-8, -16 or -32. Detect which one to use; If the detection or
        # decoding fails, fall back to `self.text` (using chardet to make
        # a best guess).
        encoding = guess_json_utf(self.content)
        if encoding is not None:
            try:
                return complexjson.loads(
                    self.content.decode(encoding), **kwargs
                )
            except UnicodeDecodeError:
                # Wrong UTF codec detected; usually because it's not UTF-8
                # but some other 8-bit codec.  This is an RFC violation,
                # and the server didn't bother to tell us what codec *was*
                # used.
                pass
    return complexjson.loads(self.text, **kwargs)

where complexjson is the standard-library json module, or simplejson if you have that installed.

Knowing that, you can pass keyword arguments to .json() and they will go straight to json.loads(). That means you can do what the answer you linked proposed:


from collections import OrderedDict
r.json(object_pairs_hook=OrderedDict)

json.loads()

object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, the object_pairs_hook takes priority.


simplejson.loads()

object_pairs_hook is an optional function that will be called with the result of any object literal decode with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.

Therefore, either way you will be able to provide the object_pairs_hook keyword argument to r.json().
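To make the difference between the OrderedDict repr and real JSON concrete, here is a minimal sketch using only the standard library. The `body` string is a made-up stand-in for `r.text`, and calling `json.loads()` directly is assumed to be equivalent to `r.json(object_pairs_hook=OrderedDict)`, since the source above forwards the keyword arguments unchanged.

```python
import json
from collections import OrderedDict

# Hypothetical stand-in for r.text; keys are deliberately not alphabetical.
body = '[{"key02": "value02", "key01": "value01", "keyN": "valueN"}]'

# Same call that r.json(object_pairs_hook=OrderedDict) ends up making.
records = json.loads(body, object_pairs_hook=OrderedDict)

# str() gives the Python repr of the OrderedDict, which is not JSON.
as_repr = str(records[0])

# json.dumps() gives JSON text with the keys in their original order.
as_json = json.dumps(records, separators=(",", ":"))
print(as_json)
```

If the posted payload looks like `OrderedDict([...])`, the object was most likely converted with `str()` (or string formatting) rather than `json.dumps()`.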


From the information I received from the comments, you don't even need to parse the json, just do:

text = r.content.decode(requests.utils.guess_json_utf(r.content)).encode('utf-8')

and you can "post" text to wherever you desire.
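One caveat with the one-liner above: as far as I can tell, `requests.utils.guess_json_utf()` returns None when it cannot detect an encoding, and `.decode(None)` would then raise a TypeError. A hedged sketch of the same idea with a fallback follows. `to_utf8_bytes` is a hypothetical helper, not part of requests, and treating an undetected encoding as UTF-8 is an assumption.

```python
from typing import Optional


def to_utf8_bytes(raw: bytes, detected: Optional[str]) -> bytes:
    """Re-encode a raw JSON body as UTF-8 without parsing it."""
    # Assumption: if no encoding was detected, treat the body as UTF-8.
    return raw.decode(detected or "utf-8").encode("utf-8")


# Made-up sample standing in for r.content, encoded as UTF-16 on purpose.
sample = '[{"key02": "value02", "key01": "value01"}]'.encode("utf-16")
payload = to_utf8_bytes(sample, "utf-16")
print(payload)
```

With requests this would presumably be called as `to_utf8_bytes(r.content, requests.utils.guess_json_utf(r.content))`. Because the body is never parsed, the key order cannot change.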

Taku
  • Hi @abccd, thanks for your detailed description. Very helpful. I have added a few more details to the question. Does that help in any way to solve my problem? – Anubis05 Jul 19 '18 at 00:11
  • Still... does `r.json(object_pairs_hook=OrderedDict)` not work? The object it returns is basically the ordered version of your desired `dict`. For Python<=3.5, `dict` are not ordered. – Taku Jul 19 '18 at 00:14
  • Nope :(. It is transforming my json to look like this: b"OrderedDict([('key01','value01'),('key02','value02'),('keyN','valueN')]) WHEREAS I need it to look like this: {"key01":"value01","key02":"value02","keyN":"valueN"}. Thats my original format in which it is in the response object BTW. – Anubis05 Jul 19 '18 at 00:19
  • @Anubis05 from reading your other comment, I think you want `text = r.content.decode(requests.utils.guess_json_utf(r.content)).encode('utf-8')` – Taku Jul 19 '18 at 00:22
  • BTW I am using python 3.5.2 – Anubis05 Jul 19 '18 at 00:22
  • @Anubis05 The code in my last comment basically does the following three steps: *"I would ideally like responseObj ->json ->utf-8 encoding"*, and now you just have to post `text`. – Taku Jul 19 '18 at 00:23

Relying on the order of JSON keys from a server (especially one that you don't control) is very fragile. The JSON RFC (RFC 8259) says:

An object is an unordered collection of zero or more name/value pairs, where a name is a string and a value is a string, number, boolean, null, object, or array.

It also specifically comments:

JSON parsing libraries have been observed to differ as to whether or not they make the ordering of object members visible to calling software. Implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences.

Thus, it is RFC compliant for an implementation (on the server) to change how it orders the produced JSON.

If you don't know whether the server is using a serialization library that guarantees order, then this could break in the future (if the library changes). Even if you do, if that library takes the server language's equivalent of a dict, upgrading the language or standard library could change the semantics of that dict such that the ordering changes (and your code breaks). As an example, Python dicts went from arbitrary order (3.5 and earlier) to insertion order, which was a CPython implementation detail in 3.6 and became a language guarantee in 3.7. In other languages such as Rust, which seed the hash function used by their hashmaps to prevent DoS attacks, the ordering could depend on the randomness used to seed these hash functions (decided at runtime, and possibly different if you, say, restart the server).

If you know that you need the data in a certain order, it's much safer to construct it that way yourself:

from collections import OrderedDict

ORDERED_KEYS = ['first', 'second', 'third']
payload = r.json()  # parse the response body once, not once per key
ordered_json = OrderedDict((k, payload[k]) for k in ORDERED_KEYS)

From your comment, it seems like you need that dictionary serialized again. If you use json.dumps on an OrderedDict, the serialization will be in insertion order:

import json

serialized_ordered_json = json.dumps(ordered_json)
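Since the response in the question is a list of objects rather than a single object, the same idea would need to be applied per record. A minimal sketch, assuming every record contains all three keys; `records` is a made-up stand-in for `r.json()`, and `reorder` is a hypothetical helper:

```python
import json
from collections import OrderedDict

ORDERED_KEYS = ["key01", "key02", "keyN"]

# Hypothetical stand-in for r.json(); keys arrive in arbitrary order.
records = [
    {"keyN": "valueN", "key01": "value01", "key02": "value02"},
    {"key02": "b", "keyN": "c", "key01": "a"},
]


def reorder(record, keys=ORDERED_KEYS):
    """Return a copy of record with its keys in the given order."""
    # Raises KeyError if a record lacks one of the expected keys.
    return OrderedDict((k, record[k]) for k in keys)


serialized = json.dumps([reorder(rec) for rec in records])
print(serialized)
```

The result is ordinary JSON text in the fixed column order, regardless of how the server or the parser ordered the keys.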
Bailey Parker
  • I just tried this approach. This does what you described. But now my json object looks like this: b"OrderedDict([('key01','value01'),('key02','value02'),('keyN','valueN')]) . Which however MUST look like this: {"key01":"value01","key02":"value02","keyN":"valueN"}. Thats the only format which the next process accepts. I am getting my initial json in the same format as is expected at next steps. – Anubis05 Jul 19 '18 at 00:09
  • Or in another way to describe. I would ideally like responseObj ->json ->utf-8 encoding ->post the json . The json at last step need to look like this {"key01":"value01","key02":"value02","keyN":"valueN"} which is what I am getting in 2nd step. But since I am doing a response.json() to arrive at step 2, it is messing up the order. Can't I just take a json object as is from a response object, apply utf-8 and pass along to next steps? – Anubis05 Jul 19 '18 at 00:15
  • See my edit for how to get this back in a serialized form. – Bailey Parker Jul 19 '18 at 00:25