How do I maintain the same structure when reading from, modifying and writing back to a JSON file?

Question

I am currently reading in a JSON file, adding a key, and writing it back out to the same file using this procedure:

import json

with open('data.json', 'r+') as f:
    data = json.load(f)
    temp_key = {"test": "val"}
    data["test"]["new_key"] = temp_key
    f.seek(0)        # <--- should reset file position to the beginning.
    json.dump(data, f, indent=2)
    f.truncate()     # remove remaining part

(adapted from here)

The issue is that it does not maintain the key order. For instance, if I read in:

{
  "test": {
    "something": "something_else"
  },
  "abc": {
    "what": "huh"
  }
}

the output turns out as:

{
  "abc": {
    "what": "huh"
  },
  "test": {
    "something": "something_else",
    "new_key": {
      "test": "val"
    }
  }
}

whereas I would like it to be:

{
  "test": {
    "something": "something_else",
    "new_key": {
      "test": "val"
    }
  },
  "abc": {
    "what": "huh"
  }
}

I realise that JSON is a key/value based structure and the order does not matter, but is there a way of making the modification and maintaining the original structure?

@user2883071 As you mentioned, order is not part of the JSON spec, so why does it matter to you? If you rely on your JSON data being ordered then this isn't valid anyway. — a_guest, Oct 23 '19 at 22:00
@a_guest It matters if you want to have a readable diff between the previous and the new version of the file — wim, Oct 24 '19 at 01:25
@wim That purely relies on implementation details. If the `json` module shuffled each dict before dumping that would be perfectly valid too (according to the JSON specs). If you want to diff the _data_ (i.e. the contained information, not the files) then you should first parse, then diff the parsed data structures. A diff between two JSON _files_ is not meaningful. — a_guest, Oct 25 '19 at 09:53
@a_guest I get it. Yet in practice many people use json format for configuration files, and store these files in git. Common tools such as github’s web UI don’t offer you an order-agnostic diff, so it’s pretty understandable to want an order-preserving load-and-dump roundtrip. Pragmatic. — wim, Oct 25 '19 at 14:12

martineau · Accepted Answer · 2019-10-24T01:22:48.183

As I said in a comment, you can pass collections.OrderedDict as the optional object_pairs_hook keyword argument accepted by json.load() (available since Python 2.7) to preserve the order of the original data when you rewrite the file.

This is what I meant:

#!/usr/bin/env python2
from collections import OrderedDict
import json


with open('ordered_data.json', 'r+') as f:
    data = json.load(f, object_pairs_hook=OrderedDict)
    temp_key = {"test": "val"}
    data["test"]["new_key"] = temp_key
    f.seek(0)  # Reset file position to the beginning.
    json.dump(data, f, indent=2)
    f.truncate()  # Remove remaining part.
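
For anyone on Python 3, here is a minimal sketch of the same round trip done on an in-memory string instead of a file, so it can be run without touching disk. The names original_text and add_key_preserving_order are made up for illustration and are not part of the answer above. Note that on Python 3.7 and later plain dicts keep insertion order, so json.load() should already preserve key order without the hook; passing OrderedDict is harmless there and keeps the behaviour explicit on older versions.

```python
import json
from collections import OrderedDict

# Hypothetical stand-in for the contents of the JSON file from the question.
original_text = '''{
  "test": {
    "something": "something_else"
  },
  "abc": {
    "what": "huh"
  }
}'''


def add_key_preserving_order(text):
    """Parse JSON text, add a nested key, and serialize it in the original key order."""
    # object_pairs_hook is called with each object's (key, value) pairs
    # in the order they appear in the document.
    data = json.loads(text, object_pairs_hook=OrderedDict)
    data["test"]["new_key"] = {"test": "val"}
    return json.dumps(data, indent=2)


updated_text = add_key_preserving_order(original_text)
print(updated_text)
```

The new key is appended at the end of the "test" object, and "test" still comes before "abc", which is the layout asked for in the question.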
