I build a JSON object for each user from a data stream, so every user ends up with one JSON object like this:

{
    "entities": {
        "description": {
            "urls": []
        }
    },
    "utc_offset": -10800,
    "id"
    "name": "Tom",
    "hit_count": 7931,
    "private": false,
    "active_last_month": false,
    "location": "",
    "contacted": false,
    "lang": "en",
}

Objective: I want to construct a JSON file in which each JSON object becomes an entry in the file, with indentation. When it comes to reading the JSON file back, it should be readable using with open:

For example, the following file:

[
{
    "entities": {
        "description": {
            "urls": []
        }
    },
    "utc_offset": -10800,
    "id"
    "name": "Tom",
    "hit_count": 7931,
    "private": false,
    "active_last_month": false,
    "location": "",
    "contacted": false,
    "lang": "en",
}
,
{
    "entities": {
        "description": {
            "urls": []
        }
    },
    "utc_offset": -10500,
    "id"
    "name": "Mary",
    "hit_count": 554,
    "private": false,
    "active_last_month": false,
    "location": "",
    "contacted": false,
    "lang": "en",
}
]

The above file can easily be read with:

with open(pathToFile) as json_file:
    json_data = json.load(json_file)
    for key in json_data:
        print(key["id"])

But at the moment, this is how I am writing the JSON file:

with open(root + filename + '.txt', 'w+') as json_file:
    # convert from Python dict-like structure to JSON format
    jsoned_data = json.dumps(data)
    json_file.write(jsoned_data)
    json_file.write('\n')

This gives me

{
   indented json data
}
{
   indented json data
}

PS: Notice that the enclosing brackets [] and the separating commas are missing.

When you try to read this file with the same code as before:

with open(pathToFile) as json_file:
    json_data = json.load(json_file)
    for key in json_data:
        print(key["id"])

you end up getting an error: ValueError: Extra data: line 101 column 1 - line 1889 column 1

add-semi-colons
  • Not possible. You can't blindly append to a JSON file without breaking it. JSON is a serialized data structure, it is not plain text. If you want to modify it you will have to read file / parse / modify / serialize / write file, everything else will be an ugly hack that is bound to break at some point. You might want to look at document-oriented databases (something like CouchDB) for this kind of CRUD work. – Tomalak Aug 12 '16 at 16:27
  • The example with brackets and `,` was a file downloaded from the web. I know I am asking you to guess, but do you think it came from a CouchDB or MongoDB dump? – add-semi-colons Aug 12 '16 at 16:46
  • It does not matter where it came from. JSON files are atomic, you can only handle them as a whole or not at all. If you want to handle the individual bits inside them individually, use a tool that has been made for this task. Alternatively use many small files. – Tomalak Aug 12 '16 at 16:55
  • @Tomalak: perfectly possible, just not very desirable. See [Loading and parsing a JSON file with multiple JSON objects in Python](//stackoverflow.com/q/12451431). JSON Lines would make this much easier, adding newlines between the JSON documents, and making sure the documents themselves are newline-free. – Martijn Pieters Jan 03 '19 at 14:27
  • @Martijn But that would be a new file format that is no longer JSON. :) – Tomalak Jan 03 '19 at 15:22
  • @Tomalak: see http://jsonlines.org/, it's a subset of JSON (only change is that there are no newlines used in formatting the document, and only UTF-8 is used, no other UTF encodings), embedded in a newline-separated file. – Martijn Pieters Jan 03 '19 at 15:26
  • The question was *"How to store multiple JSON strings in the same file, then load that file with `json.load`?"* and my "not possible" strictly referred to that notion. Introducing constraints to the JSON (thereby violating the spec, if only mildly, like in this case) and *not* using `json.load` is certainly possible, no argument there at all. – Tomalak Jan 03 '19 at 16:07
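
As a rough sketch of the two options raised in the comments above, assuming Python 3: the first two functions write and read a JSON Lines style file (one compact JSON document per line), and the last one reads back-to-back indented objects like the output shown in the question. The names `append_jsonl`, `read_jsonl`, and `read_concatenated` are made up for illustration; only `json.dumps`, `json.loads`, and `json.JSONDecoder.raw_decode` come from the standard library.

```python
import json

# Hypothetical helper names; Python 3 assumed.


def append_jsonl(path, record):
    """Append one record as a single compact line (JSON Lines style)."""
    with open(path, "a", encoding="utf-8") as f:
        # No indent: json.dumps escapes newlines inside strings,
        # so each document stays on exactly one line.
        f.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Yield one parsed object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def read_concatenated(text):
    """Yield objects from a string of back-to-back JSON documents,
    such as indented objects written one after another without commas."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

Neither kind of file can be read with a single `json.load` call: the JSON Lines file is parsed line by line, and the concatenated file is walked one document at a time with `raw_decode`.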

1 Answer

I think you should add the new JSON data to a list containing the previous data:

import json

newData = {
    "entities": {
        "description": {
            "urls": []
        }
    },
    "utc_offset": -10500,
    "id": 3,
    "name": "Mary",
    "hit_count": 554,
    "private": False,
    "active_last_month": False,
    "location": "",
    "contacted": False,
    "lang": "en"
}

def readFile():
    # 'testJson' must already contain a JSON array of objects
    with open('testJson', 'r') as json_file:
        json_data = json.load(json_file)
        for record in json_data:
            print(record["id"])

def writeFile():
    # load the existing list, append the new object, then rewrite the whole file
    with open('testJson', 'r') as json_file:
        oldData = json.load(json_file)
    oldData.append(newData)  # append() modifies the list in place and returns None
    with open('testJson', 'w') as json_file:
        # convert from Python dict-like structure to JSON format
        json_file.write(json.dumps(oldData, indent=4))

if __name__ == '__main__':
    readFile()
    writeFile()
    readFile()
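
The code above assumes that 'testJson' already exists and holds a JSON array. A minimal sketch of the same read / append / rewrite idea that also covers the first write might look like this, assuming Python 3; `append_record` is a hypothetical helper name, not part of the `json` module:

```python
import json
import os


def append_record(path, record):
    """Append `record` to the JSON array stored at `path` (hypothetical helper)."""
    if os.path.exists(path):
        with open(path) as f:
            records = json.load(f)
    else:
        records = []  # first write: start a new array
    records.append(record)  # list.append mutates in place and returns None
    with open(path, "w") as f:
        # The whole array is rewritten, so the file stays one valid JSON document.
        json.dump(records, f, indent=4)
    return records
```

Because the file always holds a single JSON array, it can still be read back with one `json.load` call. The cost is that the whole file is rewritten on every append.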
Robert Wisner