I have a JSON file that is formatted like this (shown on multiple lines for clarity):

(line 0001).......

{
    "_id": "iD_0001",
    "skills": [{
        "name": "Project Management"
    }, {
        "name": "Business Development"
    }]
}

.... (line 9999)

{
    "_id":"iD_9999",
    "skills": [{
        "name": "Negotiation"
    }, {
        "name": "Banking"
    }]
}

I'd like to run a program on it, but the program cannot read the file in the format above. I'd therefore like to convert it to:

[{
    "_id": "iD_0001",
    "skills": [{
        "name": "Project Management"
    }, {
        "name": "Business Development"
    }]
},{
    "_id":"iD_9999",
    "skills": [{
        "name": "Negotiation"
    }, {
        "name": "Banking"
    }]
}]

Essentially, I want to put all entries in a single array. Is there a way to do that using Python or demjson?

ALTERNATIVE: I wrote a program that fetches the skills from these JSON files and writes them to a text file (Test.txt), but it only works for the second format, not the first. Can you suggest a modification to make it work for the first format (above)? This is my program:

import json

with open('Sample.json') as data_file:
    data = json.load(data_file)

# The with block closes the file, so no explicit f.close() is needed
with open('Test.txt', 'w') as f:
    for x in data:
        for y in x["skills"]:
            f.write(y["name"])

SOLUTION

Thank you to Antti Haapala for noticing that the first format is a concatenation of JSON objects, and to Walter_Ritzel and Josh J for suggesting alternative answers. Since the first format is a concatenation of individual objects, one per line, the program works if we load the first JSON file line by line instead of as a whole. I have done that with:

import json

data = []
with open('Sample1-candidats.json') as data_file:
    # Each line holds one complete JSON object
    for line in data_file:
        data.append(json.loads(line))

with open('Test.txt', 'w') as f:
    for x in data:
        for y in x["skills"]:
            f.write(y["name"])
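
For the original goal of putting all entries in a single array, here is a minimal sketch that rewrites a one-object-per-line file as one JSON array without holding every record in memory. The function name `ndjson_to_array` and both file paths are hypothetical, and it assumes each non-empty line is a complete JSON object:

```python
import json


def ndjson_to_array(src_path, dst_path):
    """Rewrite a one-object-per-line file as a single JSON array.

    Assumption: every non-empty line of src_path is one complete JSON object.
    Returns the number of records written.
    """
    count = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.write("[")
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)  # raises ValueError on a malformed line
            if count:
                dst.write(",\n")  # separator before every record but the first
            dst.write(json.dumps(record))
            count += 1
        dst.write("]\n")
    return count
```

Calling `ndjson_to_array('Sample1-candidats.json', 'Sample-array.json')` (the output name is made up) should produce a file that `json.load` can read in one go.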
Jasp B
  • The first one is not a single JSON file. `json.load` loads just one object from file at a time. The first one is catenation of multiple JSON objects. – Antti Haapala -- Слава Україні Mar 14 '16 at 17:26
  • Please check this: http://stackoverflow.com/questions/8730119/retrieving-json-objects-from-a-text-file-using-python – Walter_Ritzel Mar 14 '16 at 17:39
  • In your example file, is there a literal `.....` in between each record or were you using that as an example to shorten the copy/paste? – Josh J Mar 14 '16 at 17:44
  • the `.....` is to shorten the copy-paste. not literal – Jasp B Mar 14 '16 at 19:34
  • Just one small observation: your solution assumes information that you have not shared on your question (the fact that you have one json object per line). The way you have presented the json sample, we have assumed that you have \n characters breaking the lines. – Walter_Ritzel Mar 14 '16 at 19:52
  • You are right, I will edit the post soon to present the answers more comprehensively. – Jasp B Mar 14 '16 at 19:58

2 Answers

Here it goes. This assumes that your file is just a series of individual JSON objects concatenated together, and that you need to transform it into a list of JSON objects.

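import json

with open('sample.json') as data_file:
    # Wrap the concatenated objects in brackets and separate them with commas.
    # Assumes each object ends with '}' directly followed by a newline and '{'.
    strData = '[' + data_file.read().replace('}\n{', '},{') + ']'
    # Use json.loads rather than eval: eval executes arbitrary code and
    # fails on the JSON literals true, false and null.
    data = json.loads(strData)

with open('Test.txt', 'w') as f:
    for x in data:
        for y in x["skills"]:
            f.write(y["name"])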
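
If the objects are not reliably separated by exactly `}\n{`, a variation that may be more robust is to let the standard library's `json.JSONDecoder.raw_decode` find where each object ends. This is only a sketch: the helper name `iter_concatenated` is made up, and it still assumes the whole text fits in memory.

```python
import json


def iter_concatenated(text):
    """Yield each JSON value from a string of concatenated JSON values.

    Values may be separated by any amount of whitespace, including none.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index where it stopped
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

With this, `data = list(iter_concatenated(open('sample.json').read()))` would replace the string manipulation above, and it also copes with objects that span several lines.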
Walter_Ritzel

Here are the steps you can take to solve your problem. Since it kind of sounds like a homework assignment, I will give you the logic and pointers but not the code.

  1. Open the file for reading
  2. Read file into string variable (if small enough for memory limits)
  3. Create empty list for output
  4. Split string on .....
  5. json.loads each piece of resulting list
  6. Append each result to your empty output list
  7. Have a cup of coffee to celebrate
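
For readers who want to see steps 1 to 6 spelled out, they could be sketched roughly as follows. Since the `.....` in the question turned out not to be a literal separator, this sketch assumes one JSON object per line and splits on newlines instead; the function name `load_records` is hypothetical.

```python
import json


def load_records(path):
    """Load a file of JSON objects into a list, following the steps above.

    Assumption: one complete JSON object per line.
    """
    with open(path) as f:               # 1. open the file for reading
        text = f.read()                 # 2. read the file into a string
    records = []                        # 3. create an empty output list
    for piece in text.splitlines():     # 4. split the string (on newlines here)
        if piece.strip():               # ignore blank pieces
            obj = json.loads(piece)     # 5. json.loads each piece
            records.append(obj)         # 6. append the result to the list
    return records
```

Because step 2 reads everything at once, this only suits files that fit comfortably in memory.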
Josh J
  • Unfortunately this is not homework, and the files might become too large for memory limits. I see your reasoning though, thank you! – Jasp B Mar 14 '16 at 19:39