0

I would like to ask if there is an easy way to modify JSON by using Python?

I have found some of the relevant topic- How to update json file with python But could not figure out the solution for my current issue.

Currently, JSON looks like this:

    {
         "X": [
           {
               "sample_topic_x":"sample_content_x_1",
               ...
           }
           {
               "sample_topic_x":"sample_content_x_2",
               ...
           }
           ......
         ]
         "Y": [
           {
               "sample_topic_y":"sample_content_y_1",
               ...
           }
           {
               "sample_topic_y":"sample_content_y_2",
               ...
           }
           ......
         ]
}

Required: To be accepted by BQ / Need to remove "Y", keep only "X" in this format.

{"sample_topic_x":"sample_content_x_1",.....}
{"sample_topic_x":"sample_content_x_2",.....}
{"sample_topic_x":"sample_content_x_3",.....}

Any relevant documentation, topics?

P.S> Update 1.0

import json
json_path = 'C:\XXX\exportReport.json'

def updateJsonFile():
jsonFile = open(json_path, "r") # Open the JSON file for reading
data = json.load(jsonFile) # Read the JSON into the buffer
jsonFile.close() # Close the JSON file

updateJsonFile()
Den
  • 171
  • 3
  • 5
  • 13
  • Your required format is not a valid JSON. – Rakmo Jun 18 '18 at 10:28
  • @OmkarDeshpande This file generated by using API, and thus the output of JSON file. I need this specific format, as BigQuery accept only this way (As tested), but this could be not right as well. – Den Jun 18 '18 at 10:44
  • `json.load` will not work in the required format. – Rakmo Jun 18 '18 at 10:49
  • @OmkarDeshpande What could I do make this happens? – Den Jun 18 '18 at 10:51
  • Check my answer. – Rakmo Jun 18 '18 at 10:59
  • Have a look at the [json](https://docs.python.org/3/library/json.html) module. With loads you could read the json, do your modifications on Python dicts or lists and then write it back using dumps. – malukas Jun 18 '18 at 10:15

2 Answers2

0

Solution:

import json
json_path = 'C:\XXX\exportReport.json'
output_path = 'C:\XXX\your_output_file.txt'

with open(json_path) as f:
    data = json.loads(f.read())

# Opening output file in append mode
# Note: Output file is not JSON, as the required format is not valid json
with open(output_file, "a+") as op:
    for element in data.get('X'):
        op.write(json.dumps(element) + "\n")

Explanation:

Load the input json file using json.loads. The output file will be a plain text file and not a JSON file as the required output format is not a valid JSON. Use a .txt file for storing the output. Store value of json.loads() in data. To get inner element X which is a list of dictionaries, use data.get('X'), which will return list. Iterate over it and write json.dumps() to the output file, each element in a newline.

C:\XXX\exportReport.json

{

    "X": [
       {
           "sample_topic_x":"sample_content_x_1",
           ...
       }
       {
           "sample_topic_x":"sample_content_x_2",
           ...
       }
       ......
     ]
     "Y": [
       {
           "sample_topic_y":"sample_content_y_1",
           ...
       }
       {
           "sample_topic_y":"sample_content_y_2",
           ...
       }
       ......
     ]
}

C:\XXX\your_output_file.txt

{"sample_topic_x":"sample_content_x_1",.....}
{"sample_topic_x":"sample_content_x_2",.....}
{"sample_topic_x":"sample_content_x_3",.....}
Rakmo
  • 1,926
  • 3
  • 19
  • 37
-1

You need extract parent data at first. And definite this to a variable. And search "X" data in this variable.

  • I understand the logic how this could be done, but do you know the documentation about JSON structure modification with Python? As I could not find any relevant source to modify structure, but alot about change the content. – Den Jun 18 '18 at 10:11