I have JSON files that each contain only a string. I want to update each file so that it holds key-value pairs, with the existing string stored as the value of a new key. I want to do this only if the file contains nothing but a string.

At the same time, I want to add a new column id, where id is a number that is assigned automatically based on the number of files provided.

I am not sure how to approach this :(
How can I write a Python script that makes these changes?

Example:

  • File1.json contains
    "I\nhave\na\ncat"
    
    Expected output: (File1.json)
    {"id": "1", "string": "I\nhave\na\ncat"}
    
  • File2.json
    "I\nhave\na\ndream"
    
    Expected output: (File2.json)
    {"id": "2", "string": "I\nhave\na\ndream"}
    
Gino Mempin
Yumiko
  • To clarify your terminology, JSON structure/format has no "columns". It's a hierarchical structure containing key-value pairs: https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Objects/JSON#json_structure. In Python, it's represented as a dictionary. – Gino Mempin May 21 '22 at 01:07
  • Welcome to Stack Overflow. Where exactly are you stuck? Are you able to read and write JSON files? Do you understand what you get from reading the file with the standard library JSON module? Do you understand what kind of data you should prepare, in order to write a file with the standard library JSON module? Given the string in question, do you know how to create the dictionary you want? – Karl Knechtel May 21 '22 at 01:13
  • As an aside, the input `.json` files you describe *are* valid JSON - they're just not how the format is normally used. – Karl Knechtel May 21 '22 at 01:20

1 Answer

To work with JSON data and files in Python, use the json module. It has functions for:

  • load-ing JSON data from files and converting it to Python objects
  • dump-ing Python objects to files in JSON format

When loaded in Python, a JSON object becomes a regular dictionary and a JSON string becomes a str. So you simply have to read the string from the file, put it into a dictionary together with any other key-value pairs you want, then dump that dictionary back to the file.
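As a minimal sketch of that read, wrap, and write-back cycle for a single file: the temporary directory and the hard-coded id of 1 are assumptions made only so the example can run on its own.

```python
import json
import os
import tempfile

# Hypothetical scratch copy standing in for File1.json
path = os.path.join(tempfile.mkdtemp(), "File1.json")
with open(path, "w") as f:
    f.write('"I\\nhave\\na\\ncat"')  # the file holds only a JSON string

with open(path) as f:
    contents = json.load(f)  # a Python str containing real newlines

# Wrap the string in a dictionary (the id value is an assumption here)
data = {"id": 1, "string": contents}

with open(path, "w") as f:
    json.dump(data, f)  # written back as a JSON object

with open(path) as f:
    raw = f.read()
print(raw)  # {"id": 1, "string": "I\nhave\na\ncat"}
```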

Since you only want to do this conversion if the file contains only a string, first call json.load, then use isinstance to check whether the result is a dictionary. If it is, the file already has the key-value structure you want, so you don't have to do anything. If it is not, and the contents were loaded as a string, continue with the processing.
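To see what the isinstance check is working with, here is a small sketch of the types json.loads returns for different inputs. The helper name needs_wrapping is hypothetical, and it tests for str directly (a slightly stricter variant of checking for "not a dict"), so that lists or numbers would be left alone too.

```python
import json


def needs_wrapping(raw_text):
    """Return True if the JSON text decodes to a bare string."""
    return isinstance(json.loads(raw_text), str)


print(needs_wrapping('"I\\nhave\\na\\ncat"'))       # True
print(needs_wrapping('{"id": 1, "string": "x"}'))  # False
print(needs_wrapping("[1, 2, 3]"))                 # False
```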

Lastly, since you want to overwrite the same file, open it in "r+" mode, which allows both reading and writing, and seek back to the start of the file before writing the new contents.

import json

# Assume the script and the files are in the same directory
filepaths = [
    "File1.json",  # Contains "I\nhave\na\ncat"
    "File2.json",  # Contains "I\nhave\na\ndream"
    "File3.json",  # Already a JSON object
]

# Process each file one-by-one, numbering them from 1
for file_id, filepath in enumerate(filepaths, start=1):
    with open(filepath, "r+") as f:
        contents = json.load(f)
        if isinstance(contents, dict):
            # Already a key-value structure, nothing to fix
            continue
        # Wrap the bare string in a dictionary and overwrite the file.
        # The id is stored as a string, as in the expected output.
        data = {"id": str(file_id), "string": contents}
        f.seek(0)
        json.dump(data, f)
        f.truncate()  # Drop any leftover bytes from the old contents
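
One thing to watch with "r+" mode: seeking to the start and writing does not shorten the file. In this question the new contents are always longer than the old ones, so nothing is left over, but if the new contents could ever be shorter, calling truncate after the write avoids trailing garbage. A small sketch, using a hypothetical scratch file:

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "demo.json")
original = '"a fairly long original string"'

# Case 1: overwrite from the start without truncating
with open(path, "w") as f:
    f.write(original)
with open(path, "r+") as f:
    f.seek(0)
    f.write('"short"')
with open(path) as f:
    without_truncate = f.read()  # the tail of the old text is still there

# Case 2: same overwrite, followed by truncate()
with open(path, "w") as f:
    f.write(original)
with open(path, "r+") as f:
    f.seek(0)
    f.write('"short"')
    f.truncate()  # cut the file off at the current position
with open(path) as f:
    with_truncate = f.read()

print(len(without_truncate), len(with_truncate))
```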
Gino Mempin
  • Thanks Gino. If I want to start from specific id number such as 600 onwards then what changes I can make to your code? – Yumiko May 26 '22 at 20:26
  • @Yumiko Check the docs for `enumerate`, it accepts a `start` keyword parameter, which by default is 0, then set it to whichever start you need: https://docs.python.org/3/library/functions.html#enumerate – Gino Mempin May 26 '22 at 21:11
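
For illustration, a short sketch of the start parameter mentioned in the comment above, using the 600 from the question and the same file names as the example list:

```python
filepaths = ["File1.json", "File2.json", "File3.json"]

# enumerate counts from the given start value instead of 0
pairs = list(enumerate(filepaths, start=600))
print(pairs)
# [(600, 'File1.json'), (601, 'File2.json'), (602, 'File3.json')]
```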