0

I have a JSON file in non-standard format, like this:

{
    "color": "black",
    "category": "hue",
    "type": "primary"
}
{
    "color": "white",
    "category": "value",
    "type": "idk"
}
{
    "color": "red",
    "category": "hue",
    "type": "primary"
}

Except it has over 10,000 entries formatted this way. I have to access the color and type for each separate part and create a string for each that says "type color" like "primary black".

I wanted to edit the file using Python to look like this:

[
    {
        "color": "black",
        "category": "hue",
        "type": "primary"
    },
    {
        "color": "white",
        "category": "value",
        "type": "idk"
    },
    {
        "color": "red",
        "category": "hue",
        "type": "primary"
    }
]

That way I can access the values with [] and use a for-loop over all of them. How would I do this? Is there a better way to accomplish this than editing the JSON file?

Tahirah A

2 Answers

0

Either a file is JSON or it isn't, and the example file you gave isn't. It's inaccurate and confusing to refer to it as "nonstandard JSON".

That said, here's an outline of how I would use Python to convert that file to truly be JSON:

Open a new file for writing.
Write "[" to the output file.
Open your existing file for reading.
For each line in the file:
    Write that line to the output file.
    If that line is "}", but it is not the last line, also write a comma.
Write "]" to the output file.
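The outline above could be sketched in Python roughly as follows. This is only an illustration, not a tested tool: the function name wrap_objects_as_array is made up for this example, and it assumes each closing brace sits alone on its own line, as in the sample from the question. It reads the whole input into memory so that it can tell which "}" is the last line, which should be fine for around 10,000 entries.

```python
def wrap_objects_as_array(src, dst):
    """Copy lines from src to dst, wrapping them in [ ] and adding a comma
    after every closing brace except the final one.

    src and dst are open text file objects (hypothetical helper for this sketch).
    """
    lines = src.read().splitlines()

    # Ignore trailing blank lines so the final "}" counts as the last line.
    while lines and not lines[-1].strip():
        lines.pop()

    dst.write("[\n")
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        if line.strip() == "}" and not is_last:
            # A closing brace between two objects needs a separating comma.
            dst.write(line + ",\n")
        else:
            dst.write(line + "\n")
    dst.write("]\n")
```

To use it, open the existing file for reading and a new file for writing, then pass both file objects to the function. The result can then be loaded with json.load and looped over like any list.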
John Gordon
  • You don't really need Python for that. `sed '1s/^/[/;$!s/}/},/;$s/$/]/' file >file.json` – tripleee Jun 21 '18 at 03:27
  • The above actually has a bug fix to your pseudocode; don't add a comma if this is the last line. – tripleee Jun 21 '18 at 03:28
  • See also https://stackoverflow.com/questions/35021524/how-can-i-add-a-comma-at-the-end-of-every-line-except-the-last-line/35021663#35021663 – tripleee Jun 21 '18 at 03:29
  • @tripleee Thanks for the bug spotting! I have edited my answer. – John Gordon Jun 21 '18 at 03:30
  • Thanks, I've never used a JSON file before and the person who gave it said it was JSON ¯\_(ツ)_/¯ but I should have known. @tripleee the link you gave adds a comma to each line so it doesn't work with my code. If there's a way to edit it to only add commas to lines ending with }, that would be very helpful – Tahirah A Jun 21 '18 at 03:47
  • The comment above has a version which adds a comma only to lines with a close brace. The link is just to help you understand what it does. Some people call this JSONS for a "serial" stream of JSON fragments. – tripleee Jun 21 '18 at 04:02
0

You can use str.replace to add the commas and str.format to wrap the result in brackets:

In [340]: import json

In [341]: with open('f.json', 'r') as f:
     ...:     string = f.read()
     ...:

In [342]: string
Out[342]: '{\n    "color": "black",\n    "category": "hue",\n    "type": "primary"\n}\n{\n    "color": "white",\n    "category": "value",\n    "type": "idk"\n}\n{\n    "color": "red",\n    "category": "hue",\n    "type": "primary"\n}\n'

In [343]: string = string.replace('}','},') # Must split each '{}' with a comma

In [344]: string
Out[344]: '{\n    "color": "black",\n    "category": "hue",\n    "type": "primary"\n},\n{\n    "color": "white",\n    "category": "value",\n    "type": "idk"\n},\n{\n    "color": "red",\n    "category": "hue",\n    "type": "primary"\n},\n'

In [345]: string = string[:-2] # Get rid of the trailing comma and the newline after it

In [346]: string
Out[346]: '{\n    "color": "black",\n    "category": "hue",\n    "type": "primary"\n},\n{\n    "color": "white",\n    "category": "value",\n    "type": "idk"\n},\n{\n    "color": "red",\n    "category": "hue",\n    "type": "primary"\n}'

In [347]: json.loads('[{0}]'.format(string))
Out[347]:
[{'color': 'black', 'category': 'hue', 'type': 'primary'},
 {'color': 'white', 'category': 'value', 'type': 'idk'},
 {'color': 'red', 'category': 'hue', 'type': 'primary'}]
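If you would rather not edit the text at all, one alternative (a different technique from the replace-and-format steps above) is to let the standard library decode one object at a time with json.JSONDecoder.raw_decode, which returns the parsed value together with the index where it stopped. The sketch below is a minimal illustration; the helper names iter_json_objects and type_color_labels are made up for this example, and it assumes every object has "type" and "color" keys as in the question.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so step over it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def type_color_labels(text):
    """Build the "type color" strings the question asks for, e.g. "primary black"."""
    return ["{} {}".format(obj["type"], obj["color"]) for obj in iter_json_objects(text)]
```

Passing the string read from the file to type_color_labels gives the list of labels directly. Unlike the replace approach, this does not depend on where the braces or newlines fall, so a "}" inside a string value would not break it.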
aydow