
Goal: Merge JSON files into one big file

Background: I am using the code below, taken from the question "Issue with merging multiple JSON files in Python":

import json
import glob

result = []
for f in glob.glob("/Users/EER/Desktop/JSON_Combo/*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))

with open("merged_file.json", "wb") as outfile:
     json.dump(result, outfile)

However, I get the following error:

JSONDecodeError: Extra data: line 2 column 1 (char 5733)

I checked "Python json.loads shows ValueError: Extra data", "JSONDecodeError: Extra data: line 1 column 228 (char 227)", and "ValueError: Extra Data error when importing json file using python", but those cases are a bit different from mine. A potential reason for the error seems to be that my .json files are lists of strings, but I am not sure.

Question: Any thoughts on how to fix this error?

  • It sounds like one of your files is not valid JSON. I would recommend putting in a `try ... except...` and printing the file name in the `except` block to see which one is bad. – hoyland Mar 03 '18 at 17:17

1 Answer


At least one of your files contains invalid JSON. Find out which one by catching the error with `try ... except` and printing the file name:

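import json
import glob

result = []
for f in glob.glob("/Users/EER/Desktop/JSON_Combo/*.json"):
    with open(f, "rb") as infile:
        try:
            result.append(json.load(infile))
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            print(f)  # name of the file that failed to parse

# Text mode ("w", not "wb"): in Python 3, json.dump writes str, not bytes
with open("merged_file.json", "w") as outfile:
    json.dump(result, outfile)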
Gabriel B.R
  • I tried the code above and I get the following error: `NameError: name 'JSONDecodeError' is not defined` –  Mar 03 '18 at 17:25
  • Thanks for the update. I re-ran the code and now I get the following: `TypeError: a bytes-like object is required, not 'str'` –  Mar 03 '18 at 17:32
  • @EER Try deleting `ValueError` so that the line is just `except:` – Gabriel B.R Mar 03 '18 at 17:35
  • I did so. Now I get `TypeError: a bytes-like object is required, not 'str'` –  Mar 03 '18 at 19:02
  • It seems not to like the `json.dump(result, outfile)` line of code –  Mar 03 '18 at 19:03
  • I tried changing `"rb"` to `"r"` as suggested on SO https://stackoverflow.com/questions/33054527/python-3-5-typeerror-a-bytes-like-object-is-required-not-str-when-writing-t but I still get the same error –  Mar 03 '18 at 19:06
  • I think the problem is that my files aren't real JSON files; rather, each line of the file contains a JSON object. –  Mar 03 '18 at 21:30
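
Following up on the last comment: if each line of a file holds its own JSON value (often called the JSON Lines layout), `json.load` stops after the first value and reports "Extra data" for the rest. Below is a minimal sketch of one way to merge such files, not a confirmed fix for the asker's data. The helper names `load_json_or_json_lines` and `merge_json_files` are hypothetical, and the code assumes the files are UTF-8 text and that any file which is not a single JSON document has exactly one JSON value per non-empty line.

```python
import glob
import json


def load_json_or_json_lines(path):
    """Return a list of the JSON values found in the file at `path`.

    First try to parse the whole file as one JSON document. If that fails,
    fall back to parsing one JSON value per non-empty line.
    """
    with open(path, "r", encoding="utf-8") as infile:
        text = infile.read()
    try:
        return [json.loads(text)]
    except json.JSONDecodeError:
        # Assumption: the file uses one JSON value per line.
        # A line that is itself invalid JSON will still raise here.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def merge_json_files(pattern, out_path):
    """Merge all files matching `pattern` into one JSON array at `out_path`."""
    result = []
    for path in sorted(glob.glob(pattern)):
        result.extend(load_json_or_json_lines(path))
    # Text mode ("w"): json.dump writes str, not bytes, on Python 3.
    with open(out_path, "w", encoding="utf-8") as outfile:
        json.dump(result, outfile)
    return result
```

With the paths from the question, this could be called as `merge_json_files("/Users/EER/Desktop/JSON_Combo/*.json", "merged_file.json")`. Writing the output outside the input folder, as the original code does when run from another directory, keeps a previous merged file from being picked up by the glob on a later run.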