I would like to read a large JSON file that I previously created through web scraping. However, when I try to read the file, I get the following error message:

JSONDecodeError: Expecting ',' delimiter: line 1364567 column 2 (char 1083603504)

However, line 1364567 is the very last line of the file, and it looks correct. I therefore suspect that the actual error is somewhere earlier in the file, for example brackets that are opened but never closed. How can I track down the problem and fix it? I can also provide a link to the file, but it is quite large (1.05 GB).

I use the following code to read the JSON file:

import json

with open("file.json") as f:
    data = json.load(f)

Thank you very much!

Edit: The problem was solved as follows. The end of the JSON file looked normal: a final line with fields and values, followed by a closing bracket ]. json.load complained about a missing comma, i.e. it did not treat that last bracket as the end of the document. Therefore there must have been opening brackets [ earlier in the file that were never closed. Luckily, these were caused by some hiccups in the scraping at the beginning of the file, so a manual search with Sublime Text let me delete those opening brackets and read the file without problems. Anyway, thank you very much for your suggestions; I am sure I will use them the next time I have a problem with JSON!
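
The manual search described in the edit could also be automated. Below is a minimal sketch, not the method actually used here: it streams through the text, skips string literals, and reports brackets left without a partner. The function name find_unbalanced_brackets and the sample data are made up for illustration. Note that when a stray [ sits inside a valid outer bracket, the opener reported as unclosed is the outermost one, so the result is a starting point for the search rather than the exact location.

```python
def find_unbalanced_brackets(chunks):
    """Scan JSON text and return (unclosed_openers, stray_closers).

    `chunks` is any iterable of strings, e.g. an open text file (which
    yields lines). Each entry is (bracket, line, column), both 1-based.
    Brackets inside string literals are ignored.
    """
    closers = {"]": "[", "}": "{"}
    open_stack = []      # openers seen so far that are not yet closed
    stray_closers = []   # closers that did not match the innermost opener
    in_string = False    # currently inside a "..." literal
    escaped = False      # previous character was a backslash in a string
    line, col = 1, 0
    for chunk in chunks:
        for ch in chunk:
            if ch == "\n":
                line += 1
                col = 0
            else:
                col += 1
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch in "[{":
                open_stack.append((ch, line, col))
            elif ch in closers:
                if open_stack and open_stack[-1][0] == closers[ch]:
                    open_stack.pop()
                else:
                    stray_closers.append((ch, line, col))
    return open_stack, stray_closers


# Hypothetical sample: one extra "[" at the top, and a "]" inside a string.
sample = ['[\n', '[\n', '{"a": "x]"},\n', '{"b": 2}\n', ']\n']
unclosed, stray = find_unbalanced_brackets(sample)
print("unclosed openers:", unclosed)
print("stray closers:", stray)
```

For a real file, the same function can be given an open file object, e.g. open("file.json", encoding="utf-8"), so the file is processed line by line instead of being loaded at once. A pure-Python character loop over about 1 GB will likely take several minutes.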

LyxUser12345
  • You can use an online JSON validator or formatter to check whether the JSON is valid, but since your file is very large, I think it will take some time to process. – Pratik Oct 16 '19 at 08:06
  • Is there an offline tool that does the same? – LyxUser12345 Oct 16 '19 at 08:11
  • I recommend trying a JSON validator tool. I only know online tools for smaller files, but there are probably some good options out there. – powerPixie Oct 16 '19 at 08:12
  • Most IDEs have plugins to format code, for example VS Code. – Pratik Oct 16 '19 at 08:13
  • You can give jq (https://stedolan.github.io/jq/) a try if you're open to running something in the terminal. – fixatd Oct 16 '19 at 08:15
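
As an offline option that needs nothing beyond Python itself, the decoder's own error object can be inspected. This is a minimal sketch; the helper name describe_json_error and the context width are made up for illustration. It relies on the msg, pos, lineno and colno attributes of json.JSONDecodeError and, like json.load, needs the whole text in memory.

```python
import json


def describe_json_error(text, context=40):
    """Return None if `text` is valid JSON, otherwise a short error report.

    The report contains the decoder's message, the line and column, and
    up to `context` characters on either side of the failing position.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - context, 0)
        snippet = text[start:err.pos + context]
        return f"{err.msg} at line {err.lineno}, column {err.colno}: {snippet!r}"
    return None


# Hypothetical inputs: one valid document, one with an unclosed "[".
print(describe_json_error("[1, 2]"))
print(describe_json_error("[[1, 2]"))
```

For a file, the text could be obtained with open("file.json", encoding="utf-8").read(). The snippet shows where the decoder gave up, which for an unclosed bracket is the end of the file rather than the place where the bracket was opened.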

3 Answers

1

You can use any capable IDE or editor, such as PyCharm, Atom, or Sublime Text; each of them has plugins for JSON formatting.

You can also validate the JSON with online tools, but a file of this size will be heavy for them to process.

Hope this helps.

0

You can use this tool to check your JSON before running your code, to find out where the problem is and fix it: https://jsonformatter.curiousconcept.com/

Quang Võ
0

Since the error points at the last line and you are using Python, one good option is to read the last line(s) of the file and print them to see where the problem is.

For that, there is a module you can use, file_read_backwards, which does this efficiently.

For details see this SO answer: https://stackoverflow.com/a/41418158/50003

Gnudiff