9

I am simply trying to read my JSON file in Python. I am in the correct folder when I do so (Downloads), and my file is called 'Books_5.json'. However, when I call the .read() method, I get the error:

OSError: [Errno 22] Invalid argument

This is my code:

import json
config = json.loads(open('Books_5.json').read())

This also raises the same error:

books = open('Books_5.json').read()

If it helps, this is a small snippet of what my data looks like:

{"reviewerID": "A10000012B7CGYKOMPQ4L", "asin": "000100039X", "reviewerName": "Adam", "helpful": [0, 0], "reviewText": "Spiritually and mentally inspiring! A book that allows you to question your morals and will help you discover who you really are!", "overall": 5.0, "summary": "Wonderful!", "unixReviewTime": 1355616000, "reviewTime": "12 16, 2012"}
{"reviewerID": "A2S166WSCFIFP5", "asin": "000100039X", "reviewerName": "adead_poet@hotmail.com \"adead_poet@hotmail.com\"", "helpful": [0, 2], "reviewText": "This is one my must have books. It is a masterpiece of spirituality. I'll be the first to admit, its literary quality isn't much. It is rather simplistically written, but the message behind it is so powerful that you have to read it. It will take you to enlightenment.", "overall": 5.0, "summary": "close to god", "unixReviewTime": 1071100800, "reviewTime": "12 11, 2003"}

I'm using Python 3.6 on macOS.

user45254

3 Answers

15

It appears that this is a bug that occurs when the file is too large (my file was ~10 GB): a single `.read()` call on a file that size fails with this error. Once I used `split` to break the file into pieces of 200k lines each, the `.read()` error went away. This is true even if the file is not in strict JSON format.
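For anyone who would rather do the splitting from Python than with the `split` command-line tool, here is a minimal sketch. It assumes the file holds one record per line and is UTF-8 encoded; the function name `split_by_lines` and the `part_00000.json` naming scheme are hypothetical, chosen only for illustration. It copies the source into numbered part files by iterating over lines, so it never issues one huge `.read()` call.

```python
import os
import tempfile


def split_by_lines(src_path, out_dir, lines_per_file=200_000):
    """Copy src_path into numbered part files of at most lines_per_file lines each."""
    part_paths = []
    out = None
    try:
        with open(src_path, "r", encoding="utf-8") as src:
            # Iterating over the file object reads it in small buffered pieces,
            # so no single read() call ever asks for the whole file.
            for line_number, line in enumerate(src):
                if line_number % lines_per_file == 0:
                    # Start a new part file every lines_per_file lines.
                    if out is not None:
                        out.close()
                    part_path = os.path.join(
                        out_dir, f"part_{len(part_paths):05d}.json"
                    )
                    part_paths.append(part_path)
                    out = open(part_path, "w", encoding="utf-8")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return part_paths


# Small demonstration on a throwaway five-line file (not the real data).
demo_dir = tempfile.mkdtemp()
demo_src = os.path.join(demo_dir, "demo.json")
with open(demo_src, "w", encoding="utf-8") as f:
    for i in range(5):
        f.write('{"n": %d}\n' % i)

demo_parts = split_by_lines(demo_src, demo_dir, lines_per_file=2)
print(len(demo_parts))
```

For the real file, the call would look something like `split_by_lines('Books_5.json', 'parts')`, where `parts` is an existing directory; each part can then be opened and read on its own.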

user45254
  • 2
    I'm facing the same problem while opening an 8 GB JSON file. Could you please share how you broke up the file into 200k lines? I don't see how a JSON file can be broken down into smaller files with `split`. – Box Box Box Box Aug 19 '18 at 12:24
  • Facing the same problem on macOS 10.13.3, and the file size is around 10 GB. – Daniel Aug 20 '18 at 17:32
  • 1
    Alternatively, try loading the JSON as a stream using [ijson](https://pypi.org/project/ijson/) – Alon Eirew Sep 05 '18 at 12:27
0

Your code looks fine; it looks like your JSON data is formatted incorrectly. As others have suggested, it should be in the form `[{},{},...]`. Try the following:

[{"reviewerID": "A10000012B7CGYKOMPQ4L", "asin": "000100039X",
  "reviewerName": "Adam", "helpful": [0, 0],
  "reviewText": "Spiritually and mentally inspiring! A book that allows you to question your morals and will help you discover who you really are!",
  "overall": 5.0, "summary": "Wonderful!",
  "unixReviewTime": 1355616000, "reviewTime": "12 16, 2012"},
 {"reviewerID": "A2S166WSCFIFP5", "asin": "000100039X",
  "reviewerName": "adead_poet@hotmail.com \"adead_poet@hotmail.com\"", "helpful": [0, 2],
  "reviewText": "This is one my must have books. It is a masterpiece of spirituality. I'll be the first to admit, its literary quality isn't much. It is rather simplistically written, but the message behind it is so powerful that you have to read it. It will take you to enlightenment.",
  "overall": 5.0, "summary": "close to god",
  "unixReviewTime": 1071100800, "reviewTime": "12 11, 2003"}]

Your code and this data worked for me on Windows 7 and Python 2.7. That differs from your setup, but it should still be OK.
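If the file keeps one JSON object per line, as the snippet in the question does, another option is to leave the file as it is and parse each line separately instead of rewriting it into `[{},{},...]`. The following is a rough sketch under that assumption, and it also assumes UTF-8 encoding; the helper name `iter_json_lines` is hypothetical, and the two demo records are shortened versions of the ones in the question.

```python
import json
import os
import tempfile


def iter_json_lines(path):
    """Yield one parsed object per non-empty line of the file at path."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Demonstration on a throwaway file holding two shortened sample records.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_reviews.json")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write('{"reviewerID": "A10000012B7CGYKOMPQ4L", "reviewerName": "Adam", "overall": 5.0}\n')
    f.write('{"reviewerID": "A2S166WSCFIFP5", "asin": "000100039X", "overall": 5.0}\n')

records = list(iter_json_lines(demo_path))
print(records[0]["reviewerName"])
```

Because `iter_json_lines` is a generator, a very large file can be processed one record at a time in a `for` loop; wrapping it in `list(...)` as in the demo only makes sense when all the records fit in memory.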

Anddrrw
  • Yes, that does work. Do you happen to know a quick way of converting the data? It is a very large file. I was thinking of inserting a comma after every `}` except the last one, and then adding `[` at the beginning and `]` at the end. What is the Pythonic way, if you know it off the top of your head? Otherwise I will just figure it out. Thank you for your help. – user45254 May 01 '17 at 18:01
  • Are they all on separate lines like in your example? If they are, just read the file one line at a time (see [`readline()`](https://docs.python.org/2/tutorial/inputoutput.html)) and append each line to a list. – Anddrrw May 01 '17 at 18:14
  • I think so, I'll test later. Thanks for your help. – user45254 May 01 '17 at 18:24
  • That did not work either. But it turns out the problem was that the file was too large. See my answer. – user45254 May 02 '17 at 20:50
  • https://meta.stackoverflow.com/questions/277923/are-your-code-works-fine-for-me-answers-acceptable – Jean-François Fabre Feb 09 '18 at 20:39
-1

To read a JSON file, you can use the following example:

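import json

# Open the file and parse its whole contents as one JSON document.
with open('your_data.json') as data_file:
    data = json.load(data_file)

print(data)
print(data[0]['your_key'])  # get a value from the first object by its key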

Also try converting your JSON objects into a list:

[
  {"reviewerID": "A10000012B7CGYKOMPQ4L", ....},
  {"asin": "000100039X", .....}
]
  • 1
    Believe it or not, I still have the same error here: `OSError: [Errno 22] Invalid argument` – user45254 May 01 '17 at 17:44
  • Maybe @DanilaGanchar is right-- if the data is in the form `{} {} {}`, not `[{},{},{}]`, is this the error that results? – user45254 May 01 '17 at 17:45
  • Yes, you need to convert your `json` objects into a list. Then everything should work fine. –  May 01 '17 at 17:53