24

Sample data:

{
   "_id": "OzE5vaa3p7",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "nebCwWd2Fr"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
   "barcode": "8908001921015",
   "isFmcg": true,
   "itemName": "Anil puttu flour 500g",
   "mrp": 58,
   "_created_at": "2016-10-02T13:49:03.281Z",
   "_updated_at": "2017-02-22T08:48:09.548Z"
}

{
   "_id": "ENPCL8ph1p",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "B4nZeUHmVK"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
   "barcode": "8901725181222",
   "isFmcg": true,
   "itemName": "Yippee Magic Masala Noodles, 70 G",
   "mrp": 12,
   "_created_at": "2016-10-02T13:49:03.284Z",
   "_updated_at": "2017-02-22T08:48:09.074Z"
}

I tried:

import pandas as pd
data= pd.read_json('Data.json')

getting error ValueError: Expected object or value

also

import json
with open('gdb.json') as datafile:
    data = json.load(datafile)
retail = pd.DataFrame(data)

error: json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 509)

with open('gdb.json') as datafile:
    for line in datafile:
        data = json.loads(line)
retail = pd.DataFrame(data)

error: json.decoder.JSONDecodeError: Extra data: line 1 column 577 (char 576)

How can I read this JSON into pandas?

pirho
adimohankv
  • Were you able to fix it? – Gaurav Mishra Jun 05 '17 at 21:49
  • No, for the time being I converted it into a .csv file. – adimohankv Jun 07 '17 at 02:25
  • Look at this: https://stackoverflow.com/questions/27046593/parsing-comma-separated-json-from-a-file – Gaurav Mishra Jun 07 '17 at 16:16
  • Do any of the answers below answer your question? If so, please select the best one. – Steven Jan 19 '20 at 16:00
  • I also just had to load it with `json.load()` and then only read it into the `pd.DataFrame`, using pandas directly does not work, and not because I have some formatting issues like in the question, but in general. My json is an official log download from Google Cloud Platform that was filled with the Python logging module, nothing malformed. It is just a list of dictionaries instead of, what I expected at first, a full dictionary. – questionto42 Feb 25 '22 at 13:24
  • set `lines=False` when each line is not a json object – Fibo Kowalsky Feb 26 '23 at 20:07

18 Answers

18

Your JSON is malformed.

ValueError: Expected object or value can occur if you mistyped the file name. Does Data.json exist? I noticed for your other attempts you used gdb.json.

Once you confirm the file name is correct, you have to fix your JSON. What you have now is two disconnected records separated by a space. Lists in JSON must be valid arrays inside square brackets and separated by a comma: [{record1}, {record2}, ...]

One way to then load it in pandas is to put your array under a root element called "data":

{ "data": [ {record1}, {record2}, ... ] }

Your JSON should end up looking like this:

{"data":
    [{
        "_id": "OzE5vaa3p7",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "nebCwWd2Fr"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
        "barcode": "8908001921015",
        "isFmcg": true,
        "itemName": "Anil puttu flour 500g",
        "mrp": 58,
        "_created_at": "2016-10-02T13:49:03.281Z",
        "_updated_at": "2017-02-22T08:48:09.548Z"
    }
    ,
    {
        "_id": "ENPCL8ph1p",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "B4nZeUHmVK"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
        "barcode": "8901725181222",
        "isFmcg": true,
        "itemName": "Yippee Magic Masala Noodles, 70 G",
        "mrp": 12,
        "_created_at": "2016-10-02T13:49:03.284Z",
        "_updated_at": "2017-02-22T08:48:09.074Z"
    }]}

Finally, pandas reads a root "data" element with its split orientation (which normally also carries "columns" and "index" keys), so you have to load it as follows:

df = pd.read_json('gdb.json', orient='split')

df now contains the following data frame:

          _id                                                   categories  isActive                                                     imageUrl        barcode  isFmcg                           itemName  mrp                      _created_at                      _updated_at
0  OzE5vaa3p7  [{'__type': 'Pointer', 'className': 'Category', 'objectI...      True  https://firebasestorage.googleapis.com/v0/b/shopgro-1376...  8908001921015    True              Anil puttu flour 500g   58 2016-10-02 13:49:03.281000+00:00 2017-02-22 08:48:09.548000+00:00
1  ENPCL8ph1p  [{'__type': 'Pointer', 'className': 'Category', 'objectI...      True  https://firebasestorage.googleapis.com/v0/b/kirananearby...  8901725181222    True  Yippee Magic Masala Noodles, 70 G   12 2016-10-02 13:49:03.284000+00:00 2017-02-22 08:48:09.074000+00:00
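
If editing the file by hand is not practical, the back-to-back records can also be split in code. The following is only a minimal sketch, not part of the original answer: it assumes the whole file fits in memory as a string, and parse_concatenated_json is a hypothetical helper name. It relies on the standard library's json.JSONDecoder.raw_decode, which parses one value and reports the index where it stopped.

```python
import json


def parse_concatenated_json(text):
    """Return every top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    end = len(text)
    while True:
        # raw_decode does not accept leading whitespace, so skip it manually
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # Parse one value starting at pos; pos moves to the end of that value
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Shortened stand-in for the two records in the question
sample = '{"_id": "OzE5vaa3p7", "mrp": 58}\n\n{"_id": "ENPCL8ph1p", "mrp": 12}'
records = parse_concatenated_json(sample)
print(len(records))
```

The resulting list of dicts can then be passed to pd.DataFrame(records).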

Steven
16

I got the same error. Read the function documentation and play around with the different parameters.

I solved it by using the one below:

data= pd.read_json('Data.json', lines=True)

You can try out other things like

data= pd.read_json('Data.json', lines=True, orient='records')

data= pd.read_json('Data.json', orient='index')  # orient takes a string such as 'split', 'records', 'index', 'columns' or 'values'
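
For context, lines=True tells pandas to read the file as one JSON object per line, which is why it helps with files that hold several top-level records. A rough standard-library equivalent is sketched below; read_json_lines is a hypothetical helper name, and the sketch assumes every non-empty line is a complete JSON object.

```python
import json


def read_json_lines(text):
    """Parse text that holds one JSON object per line, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Two shortened records, each on its own line
sample = '{"_id": "OzE5vaa3p7", "mrp": 58}\n{"_id": "ENPCL8ph1p", "mrp": 12}\n'
rows = read_json_lines(sample)
print(len(rows))
```

A record that is pretty-printed across several lines, like the sample in the question, is not one object per line, so this approach needs the file to be compacted first.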

rahul
7

I encountered this error message today, and in my case the problem was that the encoding of the text file was UTF-8-BOM instead of UTF-8, which is the default for read_json(). This can be solved by specifying the encoding:

data= pd.read_json('Data.json', encoding = 'utf-8-sig')
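
To check whether a file really starts with a byte order mark before changing the encoding, a small standard-library sketch like the one below can help. decode_json_bytes is a hypothetical helper name, and the bytes literal stands in for the raw content of the file (for example from open(path, 'rb').read()).

```python
import codecs
import json


def decode_json_bytes(raw):
    """Decode UTF-8 bytes (with or without a BOM) and parse them as JSON."""
    # codecs.BOM_UTF8 is the three-byte mark b'\xef\xbb\xbf'
    has_bom = raw.startswith(codecs.BOM_UTF8)
    # 'utf-8-sig' strips the BOM when present and behaves like UTF-8 otherwise
    text = raw.decode("utf-8-sig")
    return has_bom, json.loads(text)


with_bom = codecs.BOM_UTF8 + b'[{"mrp": 58}]'
print(decode_json_bytes(with_bom))
```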
pieterbons
  • I got the same problem too. All the other solutions didn't work for me. Yours did, thanks! – JrmDel Apr 06 '22 at 13:08
6

Make sure the terminal's working directory is the same as the directory that contains the file. (When this error occurred for me in VS Code, it was because the working directory of the VS Code terminal was not the directory of the Python file I wanted to execute.)
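
One way to see this kind of mismatch is to print the working directory and resolve the file name against an explicit folder before reading it. This is a minimal sketch using pathlib; locate_data_file is a hypothetical helper name, not something pandas provides.

```python
from pathlib import Path


def locate_data_file(name, base_dir=None):
    """Return the absolute path of name, resolved against base_dir (default: cwd)."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / name).resolve()
    if not path.is_file():
        # Fail with a clear message instead of a confusing parser error later
        raise FileNotFoundError(
            f"{path} does not exist (working directory: {Path.cwd()})"
        )
    return path


# Relative paths such as 'Data.json' are resolved from this directory
print(Path.cwd())
```

In a script, passing Path(__file__).parent as base_dir resolves the name next to the script itself, regardless of where the terminal was opened.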

Mingming Qiu
4

I faced the same problem. The reason was that the JSON file contained something that doesn't abide by the JSON rules. In my case, I had used single quotes instead of double quotes in one of the values.
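
To find where a file breaks the JSON rules, the standard json module can report the position of the first problem. The snippet below is a small sketch; find_json_error is a hypothetical helper name.

```python
import json


def find_json_error(text):
    """Return (line, column, message) of the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None


# A value wrapped in single quotes is not valid JSON
bad = "{\"itemName\": 'Anil puttu flour 500g'}"
print(find_json_error(bad))
```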

crazysra
2

I don't think this is the problem, as it should be the default (I think), but have you tried adding an 'r' to open the file in read mode?

import json
import pandas as pd

with open('gdb.json', 'r') as datafile:
    data = json.load(datafile)
retail = pd.DataFrame(data)

1

I am not sure I understood your question clearly. Are you just trying to read JSON data?

I just collected your sample data into a list, as shown below

[
  {
   "_id": "OzE5vaa3p7",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "nebCwWd2Fr"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
   "barcode": "8908001921015",
   "isFmcg": true,
   "itemName": "Anil puttu flour 500g",
   "mrp": 58,
   "_created_at": "2016-10-02T13:49:03.281Z",
   "_updated_at": "2017-02-22T08:48:09.548Z"
},
{
   "_id": "ENPCL8ph1p",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "B4nZeUHmVK"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
   "barcode": "8901725181222",
   "isFmcg": true,
   "itemName": "Yippee Magic Masala Noodles, 70 G",
   "mrp": 12,
   "_created_at": "2016-10-02T13:49:03.284Z",
   "_updated_at": "2017-02-22T08:48:09.074Z"
}
]

and ran this code

import pandas as pd
df = pd.read_json('Data.json')
print(df)

Output:-

              _created_at ... mrp
0 2016-10-02 13:49:03.281 ...  58
1 2016-10-02 13:49:03.284 ...  12

[2 rows x 10 columns]
Shakeel
1

Keep your path simple; it makes the data easier to read. For example, put the file on your desktop and pass that path when reading the data. It works.

  • Yes, strange. Even a misspelled filename will cause this error with pd.read_json. Rather than telling you file not found, it gives "ValueError: Expected object or value" – NealWalters Nov 16 '20 at 16:53
1

You can try changing the relative path to an absolute path. For your situation, change

import pandas as pd
data= pd.read_json('Data.json')

to

import pandas as pd
data= pd.read_json('C://Data.json')  # the absolute path as shown in Explorer

I got the same error when I ran the same code, taken from a Jupyter notebook, in PyCharm's Jupyter notebook console.

1

If you try the code below, it will solve the problem:

data_set = pd.read_json(r'json_file_address\file_name.json', lines=True)
10 Rep
1

Another variation: the tips from this thread each failed on their own, but combining them worked for me:

pd.read_json('file.json', lines=True, encoding = 'utf-8-sig')
John Stud
1

If you type in the absolute path of the file and use \ it should work. At least that's how I fixed the issue.

krm73
0

this worked for me: pd.read_json('./dataset/healthtemp.json', typ="series")

0

Everything is OK except for one thing:

in the .json file, put the content below:

{
"a": {
    "_id": "OzE5vaa3p7",
    "categories": [
    {
        "__type": "Pointer",
        "className": "Category",
        "objectId": "nebCwWd2Fr"
    }
    ],
    "isActive": true,
    "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
    "barcode": "8908001921015",
    "isFmcg": true,
    "itemName": "Anil puttu flour 500g",
    "mrp": 58,
    "_created_at": "2016-10-02T13:49:03.281Z",
    "_updated_at": "2017-02-22T08:48:09.548Z"
},
"b": {
    "_id": "ENPCL8ph1p",
    "categories": [
    {
        "__type": "Pointer",
        "className": "Category",
        "objectId": "B4nZeUHmVK"
    }
    ],
    "isActive": true,
    "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
    "barcode": "8901725181222",
    "isFmcg": true,
    "itemName": "Yippee Magic Masala Noodles, 70 G",
    "mrp": 12,
    "_created_at": "2016-10-02T13:49:03.284Z",
    "_updated_at": "2017-02-22T08:48:09.074Z"
}
}

Thank you

0

The problem of ValueError: All arrays must be of the same length, which happens with

df = pd.read_json(r'./filename.json')  # no lines=True

can be solved by changing the line above to the following:

df = pd.read_json(r'./filename.json', lines=True)
Ftagliacarne
0

I just solved this problem by adding a "/" at the beginning of the absolute path.

import pandas as pd    
pd_from_json = pd.read_json("/home/miguel/folder/information.json")
0

Seems like there are a million things that can cause this. In my case, it was that my JSON file started with a byte order mark, denoted by [BOM] [unix] in vim-airline. I don't know what the byte order mark is or when it would be needed. To remove it, in vim, I ran :set nobomb and then saved the file. Then pandas could read it and I was good to go.

Tiago
0

Many times the JSON is in the following format (for those who are still searching for a solution):

Problem


{col1:'val1', col2:'val2'}{col1:'val1', col2:'val2'}{col1:'val1', col2:'val2'}

As you can see we have three issues here:

  1. Keys don't have double quotes
  2. Values are quoted, but with single quotes
  3. The records are not separated by a comma and a newline

We will need to fix three things:

0. Add the square brackets if they are not already there: put [ at the beginning of the JSON and ] at the end, which is just a matter of pressing the Home and End keys on your keyboard.

1. Replace single quotes with double

import re
# either this (simple): replace every unescaped single quote
p = re.compile(r"(?<!\\)'")
data = p.sub('"', data)

# or this - only replaces the quotes that wrap a value
p = re.compile(r"(?<=:)\s*'(.*?)'\s*(?=,|\n|})")
data = p.sub(r'"\1"', data)

This assumes the JSON data is a string stored in the data variable.

2. Provide the double quotes to the keys

data = re.sub(r'(\w+)(?=:)', r'"\1"', data)

3. Give the new line for each record

data = re.sub(r'}\s*{', '},\n{', data)

Done! Just save the file

with open("ABC.json", "w") as file:
    file.write(data)

Load in pandas

df = pd.read_json(r"./ABC.json")

We are done. We have the clean JSON like this:

[
    {"col1":"val1", "col2":"val2"},
    {"col1":"val1", "col2":"val2"},
    {"col1":"val1", "col2":"val2"}
]
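
The steps above can also be combined into one function and checked with json.loads before saving. This is only a sketch: repair_loose_json is a hypothetical helper name, and it assumes the values contain no colons or apostrophes (URLs or timestamps like those in the question would break the key-quoting step).

```python
import json
import re


def repair_loose_json(text):
    """Apply the quote, key, separator and bracket fixes to loosely written JSON."""
    # 1. Replace unescaped single quotes with double quotes
    text = re.sub(r"(?<!\\)'", '"', text)
    # 2. Put double quotes around bare keys (a word directly followed by a colon)
    text = re.sub(r'(\w+)(?=:)', r'"\1"', text)
    # 3. Separate adjacent records with a comma and a newline
    text = re.sub(r'}\s*{', '},\n{', text)
    # 0. Wrap everything in square brackets if that has not been done yet
    text = text.strip()
    if not text.startswith('['):
        text = '[' + text + ']'
    return text


raw = "{col1:'val1', col2:'val2'}{col1:'val3', col2:'val4'}"
fixed = repair_loose_json(raw)
parsed = json.loads(fixed)  # raises json.JSONDecodeError if the repair failed
print(len(parsed))
```

If json.loads accepts the repaired string, it can be written to a file and loaded with pd.read_json as shown above.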
Aayush Shah