-1

I am brand new to using JSON data and fairly new to Python. I am struggling with being able to parse the following JSON data in Python, in order to import the data into a SQL Server database. I already have a program that will import the parsed data into sql server using PYDOBC, however I can't for the life of me figure out how to correctly parse the JSON data into a Python dictionary.

I know there are a number of threads that address this issue, however I was unable to find any examples of the same JSON data structure. Any help would be greatly appreciated as I am completely stuck on this issue. Thank you SO! Below is a cut of the JSON data I am working with:

{
 "data": 
 [
  {
    "name": "Mobile Application",
    "url": "https://www.example-url.com",
    "metric": "users",
    "package": "example_pkg",
    "country": "USA",
    "data": [  
      [ 1396137600000, 5.76 ],
      [ 1396224000000, 5.79 ],
      [ 1396310400000, 6.72 ],
      ....
      [ 1487376000000, 7.15 ]
    ]
  }
],"as_of":"2017-01-22"}

Again, I apologize if this thread is repetitive, however as I mentioned above, I was not able to work out the logic from other threads as I am brand new to using JSON.

Thank you again for any help or advice in regard to this.

import json

with open("C:\\Pathyway\\7Park.json") as json_file:
data = json.load(json_file)
assert data["data"][0]["metric"] == "users"

The above code results with the following error:

Traceback (most recent call last):
  File "JSONpy", line 10, in <module>
    data = json.load(json_file)
  File "C:\json\__init__.py", line 291, in load
    **kw)
  File "C:\json\__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "C:\json\decoder.py", line 367, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 7 column 1 (char 23549 - 146249)
ls101
  • 177
  • 5
  • 17
  • 2
    "*how to correctly parse the JSON data into a Python dictionary.*" -- What have you tried? Have you tried `import json` / `json.load(open('file.json'))` ? – Robᵩ Mar 08 '17 at 18:02
  • I have tried a number of things from json.load(open(file.json)) to JSON_data = [json.loads(line) for line in open(JSON)] – ls101 Mar 08 '17 at 18:06
  • 1
    Was [this](http://stackoverflow.com/questions/2835559/parsing-values-from-a-json-file-using-python) not helpful? If so, provide your code which has a problem and explain which part caused error. – Sangbok Lee Mar 08 '17 at 18:07
  • I have also tried using the ijson package as well. I have been getting a lot of different errors, I think the main issue I'm facing is that I am not writing the Python code correctly in relation to the JSON data structure. – ls101 Mar 08 '17 at 18:08
  • What is your actual error? What doesn't work? – xandermonkey Mar 08 '17 at 18:10
  • 1
    Please [edit] your question to provide a sample program. Please provide a short, complete program that demonstrates the error you see. Please include all output of the program, including the full text of any error message. Please describe what you expect the output to be. See [mcve] for more info. – Robᵩ Mar 08 '17 at 18:11
  • Please do not post screen shots. Please copy-paste the error message into your question. See [mcve] for more info. – Robᵩ Mar 08 '17 at 18:25
  • Your sample JSON data is a more-or-less valid JSON document. Your actual data is not. Is it possible that your actual file is a sequence of JSON documents, one per line? – Robᵩ Mar 08 '17 at 18:27
  • My apologies Rob, I pasted the error into the original thread. And yes, I believe you are correct.. the JSON file I have has six lines. What I pasted in the original question is only one of the six lines. Maybe this is the reason I am having issues. Again, sorry for the stupid questions / amateur mistakes, I really appreciate the assistance. – ls101 Mar 08 '17 at 18:30
  • No worries. If you really do have one JSON document per line, then this question: http://stackoverflow.com/questions/21058935/python-json-loads-shows-valueerror-extra-data might help you. – Robᵩ Mar 08 '17 at 18:35

1 Answers1

1

Assuming the data you've described (less the ... ellipsis) is in a file called j.json, this code parses the JSON document into a Python object:

import json

with open("j.json") as json_file:
    data = json.load(json_file)

assert data["data"][0]["metric"] == "users"

From your error message it seems possible that your file is not a single JSON document, but a sequence of JSON documents separated by newlines. If that is the case, then this code might be more helpful:

import json

with open("j.json") as json_file:
    for line in json_file:
        data = json.loads(line)
        print (data["data"][0]["metric"])
Robᵩ
  • 163,533
  • 20
  • 239
  • 308
  • Rob, thank you so much. This is essentially the last code I tried before posting this thread. The error I receive from this code is the following (I will also edit my original question and add a screen shot of the full error as it appears in the command prompt). Error = "ValueError: Extra data: line 2 column 1 - line 7 column 1 (char 23549 - 146249) – ls101 Mar 08 '17 at 18:19
  • I have attached a screen shot of the exact error message when I run the code you provided. – ls101 Mar 08 '17 at 18:21
  • That worked perfectly, thanks again Rob, that was a huge help. – ls101 Mar 08 '17 at 18:40
  • Rob, one more quick question, if it is not too much trouble. Is it now possible to load each of these JSON files into dictionaries so as to be exported into a SQL Server database? If so, which Python module would be the best to use. Thanks again for your help. – ls101 Mar 08 '17 at 18:53
  • Sorry, I don't know much about exporting to SQL Server databases. But yes, it is possible to load these files (they are not JSON files -- a JSON file would have a single JSON document in it.). To create a single `dict` from such a file, I might do: `for i, line in enumerate(json_file): result_dict[i] = json.loads(line)`. – Robᵩ Mar 08 '17 at 19:40
  • OK thanks again. And sorry, one actually final question. After using your code from above, I found that the only "columns" in the new files are "data" and "as of". How would I get the "name", "metric", and "country" nested fields to be recognized as columns? – ls101 Mar 08 '17 at 20:05
  • Sorry, I don't know anything about SQL Server. I don't know how it defines 'columns.' – Robᵩ Mar 09 '17 at 15:29