
I'm attempting to convert a JSON file to an SQLite or CSV file so that I can manipulate the data with Python. Here is where the data is housed: JSON File.

I found a few converters online, but they couldn't handle the rather large JSON file I was working with. I also tried a Python module called sqlbiter but, like the others, it was never really able to output or convert the file.

I'm not sure where to go from here. If anyone has any recommendations or insights on how to get this data into a database, I'd really appreciate it.

Thanks in advance!

EDIT: I'm not looking for anyone to do it for me, I just need to be pointed in the right direction. Are there other methods I haven't tried that I could learn?

grigs
  • Why not write a script to do it yourself? – Stephen Feb 07 '18 at 14:51
  • I don't know how to write a script that can convert this myself. If you have any literature or can link me to a walkthrough, I could attempt it. – grigs Feb 07 '18 at 14:52
  • I think you are looking for something like `data = json.loads(jsonfile)`. – sobek Feb 07 '18 at 14:56
  • You don't need to convert JSON files to anything to parse them in Python. All you have to do is [load the file contents into a dictionary object](https://stackoverflow.com/questions/20199126/reading-json-from-a-file) and work from there. – BoboDarph Feb 07 '18 at 14:58

2 Answers


You can use the pandas module for this data-processing task as follows:

  1. First, read the JSON file using `with open(...)` and `json.load`.

  2. Second, reshape the data slightly by turning the large dictionary, which has one top-level key per airport, into a list of dictionaries instead.

  3. Third, use some pandas magic to convert that list of dictionaries into a DataFrame with `pd.DataFrame(data=list_of_dicts)`.

  4. Finally, use pandas' `to_csv` method to write the DataFrame to disk as a CSV file.

It would look something like this:

```python
import pandas as pd
import json

# Read the JSON file into one big dictionary keyed by airport
with open('./airports.json.txt', 'r') as f:
    j = json.load(f)

# Turn the dict of dicts into a list of per-airport dictionaries
l = list(j.values())

# Build a DataFrame and write it to disk as CSV
df = pd.DataFrame(data=l)
df.to_csv('./airports.csv', index=False)
```
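Since the question also asks about SQLite, here is a variation of the same idea: a sketch under the same assumptions about the file layout, writing the DataFrame into an SQLite database with pandas' `to_sql` and the standard-library `sqlite3` module (the output path and table name are assumptions):

```python
import json
import sqlite3

import pandas as pd

# Same load and reshape as the CSV example above
with open('./airports.json.txt', 'r') as f:
    j = json.load(f)

df = pd.DataFrame(data=list(j.values()))

# Write the DataFrame into an SQLite table named 'airports'
conn = sqlite3.connect('./airports.sqlite')  # output path is an assumption
df.to_sql('airports', conn, if_exists='replace', index=False)
conn.close()
```

`if_exists='replace'` recreates the table on every run, which is convenient while experimenting.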
O.Suleiman
  • Thank you! This is fantastic. I was working with this: `from pandas.io.json import json_normalize import json import sqlite3 sqlite_file = 'file.sqlite3' conn = sqlite3.connect(sqlite_file) c = conn.cursor() with open('airports.json') as data_file: d = json.load(data_file) df = json_normalize(d) df.to_sql('airports', conn, if_exists="replace") print(df)` but couldn't get the data to present as rows as opposed to columns. Thanks again. – grigs Feb 07 '18 at 15:52
  • No problem, though I think you should've added this code to your original question. I'm not sure json_normalize would work in your case, since you won't be able to access the inner dictionaries with one key: the top level of the JSON file contains multiple keys, which are the 'icao' codes of the airports. – O.Suleiman Feb 08 '18 at 07:32
  • Noted, I'll make sure to add more code to my question next time. And good to know regarding inner dicts. Thanks again – grigs Feb 08 '18 at 14:56

You need to load your JSON file and parse it so that all the fields are available, or load the contents into a dictionary. Then you can use pyodbc to write those fields to a database, or write them to a CSV file with the standard csv module. This is just the general idea, though; you'll need to study Python and work out each step yourself.
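For the CSV route, a minimal sketch, assuming the parsed JSON is a dictionary of per-airport dictionaries as in the other answer, and that every record has the same keys (file names are assumptions):

```python
import csv
import json

# Load the JSON into a dictionary; top-level keys are the airports
with open('airports.json') as f:
    data = json.load(f)

# One row per airport
rows = list(data.values())

# Write a CSV with a header derived from the first record's keys
with open('airports.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
```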

For instance, for writing to the database you could do something like:

```python
for i in range(0, max_len):
    sql_order = "UPDATE MYTABLE SET MYTABLE.MYFIELD ...."
    cursor1.execute(sql_order)
    cursor1.commit()
```
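If the values come from your JSON data, it is generally safer to let the driver substitute them rather than building the SQL string yourself; with pyodbc that looks something like `cursor1.execute("UPDATE MYTABLE SET MYFIELD = ? WHERE ID = ?", value, row_id)` (table and column names here are placeholders).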

J_P