0

I have data formatted in .json file. The end goal is to reformat the data to sqlite table and store into a database for further analysis.

Here is a sample of the data:

{"_id":{"$oid":"60551"},"barcode":"511111019862","category":"Baking","categoryCode":"BAKING","cpg":{"$id":{"$oid":"601ac114be37ce2ead437550"},"$ref":"Cogs"},"name":"test brand @1612366101024","topBrand":false}
{"_id":{"$oid":"601c5460be37ce2ead43755f"},"barcode":"511111519928","brandCode":"STARBUCKS","category":"Beverages","categoryCode":"BEVERAGES","cpg":{"$id":{"$oid":"5332f5fbe4b03c9a25efd0ba"},"$ref":"Cogs"},"name":"Starbucks","topBrand":false}
{"_id":{"$oid":"601ac142be37ce2ead43755d"},"barcode":"511111819905","brandCode":"TEST BRANDCODE @1612366146176","category":"Baking","categoryCode":"BAKING","cpg":{"$id":{"$oid":"601ac142be37ce2ead437559"},"$ref":"Cogs"},"name":"test brand @1612366146176","topBrand":false}
{"_id":{"$oid":"601ac142be37ce2ead43755a"},"barcode":"511111519874","brandCode":"TEST BRANDCODE @1612366146051","category":"Baking","categoryCode":"BAKING","cpg":{"$id":{"$oid":"601ac142be37ce2ead437559"},"$ref":"Cogs"},"name":"test brand @1612366146051","topBrand":false}

Followed by the code:

import pandas as pd
import json
import sqlite3

# Open json file and convert to a list
with open("users.json") as f:
    dat = [json.loads(line.strip()) for line in f]

# create a datafrom from json file
df = pd.DataFrame(dat)

#open database connection
con = sqlite3.connect("fetch_rewards.db")
c = con.cursor()

df.to_sql("users", con)

c.close()

The error I am getting:

Traceback (most recent call last):
  File "C:\Users\mohammed.alabbas\Desktop\sqlite\import_csv.py", line 16, in <module>
    df.to_sql("users", con)
  File "C:\Users\name\AppData\Roaming\Python\Python39\site-packages\pandas\core\generic.py", line 2605, in to_sql
    sql.to_sql(
  File "C:\Users\name\AppData\Roaming\Python\Python39\site-packages\pandas\io\sql.py", line 589, in to_sql
    pandas_sql.to_sql(
  File "C:\Users\name\AppData\Roaming\Python\Python39\site-packages\pandas\io\sql.py", line 1828, in to_sql
    table.insert(chunksize, method)
  File "C:\Users\mname\AppData\Roaming\Python\Python39\site-packages\pandas\io\sql.py", line 830, in insert
    exec_insert(conn, keys, chunk_iter)
  File "C:\Users\mname\AppData\Roaming\Python\Python39\site-packages\pandas\io\sql.py", line 1555, in _execute_insert
    conn.executemany(self.insert_statement(num_rows=1), data_list)
sqlite3.InterfaceError: Error binding parameter 1 - probably unsupported type.

Thanks in advance

Barmar
  • 741,623
  • 53
  • 500
  • 612
malaba
  • 13
  • 3
  • Your dictionaries have nested dictionaries, and `df.to_sql()` is trying to store a dictionary as a table column value. This isn't supported. – Barmar Oct 01 '21 at 14:33
  • 1
    You need to flatten the dictionary before converting it to a dataframe row. – Barmar Oct 01 '21 at 14:33
  • @Barmar do you have any recommendation on how to approach this? – malaba Oct 01 '21 at 16:43
  • Maybe this: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame – Barmar Oct 01 '21 at 16:44

0 Answers0