I am accessing an HTTP REST API using `requests`. The API returns JSON, which I would like to parse and then commit to an `sqlite3` database. One JSON response might look roughly like the following:
    {
        "entry1": {
            "key1": "value1",
            "key2": "value2"
        },
        "entry2": {
            "key2": "value2"
        }
    }
Another may look like the following:
    {
        "entry3": {
            "key1": "value1",
            "key2": "value2",
            "key3": "value3"
        }
    }
I would like to save this to a table as follows:
    +--------+--------+--------+--------+
    | entry  | key1   | key2   | key3   |
    +--------+--------+--------+--------+
    | entry1 | value1 | value2 | NULL   |
    +--------+--------+--------+--------+
    | entry2 | NULL   | value2 | NULL   |
    +--------+--------+--------+--------+
    | entry3 | value1 | value2 | value3 |
    +--------+--------+--------+--------+
In reality I am looping over thousands of entries. My issue is as follows:
I do not know all the possible keys `key1`, `key2`, `key3`, ..., `key_n` that can occur, as the API does not document them. It is also too costly to make one pass over the entire dataset just to find out which columns could possibly occur, because every request to the API is charged.
There is one obvious solution: for each request, check whether the columns needed by its entries already exist in my table, and add columns to the table as needed, roughly like the sketch below.
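A minimal sketch of what I mean, using only the standard-library `sqlite3` module (the table name `entries` and the database file are hypothetical; `PRAGMA table_info` lists the columns that already exist, and since identifiers cannot be bound as SQL parameters, the key names are interpolated directly and therefore must come from a trusted source):

    import sqlite3

    conn = sqlite3.connect("api.db")  # hypothetical database file
    conn.execute("CREATE TABLE IF NOT EXISTS entries (entry TEXT PRIMARY KEY)")

    def insert_entries(conn, data):
        """Insert one parsed JSON response, adding missing columns first.

        `data` is the parsed dict, e.g. {"entry1": {"key1": "value1"}, ...}.
        """
        # PRAGMA table_info yields one row per column; the name is field 1.
        existing = {row[1] for row in conn.execute("PRAGMA table_info(entries)")}
        for entry, fields in data.items():
            for key in fields:
                if key not in existing:
                    # Column names cannot be parameterized with "?", so the
                    # key is quoted and interpolated directly.
                    conn.execute(f'ALTER TABLE entries ADD COLUMN "{key}"')
                    existing.add(key)
            cols = ["entry", *fields]
            col_sql = ", ".join(f'"{c}"' for c in cols)
            placeholders = ", ".join("?" for _ in cols)
            conn.execute(
                f"INSERT INTO entries ({col_sql}) VALUES ({placeholders})",
                (entry, *fields.values()),
            )
        conn.commit()

Each response would then be committed with something like `insert_entries(conn, response.json())`.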
My question is twofold:
- Should I perform the above check in my Python code on a `pandas.DataFrame`, or should I commit directly to my `sqlite3` database using SQL commands, thereby avoiding the need to hold the `DataFrame` in RAM at runtime? Is it bad practice to add columns like this (i.e. using an `ALTER TABLE` statement) in `sqlite3`?
- Have I overlooked other (potentially better) ways to go about this?
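For reference, the `pandas` variant of the first option that I have in mind would look roughly like this (a sketch under the same assumptions as above; `pandas.concat` aligns the columns of the two frames by name and fills the gaps with NaN, which `to_sql` writes out as NULL):

    import sqlite3
    import pandas as pd

    conn = sqlite3.connect("api.db")  # hypothetical database file

    def append_response(conn, data):
        """Merge one parsed JSON response into the `entries` table."""
        new = pd.DataFrame.from_dict(data, orient="index")
        exists = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' AND name='entries'"
        ).fetchone()
        old = (pd.read_sql("SELECT * FROM entries", conn, index_col="entry")
               if exists else pd.DataFrame())
        # concat aligns columns by name; entries missing a key get NaN.
        merged = pd.concat([old, new])
        merged.to_sql("entries", conn, if_exists="replace", index_label="entry")

This works, but it keeps the whole accumulated table in RAM and rewrites it on every request, which is what prompts my question.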