I have the following JSON file:
{
"date": "2021-07-08",
"info": "success",
"data": [
{
"customerid": "1",
"site1": ["2021-07-08T20:56:16+02:00"]
},
{
"customerid": "2",
"isvip": true,
"site1": ["2021-07-08T18:53:53+02:00"]
},
{
"customerid": "3",
"isvip": true,
"site1": ["2021-07-08T13:29:04+02:00", "2021-07-08T13:29:02+02:00"],
"site2": [ "2021-07-08T05:15:12+02:00", "2021-07-08T05:15:08+02:00"]
}
]
}
And I would like to transform it into a SQL table for our database like this:
| customerid | isvip | siteid | timestamp |
| ---------- | ----------- | ------ | ------------------- |
| 1 | false | site1 | 2021-07-08 20:56:16 |
| 2 | true | site1 | 2021-07-08 18:53:53 |
| 3 | true | site1 | 2021-07-08 13:29:04 |
| 3 | true | site1 | 2021-07-08 13:29:02 |
| 3 | true | site2 | 2021-07-08 05:15:12 |
| 3 | true | site2 | 2021-07-08 05:15:08 |
There are several problems that I have to face:
1 - I don't care about the date and info in the beginning. They can be eliminated.
2 - The info about isvip only shows up if the value is true. Otherwise, it's false.
3 - Several sites can be visited. But they only appear if they actually have been visited. For example, customer 3 is the only one who visited site2.
4 - (optional) the isvip-field is metadata about the ids which could be put in a separate table. But it's just a binary field, so the use of storage capacity is not a big deal yet.
I haven't worked with JSON files that much yet. I would like to do the code either in SQL or Python (or both).
Thank you guys for your help!!!