0

I would like to parse some JSON into a SQL INSERT however, due to different pieces of the data lying at different levels its making it difficult to get it all.

Here is the JSON file:

{
"company_number":"01234567",
"data":{
    "address":{
        "address_line_1":"Fake street",
        "country":"England",
        "locality":"TRIAL",
        "postal_code":"### ###",
        "premises":"#"
    },
    "name":"Testing Testing",
    "name_elements":{
        "forename":"Test",
        "middle_name":"Testing",
        "surname":"Testing2",
        "title":"NEW"
    },
    "Nature_of_address":"Agriculture",
    "natures_of_control":["ownership-of-shares-50-to-75-percent"],
    "notified_on":"2016-04-06"
    }
} 

This is the Python i have so far which kinda works:

import os
import json

TABLE_NAME = "Holdingtbl"
sqlstatement = ''

test = []

#with open (Basenames,'r') as f:
#with open (json.dumps(x.to_table(), indent=4),'r') as f:
#    jsondata = json.loads(f.read())

#    print(jsondata)

for line in open('C:\\Users\\[Name]\\Desktop\\Test.json','r',encoding="utf-8"):
    test.append(json.loads(line))
    #print(test)

for json in test:
    keylist = "("
    valuelist = "("
    firstPair = True
    for key, value in json.items():
        if not firstPair:
            keylist += ", "
            valuelist += ", "
        firstPair = False
        keylist += key
        if type(value) in (str, "utf-8"):
            valuelist += "'" + value + "'"
        else:
            valuelist += str(value)
    keylist += ")"
    valuelist += ")"

    sqlstatement += "INSERT INTO " + TABLE_NAME + " " + keylist + " VALUES " + valuelist + "\n"

print(sqlstatement)

And the result is:

INSERT INTO Holdingtbl (company_number, data) VALUES ('01234567', {'address': {'address_line_1': 'Fake street, 'country': 'England', 'locality': 'TRIAL', 'postal_code': '### ###', 'premises': ‘#'}, 'name': 'Testing Testing', 'name_elements': {'forename': 'Test', 'middle_name': 'Testing', 'surname': 'Testing2', 'title': 'New'},'Nature_of_address': 'Agriculture' ,'natures of control’: ['ownership-of-shares-50-to-75-percent'], 'notified_on': '2016-04-06'})

Which is kinda right but i would need the the insert look like:

INSERT INTO Holdingtbl (Company_number, address_line_1, country, locality, postal_code, premises, name, forename, middle_name, surname, title, nature_of_address, natures_of_control, notified_on) VALUES('01234567', 'Fake street', 'England', 'Trial', '### ###', '#', 'Testing Testing', 'Test', 'Testing', 'Testing2', 'New', 'Agriculture', 'ownership-of-shares-50-to-75-percent', '2016-01-06');

I have seen some examples: Example1

But i havent been able to get it to work for me.

@Skaul05

What i think you mean is this: But its causing an error: AttributeError: 'dict' object has no attribute 'iteritems'

import os
import json

TABLE_NAME = "Holdingtbl"
sqlstatement = ''

test = []
value = []
key = []
def get_value(json):
    for i,j in json.iteritems():
        if type(j) in (str, "utf-8"):
            key.append(i)
            value.append(j)
        elif type(j) == list:
            key.append(i)
            value.append(j[0])
        elif type(j) == dict:
            get_value(j)

#with open (Basenames,'r') as f:
#with open (json.dumps(x.to_table(), indent=4),'r') as f:
#    jsondata = json.loads(f.read())

#    print(jsondata)

for line in open(''C:\\Users\\[Name]\\Desktop\\Test.json'','r',encoding="utf-8"):
    test.append(json.loads(line))
    #print(test)

for json in test:
    get_value(json)

    sqlstatement += "INSERT INTO " + TABLE_NAME + " " + keylist + " VALUES " + valuelist + "\n"

print(sqlstatement)

this is the output if I loop over more than one json line:

        INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30');
        INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30');
    INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on,company_number,address_line_1,address_line_2,locality,postal_code,region,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30','01605766','Wates House','Station Approach','Leatherhead','KT22 7SW','Surrey','England','16c4017adbe1919073fef3fad7535841299de14c','individual-person-with-significant-control','/company/01605766/persons-with-significant-control/individual/Z6tt_4IDQGaS5MU_wOCCCOj4zyY','Mr Jeremy Wyckham Wright','Jeremy','Wyckham','Wright','Mr','British','significant-influence-or-control','2016-06-29');
        INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30');

    INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on,company_number,address_line_1,address_line_2,locality,postal_code,region,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30','01605766','Wates House','Station Approach','Leatherhead','KT227SW','Surrey','England','16c4017adbe1919073fef3fad7535841299de14c','individual-person-with-significant-control','/company/01605766/persons-with-significant-control/individual/Z6tt_4IDQGaS5MU_wOCCCOj4zyY','Mr Jeremy Wyckham Wright','Jeremy','Wyckham','Wright','Mr','British','significant-influence-or-control','2016-06-29');
INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on,company_number,address_line_1,address_line_2,locality,postal_code,region,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on,company_number,address_line_1,country,locality,postal_code,premises,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30','01605766','Wates House','Station Approach','Leatherhead','KT22 7SW','Surrey','England','16c4017adbe1919073fef3fad7535841299de14c','individual-person-with-significant-control','/company/01605766/persons-with-significant-control/individual/Z6tt_4IDQGaS5MU_wOCCCOj4zyY','Mr Jeremy Wyckham Wright','Jeremy','Wyckham','Wright','Mr','British','significant-influence-or-control','2016-06-29','10259080','College Avenue','United Kingdom','Oldham','OL8 4DX','54','United Kingdom','231a5a1071ef608f60a79a0fce2c3e21e29f00b9','individual-person-with-significant-control','/company/10259080/persons-with-significant-control/individual/6Fkld0qaXyjm4FhJMsmUFKAJU4I','Dr Muhammad Fayaz Khan','Muhammad','Fayaz','Khan','Dr','British','ownership-of-shares-25-to-50-percent','2016-06-30');

I was expecting:

INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,country,locality,postal_code,premises,region,ceased_on,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('08581893','High Street','Wendover','England','Aylesbury','HP22 6EA','14a','Buckinghamshire','2016-07-01','England','d55168c49f85ab1ef38a12ed76238d68f79f5a01','individual-person-with-significant-control','/company/08581893/persons-with-significant-control/individual/-6HQmkhiomEBXJI2rgHccU67fpM','Mr Quentin Colin Maxwell Solt','Quentin','Colin Maxwell','Solt','Mr','British','ownership-of-shares-25-to-50-percent','2016-06-30');
INSERT INTO Holdingtbl (company_number,address_line_1,address_line_2,locality,postal_code,region,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('01605766','Wates House','Station Approach','Leatherhead','KT22 7SW','Surrey','England','16c4017adbe1919073fef3fad7535841299de14c','individual-person-with-significant-control','/company/01605766/persons-with-significant-control/individual/Z6tt_4IDQGaS5MU_wOCCCOj4zyY','Mr Jeremy Wyckham Wright','Jeremy','Wyckham','Wright','Mr','British','significant-influence-or-control','2016-06-29');
INSERT INTO Holdingtbl (company_number,address_line_1,country,locality,postal_code,premises,country_of_residence,etag,kind,self,name,forename,middle_name,surname,title,nationality,natures_of_control,notified_on) VALUES ('10259080','College Avenue','United Kingdom','Oldham','OL8 4DX','54','United Kingdom','231a5a1071ef608f60a79a0fce2c3e21e29f00b9','individual-person-with-significant-control','/company/10259080/persons-with-significant-control/individual/6Fkld0qaXyjm4FhJMsmUFKAJU4I','Dr Muhammad Fayaz Khan','Muhammad','Fayaz','Khan','Dr','British','ownership-of-shares-25-to-50-percent','2016-06-30');
Besarion
  • 139
  • 11
  • If it helps the structure of the file shouldn't change. I get a huge list of JSON lines which i need to iterate over to create insert statements for. – Besarion Nov 07 '19 at 07:00
  • Am I correct that you want to flatten the json the get individual fields into the database? If so then you need to read the json as dict, parse the fields that you require to enter into the database by creating the SQL as you mentioned that you would need. You cannot feed list and dicts to the SQL insert. – Bruno Vermeulen Nov 07 '19 at 07:10
  • Yes i want to flatten the Json file. Is that something like: for line in open('C:\\Users\\[Name]\\Desktop\\Test.json','r',encoding="utf-8"): test.append(json.loads(line)) print test['data'] print test['data'][1]['address'][0]['Address_line_1'] – Besarion Nov 07 '19 at 07:18

1 Answers1

2

Recursion could be the best-suited solution for this situation.

Following function will give you list of keys and values for your sql statement upon passing the suitable JSON.

value = []
key = []
def get_value(json):
    for i,j in json.items():
        if type(j) in (str, "utf-8"):
            key.append(i)
            value.append(j)
        elif type(j) == list:
            key.append(i)
            value.append(j[0])
        elif type(j) == dict:
            get_value(j)

You can then pass it straight into your sql statement

sqlstatement +=  "INSERT INTO " + TABLE_NAME + " " + "(" + ",".join(key) + ")" + " VALUES " + "(" + ",".join(value) + ")" + ";"

Output:

INSERT INTO Holdingtbl (Nature_of_address,name,natures_of_control,surname,title,middle_name,forename,notified_on,premises,country,locality,postal_code,address_line_1,company_number) VALUES (Agriculture,Testing Testing,ownership-of-shares-50-to-75-percent,Testing2,NEW,Testing,Test,2016-04-06,#,England,TRIAL,### ###,Fake street,01234567);

Hope this answers your question!!!

skaul05
  • 2,154
  • 3
  • 16
  • 26
  • What would i make json equal to? If i try get_value(json.loads(line)) or get_value(test) i get the following errors: AttributeError: 'list' object has no attribute 'iteritems', AttributeError: 'dict' object has no attribute 'iteritems' – Besarion Nov 07 '19 at 07:35
  • okay @Besarion iterate over the list and pass each item into it. So the JSON u mentioned in your question pass that into get_value. – skaul05 Nov 07 '19 at 07:36
  • Do you mean iterate at this point: for json in test or do you mean iterate at: for key, value in json.items(): Apologise in advance my python isnt great. – Besarion Nov 07 '19 at 08:39
  • for json in test @Besarion and pass json into function – skaul05 Nov 07 '19 at 09:11
  • Like the above? I added to the original bit i posted above – Besarion Nov 07 '19 at 09:18
  • Yes exactly @Besarion – skaul05 Nov 07 '19 at 09:19
  • @Besarion I think you are running your code in Python 3. I have updated my code. You just need to replace `iteritems()` with `items()` – skaul05 Nov 07 '19 at 09:22
  • That has run now, yes it was python 3 sorry I didnt make that clear however, the output is still: INSERT INTO Holdingtbl (company_number, data) – Besarion Nov 07 '19 at 09:25
  • Thank you I am a div some times. That you for you time. – Besarion Nov 07 '19 at 09:27
  • So what i think the issue is that its to do with the sqlstatement +=. If i print key and value they line up nicely and it all looks great its only when we get to the sqlstatement it all becomes mangled and so maybe something to do with the .join? – Besarion Nov 07 '19 at 13:28