I am looping through different text files, loading the text into dictionaries of dictionaries, and afterwards connecting to a database and inserting those texts into a table.
Dictionaries of dictionaries because every file contains multiple text blocks, each with different pieces of text that I need (name, type, content and so on). And because I need the meta information kept in order, I store the data on those two levels: file level and text block level.
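For illustration, all_data roughly has this shape (the file names and block keys here are just made-up placeholders):

all_data = {
    'some_file.txt': {                 # file level
        'block_1': {                   # text block level
            'query_name': '...',
            'type': '...',
            'descr_de': '...',
            'content': '...',
        },
        # more text blocks ...
    },
    # more files ...
}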
This is my code - first a string to prepare the DB INSERT:
table_q = 'zz_querys'
query_id = 'query_name'
type_col = 'query_type'
descr_col = 'query_descr'
code_col = 'pure_code'
insertQuery = 'INSERT INTO [' + table_q + '] ([' + query_id + '], ' \
              '[' + type_col + '], ' \
              '[' + descr_col + '], ' \
              '[' + code_col + '])' \
              '\nVALUES (?, ?, ?, ?)'
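which produces this statement:

INSERT INTO [zz_querys] ([query_name], [query_type], [query_descr], [pure_code])
VALUES (?, ?, ?, ?)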
and then the connection to the DB:
with conn:
    cursor = conn.cursor()
    for file, text_block in all_data.items():
        file = os.path.splitext(file)[0]          # file name without extension
        for subtext in text_block:
            cat = text_block[subtext]
            qName = cat['query_name']
            qType = cat['type']
            qDescr = cat['descr_de']
            qCont = cat['content']
            # one INSERT round trip per text block
            cursor.execute(insertQuery, (qName, qType, qDescr, qCont))
Performance-wise I feel this is not efficient: I execute the SQL INSERT on every pass through the loop, i.e. once per row. But since I don't come from the technical side originally, I'm not sure whether there is a way that is a lot more efficient.
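One thing I came across is cursor.executemany, where I would collect all rows first and send them in a single call. I imagine it would look roughly like this (untested sketch; the fast_executemany flag only exists if the driver is pyodbc, which is an assumption on my part):

# Untested sketch: collect all rows first, then send them in one executemany call
rows = []
for file, text_block in all_data.items():
    for subtext in text_block:
        cat = text_block[subtext]
        rows.append((cat['query_name'], cat['type'], cat['descr_de'], cat['content']))

with conn:
    cursor = conn.cursor()
    cursor.fast_executemany = True   # pyodbc-specific; remove for other drivers
    cursor.executemany(insertQuery, rows)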
My alternative idea was to load my nested dictionary results into a pandas DataFrame (or save the results into a DataFrame from the beginning) and then insert the whole DataFrame into the database at once. I haven't done this in pandas before, but I assume it should be possible.
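Roughly what I have in mind (untested sketch; as far as I know DataFrame.to_sql needs an SQLAlchemy engine rather than a plain DBAPI connection, and the connection string below is just a placeholder):

import pandas as pd
from sqlalchemy import create_engine

# Flatten the nested dict into one record per text block
records = [
    {
        'query_name': cat['query_name'],
        'query_type': cat['type'],
        'query_descr': cat['descr_de'],
        'pure_code': cat['content'],
    }
    for text_block in all_data.values()
    for cat in text_block.values()
]
df = pd.DataFrame(records)

engine = create_engine('mssql+pyodbc://...')   # placeholder, assuming SQL Server via pyodbc
df.to_sql('zz_querys', engine, if_exists='append', index=False)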
Would this make a huge difference performance-wise? Or what else would be the most efficient way to do this?