I need to insert about 700k items into a table, using a psycopg2 connection pool. If I insert 1000 rows at a time, a lot of the data in my database ends up distorted and not everything gets inserted. If I insert 100 at a time, it still does not insert the entire list: the row count always ends up at a lower number that is divisible by 100, as if the final call after the loop (the last line of the second snippet) never runs. Is there a better way to do this?
from contextlib import contextmanager

from psycopg2 import pool
from psycopg2.extras import execute_batch


def initialise(**kwargs):
    con_pool = pool.SimpleConnectionPool(1, 100, **kwargs)
    return con_pool


def get_connection(con_pool):
    return con_pool.getconn()


@contextmanager
def get_cursor(con_pool):
    try:
        con = get_connection(con_pool)
        yield con.cursor()
    except Exception as e:
        logger_local.warning(f'context manager for sql_funcs error: {e}')
        print(e)
    finally:
        con.commit()
        con_pool.putconn(con)


def insert_data_bulk(con_pool, insert, data, counter=0):
    try:
        with get_cursor(con_pool) as cursor:
            execute_batch(cursor, insert, data)
        counter += 1
        return counter
    except Exception as e:
        print('insert_data_bulk error: ', e)
        return counter
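For reference, this is the shape of execute_batch call I am going for; the table and column names below are placeholders rather than my real schema:

from psycopg2.extras import execute_batch

# Placeholder table/columns, just to illustrate the shape of the call.
# execute_batch wants one %s per column and a list of row tuples.
insert = "INSERT INTO items (id, name) VALUES (%s, %s)"
rows = [(1, "a"), (2, "b"), (3, "c")]

with get_cursor(con_pool) as cursor:
    # page_size is how many statements are grouped per round trip (default 100)
    execute_batch(cursor, insert, rows, page_size=100)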
This is called from another module; I clear data_bulk every 100 items, since it does not work if I pass the whole list in at once. I have tried both execute_batch and execute_values, and both misbehave.
data_bulk = []
for item in txt_data_lines:
    # parse the line here and append the resulting tuple to data_bulk
    if len(data_bulk) == 100:
        sql_funcs.insert_data_bulk(index_pool, insert, data_bulk)
        data_bulk.clear()
# flush the remainder (fewer than 100 items) after the loop
sql_funcs.insert_data_bulk(index_pool, insert, data_bulk)
To add to this: when I use execute_values, the get_cursor context manager logs the error "not all arguments converted during string formatting".
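In case my insert string is the culprit: as far as I understand, execute_values expects a single %s placeholder where the whole VALUES list goes, unlike execute_batch which takes one %s per column. A minimal sketch of what I mean, again with placeholder names:

from psycopg2.extras import execute_values

# Placeholder schema again. Note the single %s: execute_values builds the
# whole "VALUES (...), (...), ..." list and substitutes it there.
insert_values = "INSERT INTO items (id, name) VALUES %s"
rows = [(1, "a"), (2, "b"), (3, "c")]

with get_cursor(con_pool) as cursor:
    execute_values(cursor, insert_values, rows, page_size=100)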