I need to execute a SELECT statement hundreds of thousands of times. I have a list of 700,000 unique numbers in an Excel sheet. I am using openpyxl to iterate through the sheet, executing a SELECT statement with the unique number from each row and writing the results back to Excel.
Currently this takes roughly 50 seconds per 1,000 rows, so at that rate it will take almost 10 hours to finish. There has to be a better way.
import pyodbc
import openpyxl
import datetime
wb = openpyxl.load_workbook('doc.xlsx')
ws = wb.active
conn = pyodbc.connect('Driver=driver_name;'
                      r'Server=192.168.1.111\Server,1433;'
                      'pool_pre_ping=True;'
                      'pool_recycle=3600;'
                      'UID=user;'
                      'PWD=pass;',
                      timeout=1)
cursor = conn.cursor()
now = datetime.datetime.now()
for row in range(2, 704246):
    results = ''
    number = ws.cell(row=row, column=1).value
    sql = f"SELECT col1, col2 FROM database WHERE col3 = '{number}'"
    matches = cursor.execute(sql).fetchall()
    # join every (col1, col2) match into one "col1 & col2, col1 & col2, ..." string
    for x, y in enumerate(matches):
        result = str(y[0]) + ' & ' + str(y[1])
        if x == 0:
            results = result
        else:
            results = results + f', {result}'
    if row % 1000 == 0:
        print(f'Row = {row}, time - {datetime.datetime.now() - now}')
    ws.cell(row=row, column=2).value = results

wb.save('doc.xlsx')
print('DONE')
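
One direction I've been considering is cutting out the per-row round trips by batching the numbers into chunks and running a single IN query per chunk, then matching results back to rows in Python. Below is a rough, untested sketch of what I mean (the 1,000 chunk size is arbitrary but stays under SQL Server's ~2,100 parameter limit, and I'm assuming the parameter values compare cleanly against col3):

from collections import defaultdict

CHUNK = 1000  # arbitrary batch size, well under SQL Server's ~2,100 parameter cap

# read all lookup numbers once instead of hitting the sheet inside the query loop
numbers = [ws.cell(row=r, column=1).value for r in range(2, 704246)]

lookup = defaultdict(list)
for start in range(0, len(numbers), CHUNK):
    chunk = numbers[start:start + CHUNK]
    placeholders = ','.join('?' * len(chunk))  # builds "?,?,?,..." for this chunk
    sql = f"SELECT col3, col1, col2 FROM database WHERE col3 IN ({placeholders})"
    for col3, col1, col2 in cursor.execute(sql, chunk).fetchall():
        # if col3 is varchar and the sheet holds ints, both keys may need str()
        lookup[col3].append(f'{col1} & {col2}')

# write the collected matches back in one pass, then save once at the end
for r, number in enumerate(numbers, start=2):
    ws.cell(row=r, column=2).value = ', '.join(lookup[number])

wb.save('doc.xlsx')

The idea is one query per 1,000 numbers instead of one per row, and a single workbook save at the end. Would this be the right approach, or is there a better way to handle this many lookups?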