I'm attempting to extract data from each row of an excel sheet when the number I am testing for each row matches the predefined ranges I have created. The code works, but moves at a crawl. I'm attempting to iterate over a sheet that contains 200,000 rows of data, and noticed that when I break the sheet down into smaller segments the program executes faster, but still not as fast as I would like. I can't quite figure out why that is either.
Any recommendations on how I might get this to speed up?
import xlsxwriter
import openpyxl
import os
from tqdm import tqdm
os.chdir(r'C:\Users\JTNeumay\Desktop\test folder')
xlfile = 'test.xlsx'
wb = openpyxl.load_workbook(xlfile, read_only=True)
sheet = wb['Sheet1']
newbook = xlsxwriter.Workbook('Commercial.xlsx')
newbook2 = xlsxwriter.Workbook('Industrial.xlsx')
newsheet = newbook.add_worksheet()
newsheet2 = newbook2.add_worksheet()
i = 1
j = 1
for row in tqdm(range(1, sheet.max_row + 1)):
check = sheet.cell(row=row, column=11).value
if check is not None:
if (220000 <= check <= 221310 or 221320 <= check <= 221330 or 237200 <= check <= 237210 or 334611 <= check
<= 334612 or 420000 <= check <= 493190 or 511210 <= check <= 512210 or 512240 <= check <= 541310 or
541330 <= check <= 541340 or 541370 <= check <= 541700 or 541720 <= check <= 562900 or 562920 <= check
<= 811430 or 812000 <= check <= 983000):
for column in range(1, sheet.max_column + 1):
newsheet.write(i, column, sheet.cell(row=row, column=column).value)
i += 1
elif (210000 <= check <= 213115 or 230000 <= check <= 237130 or 237300 <= check <= 334610 or 334613 <=
check <= 339999 or 510000 <= check <= 511200 or 512220 <= check <= 512230 or check == 541320 or
check == 541360 or check == 541710 or check == 562910 or check == 811490):
for column in range(1, sheet.max_column + 1):
newsheet2.write(i, column, sheet.cell(row=row, column=column).value)
j += 1
else:
pass
else:
pass
newbook.close()
newbook2.close()
Note that I also used a method in which I created two lists with the ranges in the if statements, and found that that method took much longer.