RESOLUTION: It seems Qiskit does something odd when you re-use circuits that makes running them take longer and longer. The solution was to pull
for idx, image in enumerate(this_data):
    np.random.seed(42)
    quantum_circuits = [generate_random_circuit(depth=10, num_qubits=4, prob_appl_single=0.3, prob_appl_multi=0.7) for _ in range(2)]
into the for loop, for anyone interested.
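For anyone hitting the same thing: the most likely mechanism (consistent with the profiling at the bottom of this post, though I haven't dug further) is that qc.initialize() appends a fresh instruction to the circuit object on every call, so a re-used circuit keeps growing and takes longer and longer to assemble and run. In context, the fixed outer loop looks roughly like this, with everything else in the script below unchanged:

for idx, image in enumerate(this_data):
    # Re-seed and rebuild the circuits on every image, so each iteration starts
    # from fresh (and, thanks to the fixed seed, identical) circuits instead of
    # re-using ever-growing ones.
    np.random.seed(42)
    quantum_circuits = [generate_random_circuit(depth=10, num_qubits=4,
                                                prob_appl_single=0.3,
                                                prob_appl_multi=0.7)
                        for _ in range(12)]
    image = image.reshape((28, 28))
    outputs = [conv(qc, 2, image) for qc in quantum_circuits]
    print(f"IMAGE COMPLETED: {idx+1} of {args.num_datapoints}")
    img_outputs.append(outputs)
    gc.collect()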
I've seen Why simple Python program slows down as time progresses? and Python Script slowing down as it progresses?, and as you'll see I've tried to apply everything I could learn from them, but that hasn't helped.
I'm trying to perform some computation on an array of data, but it gets significantly slower as time progresses. Specifically:
from quantum_pipe import generate_random_circuit, conv
import pandas as pd
import numpy as np
import gc

train_data = pd.read_csv('./fashion-mnist/fashion-mnist_train.csv')
# args (start_idx, num_datapoints) holds command-line arguments parsed elsewhere in the script
this_data = train_data[args.start_idx:args.start_idx+args.num_datapoints].drop(['label'], axis=1)
this_data = (this_data / 255).round().astype(np.uint8).values
train_data = None  # this is just for the garbage collector to deal with

quantum_circuits = [generate_random_circuit(depth=10, num_qubits=4, prob_appl_single=0.3, prob_appl_multi=0.7) for _ in range(12)]

img_outputs = []
for idx, image in enumerate(this_data):
    image = image.reshape((28, 28))
    outputs = [conv(qc, 2, image) for qc in quantum_circuits]
    print(f"IMAGE COMPLETED: {idx+1} of {args.num_datapoints}")
    img_outputs.append(outputs)
    gc.collect()

np.save(f'./quantum_data/start{args.start_idx}to{args.start_idx+args.num_datapoints}.npy', img_outputs)
and the timing I report is for each element of the outputs list comprehension as it completes. My conv function is defined as
def conv(qc, filter_size, image, mode='threshold'):
    ''' Write the loops to slide our 'filter' over our image '''
    # here filter doesn't actually matter, we just use the flattened binary list as our init
    # might as well hard-code 3x3 filters, can happily handle 2^9 = 512 states
    start = time.time()
    prepped_img = prepare_img(filter_size, image)
    print(prepped_img.shape)
    img_height, img_width = prepped_img.shape
    conv_output = np.zeros(image.shape)
    for down_idx in range(img_height - (filter_size-1)):
        for across_idx in range(img_width - (filter_size-1)):
            section = prepped_img[down_idx:down_idx + filter_size, across_idx:across_idx + filter_size]
            init_arr = encoding_function(section, mode)
            qc.initialize(init_arr, qc.qubits)
            job = execute(qc, BACKEND, shots=500)
            results = job.result()
            counts = results.get_counts(qc)
            output = np.zeros(len(init_arr))
            for key, value in counts.items():
                keyidx = int(key, 2)
                output[keyidx] = value
            output = output / np.sqrt(np.sum(output**2))
            entropy = shannon_entropy(output)
            conv_output[down_idx, across_idx] = entropy
    print(f"filter completed in {time.time()-start} s")
    return conv_output
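With hindsight from the resolution at the top: the qc.initialize(init_arr, qc.qubits) line above mutates the shared circuit on every window, which is most likely what makes re-used circuits slower. A variant that avoids regenerating the circuits each image, sketched here under that assumption, is to run each window on a throwaway copy:

# Sketch of an alternative fix (assumes the slowdown comes from initialize()
# appending to the shared circuit): run each window on a fresh copy so the
# template circuits in quantum_circuits never grow.
qc_run = qc.copy()  # leave the template circuit untouched
qc_run.initialize(init_arr, qc_run.qubits)
job = execute(qc_run, BACKEND, shots=500)
results = job.result()
counts = results.get_counts(qc_run)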
Looking at my terminal output I see something like
filter completed in 5.358206748962402 s
(15, 15)
filter completed in 5.510828733444214 s
(15, 15)
filter completed in 5.9010186195373535 s
(15, 15)
filter completed in 5.649052143096924 s
(15, 15)
filter completed in 5.400753021240234 s
(15, 15)
filter completed in 6.231165647506714 s
(15, 15)
filter completed in 6.202470779418945 s
(15, 15)
filter completed in 5.346660852432251 s
(15, 15)
filter completed in 5.406870365142822 s
(15, 15)
filter completed in 5.373116731643677 s
(15, 15)
filter completed in 6.068220853805542 s
IMAGE COMPLETED: 1 of 150
(15, 15)
filter completed in 9.925729036331177 s
(15, 15)
filter completed in 10.407076597213745 s
(15, 15)
filter completed in 10.152106523513794 s
(15, 15)
filter completed in 10.058736801147461 s
(15, 15)
filter completed in 10.039350509643555 s
(15, 15)
filter completed in 10.171052694320679 s
(15, 15)
filter completed in 10.037626266479492 s
(15, 15)
filter completed in 9.979500770568848 s
(15, 15)
filter completed in 10.973601579666138 s
(15, 15)
filter completed in 10.002075433731079 s
(15, 15)
filter completed in 9.934457540512085 s
(15, 15)
filter completed in 10.39132308959961 s
IMAGE COMPLETED: 2 of 150
(15, 15)
filter completed in 14.923802375793457 s
(15, 15)
filter completed in 14.426809549331665 s
(15, 15)
filter completed in 14.47310185432434 s
(15, 15)
filter completed in 14.773804187774658 s
(15, 15)
filter completed in 14.724096059799194 s
(15, 15)
filter completed in 14.462066650390625 s
(15, 15)
filter completed in 14.38855791091919 s
(15, 15)
filter completed in 14.864516735076904 s
(15, 15)
filter completed in 16.536264657974243 s
(15, 15)
filter completed in 15.544280052185059 s
(15, 15)
filter completed in 14.404176235198975 s
(15, 15)
filter completed in 14.43254804611206 s
IMAGE COMPLETED: 3 of 150
(15, 15)
filter completed in 18.811672687530518 s
(15, 15)
filter completed in 19.01636004447937 s
(15, 15)
filter completed in 19.908296823501587 s
(15, 15)
filter completed in 21.753503561019897 s
(15, 15)
filter completed in 20.8498797416687 s
(15, 15)
filter completed in 20.544127941131592 s
(15, 15)
filter completed in 20.662254333496094 s
(15, 15)
filter completed in 20.65923762321472 s
(15, 15)
filter completed in 19.335585355758667 s
(15, 15)
filter completed in 19.945393323898315 s
(15, 15)
filter completed in 20.45564317703247 s
(15, 15)
filter completed in 20.289149522781372 s
IMAGE COMPLETED: 4 of 150
(15, 15)
filter completed in 25.40975260734558 s
(15, 15)
filter completed in 24.785842418670654 s
(15, 15)
filter completed in 24.025460720062256 s
(15, 15)
filter completed in 23.852452516555786 s
(15, 15)
filter completed in 24.38658046722412 s
(15, 15)
filter completed in 23.814111948013306 s
This shows me that all the filters for the same image take about the same time, but every subsequent image takes longer and longer, each time by a roughly constant amount. Seeing as I need to do this for many images, this isn't sustainable for me; on the uni machine it went from 3 s per filter to 70 s per filter.
My repo is here https://github.com/milanleonard/QNN-implementation if there's any info I'm missing. I cannot imagine why this is slowing down.
For some extra info, it seems to be the step to the next image in the outer loop that is causing the problem, since reducing the number of quantum_circuits, i.e. the number of filters per image, doesn't help at all. Refactoring to save at each step didn't help, and my memory isn't growing quickly anywhere.
We thought that maybe the append to img_outputs was the problem, but the refactor below shows it wasn't.
for idx, image in enumerate(this_data):
    image = image.reshape((28,28))
    image = prepare_img(2, image)
    outputs = [conv(qc, 2, image) for qc in quantum_circuits]
    print(f"IMAGE COMPLETED: {idx+1} of {args.num_datapoints}")
    np.save(f'quantum_data/img{args.start_idx + idx}.npy', outputs)
    outputs, image = None, None
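For reference, a per-image profile like the ones below can be collected with a wrapper along these lines (this exact invocation is my own sketch, not necessarily what was used):

# Hypothetical profiling wrapper around one image's worth of conv calls.
import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
outputs = [conv(qc, 2, image) for qc in quantum_circuits]
profiler.disable()
pstats.Stats(profiler).sort_stats('tottime').print_stats(10)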
After some profiling as suggested by Michael, I see this
filter completed in 16.571566104888916 s
filter completed in 16.666788339614868 s
15378121 function calls (15376161 primitive calls) in 33.240 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
788 4.492 0.006 4.492 0.006 {method 'acquire' of '_thread.lock' objects}
60368 3.586 0.000 9.891 0.000 dagcircuit.py:292(apply_operation_back)
241472 1.336 0.000 3.593 0.000 copy.py:66(copy)
1576240 1.166 0.000 1.188 0.000 {built-in method builtins.isinstance}
930412 0.674 0.000 0.674 0.000 bit.py:75(__hash__)
60368 0.672 0.000 1.220 0.000 assemble_circuits.py:71(<listcomp>)
244216 0.659 0.000 2.253 0.000 {built-in method builtins.all}
60760 0.611 0.000 1.113 0.000 quantumcircuit.py:581(_update_parameter_table)
2ND RUN
filter completed in 36.4175808429718 s
filter completed in 36.10355544090271 s
36660244 function calls (36658284 primitive calls) in 72.523 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
137200 9.124 0.000 25.409 0.000 dagcircuit.py:292(apply_operation_back)
784 7.039 0.009 7.039 0.009 {method 'acquire' of '_thread.lock' objects}
548800 2.966 0.000 8.067 0.000 copy.py:66(copy)
3958024 2.815 0.000 2.836 0.000 {built-in method builtins.isinstance}
2467052 1.752 0.000 1.752 0.000 bit.py:75(__hash__)
137592 1.650 0.000 3.057 0.000 quantumcircuit.py:581(_update_parameter_table)
551544 1.539 0.000 5.519 0.000 {built-in method builtins.all}
984312 1.509 0.000 2.749 0.000 retworkx_dagcircuit.py:62(_add_multi_graph_edge)
137200 1.495 0.000 2.727 0.000 assemble_circuits.py:71(<listcomp>)
489020 1.445 0.000 2.162 0.000 register.py:98(__repr__)
Looks like the number of calls to apply_operation_back grows linearly with iterations of the loop:
60368 3.497 0.000 9.678 0.000 dagcircuit.py:292(apply_operation_back)
137200 8.993 0.000 25.163 0.000 dagcircuit.py:292(apply_operation_back)
214032 15.435 0.000 41.346 0.000 dagcircuit.py:292(apply_operation_back)
290864 20.282 0.000 54.842 0.000 dagcircuit.py:292(apply_operation_back)
Guess I should go hunting for why this is growing.
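A quick way to test that hypothesis (my own suggestion, consistent with the resolution at the top) is to watch the instruction count of the re-used circuits between images:

# Sketch: if the re-used circuits are the culprit, each circuit's instruction
# count should grow by one initialize per window (~196 per image for a 2x2
# filter sliding over the 15x15 prepared image).
for idx, image in enumerate(this_data):
    image = image.reshape((28, 28))
    outputs = [conv(qc, 2, image) for qc in quantum_circuits]
    print(f"image {idx+1}: circuit lengths = {[len(qc.data) for qc in quantum_circuits]}")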