I use Python 3 to do encrypted computation with Microsoft SEAL and am looking for a way to improve its performance. My approach is:
- create a shared memory block to hold the plaintext data (Use numpy array in shared memory for multiprocessing)
- start multiple processes with multiprocessing.Process (a parameter controls the number of processes and thereby limits CPU usage)
- the processes read from shared memory and do the encrypted calculation
- wait for the calculation to finish and join the processes
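For reference, the steps above can be sketched roughly as follows. This is a minimal stand-in, not my actual code: `worker`, `run`, and the summing "calculation" are hypothetical names used only for illustration, the SEAL work is replaced by a plain sum, and the standard library's `memoryview.cast` stands in for the NumPy view so the sketch has no third-party dependency. It also assumes Linux, where the `fork` start method is available.

```python
import multiprocessing as mp
from multiprocessing import shared_memory

ITEM_SIZE = 8  # bytes per float64 (format code "d")


def worker(shm_name, start, stop, result_queue):
    """Attach to the shared block, process one slice, report a partial result."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        # View the raw bytes as float64 values. With NumPy this would be
        # np.ndarray(shape, dtype=np.float64, buffer=shm.buf) instead.
        with shm.buf.cast("d") as values:
            # Stand-in for the encrypted calculation: sum the slice.
            partial = sum(values[i] for i in range(start, stop))
    finally:
        shm.close()  # detach only; the parent owns and unlinks the block
    result_queue.put(partial)


def run(data, num_processes):
    """Copy data into shared memory, fan out to workers, combine the results."""
    ctx = mp.get_context("fork")  # assumption: running on Linux
    shm = shared_memory.SharedMemory(create=True, size=len(data) * ITEM_SIZE)
    try:
        with shm.buf.cast("d") as values:
            for i, x in enumerate(data):
                values[i] = x

        result_queue = ctx.Queue()
        chunk = -(-len(data) // num_processes)  # ceiling division
        procs = []
        for k in range(num_processes):
            start = k * chunk
            stop = min(start + chunk, len(data))
            p = ctx.Process(target=worker,
                            args=(shm.name, start, stop, result_queue))
            p.start()
            procs.append(p)

        # Drain the queue before joining so no worker blocks on put().
        partials = [result_queue.get() for _ in procs]
        for p in procs:
            p.join()
        return sum(partials)
    finally:
        shm.close()
        shm.unlink()


if __name__ == "__main__":
    print(run([1.0, 2.0, 3.0, 4.0, 5.0], num_processes=2))
```

The `num_processes` argument plays the role of the parameter that limits CPU usage; each worker only attaches to the block by name, so the plaintext data is not copied per process.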
I run this program on a 32U64G x86 Linux server; the CPU model is Intel(R) Xeon(R) Gold 6161 CPU @ 2.20GHz.
I notice that doubling the number of processes improves the time cost by only about 20%. I've tried three different process counts:
| process nums | 7 | 13 | 27 |
| time ratio | 0.8 | 1 | 1.2 |
Why is this improvement so disproportionate to the resources I use (CPU and memory)? Conceptual explanations and specific Linux command lines are both welcome. Thanks.
FYI:
The code of my subprocesses looks like this (simplified; elided parts are marked with comments):

    import time
    import seal  # Python binding for Microsoft SEAL

    def sub_process_main(encrypted_bytes, plaintext_array, result_queue):
        # init
        # status_sign
        while shared_int > 0:
            # SEAL load and some other calculation
            encrypted_matrix_list = seal.Ciphertext.load(encrypted_bytes)
            shared_plaintext_matrix = seal.Encoder.encode(plaintext_array)
            # ... do something
            for some loop:  # pseudocode: the loop condition is elided
                time1 = time.time()
                res = []
                for i in range(len(encrypted_matrix_list)):
                    enc = seal.evaluator.multiply_plain(encrypted_matrix_list[i], shared_plaintext_matrix[i])
                    res.append(enc)
                time2 = time.time()
                print(f'time usage: {time2 - time1}')
            # ... do something
        result_queue.put(final_result)

I print the time for every part of my code; here is the time cost of the part between time1 and time2 (the multiply_plain loop):
| process nums | 13 | 27 |
| occurrence | 1791 | 864 |
| total time (s) | 1698.2140 | 1162.8330 |
| average (s) | 0.9482 | 1.3459 |
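To make the table easier to interpret, here is a small calculation on those numbers. It is only a rough sketch: `summarize` is a hypothetical helper, and the throughput figure assumes that all processes execute this timed loop concurrently for the whole run, which may not hold exactly in my program.

```python
def summarize(processes, occurrences, total_seconds):
    """Return (average seconds per timed block, estimated blocks per second)."""
    average = total_seconds / occurrences
    # Assumption: every process is inside the timed loop at the same time,
    # so aggregate throughput is roughly processes / average.
    throughput = processes / average
    return average, throughput


avg13, thr13 = summarize(13, 1791, 1698.2140)
avg27, thr27 = summarize(27, 864, 1162.8330)

slowdown = avg27 / avg13          # how much slower one timed block became
throughput_gain = thr27 / thr13   # how much more work per second overall

print(f"average with 13 processes: {avg13:.4f} s")
print(f"average with 27 processes: {avg27:.4f} s")
print(f"per-block slowdown:        {slowdown:.2f}x")
print(f"estimated throughput gain: {throughput_gain:.2f}x")
```

Under that assumption, each timed block becomes about 1.42x slower when going from 13 to 27 processes, so roughly 2.08x the processes yields only about 1.46x the throughput for this part of the code.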
I've monitored some metrics, but I can't tell whether any of them are abnormal.
13 cores:
top
pidstat
vmstat
27 cores:
top (Why does this use all cores rather than exactly 27? Does it have anything to do with Hyper-Threading?)
pidstat
vmstat
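Regarding the Hyper-Threading question, a check along the following lines might help. It is a hedged sketch for Linux only: `parse_cpu_list`, `physical_core_count`, and `pin_to_cpu` are hypothetical helpers, and the code assumes the kernel exposes `thread_siblings_list` under `/sys/devices/system/cpu`, which may not be the case in every VM or container. Without explicit pinning, the Linux scheduler is free to move processes between all CPUs it is allowed to use, which would be one possible reason for top showing load on every core.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def physical_core_count(sysfs_root="/sys/devices/system/cpu"):
    """Count distinct sibling groups among the CPUs this process may use.

    Logical CPUs that share one physical core report the same siblings list.
    Returns None if the topology files are not readable.
    """
    groups = set()
    for cpu in sorted(os.sched_getaffinity(0)):
        path = f"{sysfs_root}/cpu{cpu}/topology/thread_siblings_list"
        try:
            with open(path) as f:
                groups.add(frozenset(parse_cpu_list(f.read())))
        except OSError:
            return None
    return len(groups)


def pin_to_cpu(cpu):
    """Restrict the calling process to a single logical CPU."""
    os.sched_setaffinity(0, {cpu})


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    print("logical CPUs usable:", len(os.sched_getaffinity(0)))
    print("physical cores (estimated):", physical_core_count())
```

If the estimated number of physical cores is smaller than the number of worker processes, some workers necessarily share a core with a sibling hardware thread. Calling `pin_to_cpu(k)` at the start of worker `k` would be one way to test whether pinning changes the timings.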