Does PyNaCl release the Global Interpreter Lock? Is multithreading suitable for encryption with PyNaCl's SecretBox?
I want to encrypt a relatively large amount of data (~500 MB) using PyNaCl. To do this, I split it into chunks of around 5 MB and encrypt them using ThreadPoolExecutor(). Is this ideal? I don't know whether PyNaCl releases Python's GIL, and therefore whether the chunks are actually encrypted in parallel, resulting in performance gains.
Edit: To avoid confusion, let me clarify the experimental results. After testing the current implementation several times, I found that it was slightly faster than encrypting the entire data in a single call, and much faster than a simple for loop. However, I need hard evidence (a reference to documentation, or some kind of test) to prove that the work is indeed running in parallel and that the GIL is not limiting performance.
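A rough sketch of the kind of test that could provide such evidence: time the same per-chunk function on one thread and on four threads, and compare. The helper name `timed_map`, the chunk sizes, and the `stand_in_work` function are assumptions made for illustration. `stand_in_work` uses `hashlib`, which is documented to release the GIL while hashing large buffers, so it serves only as a reference point and is not PyNaCl itself.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def timed_map(fn, chunks, workers):
    """Run fn over chunks on `workers` threads; return (seconds, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(fn, chunks))
    return time.perf_counter() - start, results


def stand_in_work(chunk):
    # Stand-in for the real per-chunk call (assumption): hashlib releases
    # the GIL on large inputs, so threads can overlap here.
    return hashlib.sha256(chunk).digest()


# 8 chunks of 4 MiB each; the sizes are arbitrary
chunks = [bytes(4 * 1024 * 1024) for _ in range(8)]

serial_time, serial_out = timed_map(stand_in_work, chunks, workers=1)
parallel_time, parallel_out = timed_map(stand_in_work, chunks, workers=4)

print(f"1 thread:  {serial_time:.3f}s")
print(f"4 threads: {parallel_time:.3f}s")
print(f"speedup:   {serial_time / parallel_time:.2f}x")
```

Swapping `stand_in_work` for a function that calls `box.encrypt(chunk, nonce)` would apply the same measurement to PyNaCl. On a multi-core machine, a speedup well above 1x would suggest the GIL is released during encryption, while a ratio near 1x would suggest it is not. Timing alone is indirect evidence, so a documentation reference would still be preferable.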
This is my current implementation using ThreadPoolExecutor():
from concurrent.futures import ThreadPoolExecutor
from os import urandom
from typing import Tuple

from nacl import secret
from nacl.bindings import sodium_increment
from nacl.secret import SecretBox


def encrypt_chunk(args: Tuple[bytes, SecretBox, bytes, int]):
    chunk, box, nonce, macsize = args
    try:
        outchunk = box.encrypt(chunk, nonce).ciphertext
    except Exception as e:
        err = Exception("Error encrypting chunk")
        err.__cause__ = e
        return err
    if not len(outchunk) == len(chunk) + macsize:
        return Exception("Error encrypting chunk")
    return outchunk


def encrypt(
    data: bytes,
    key: bytes,
    nonce: bytes,
    chunksize: int,
    macsize: int,
):
    box = SecretBox(key)
    args = []
    total = len(data)
    i = 0
    while i < total:
        chunk = data[i : i + chunksize]
        nonce = sodium_increment(nonce)
        args.append((chunk, box, nonce, macsize,))
        i += chunksize
    executor = ThreadPoolExecutor(max_workers=4)
    out = executor.map(encrypt_chunk, args)
    executor.shutdown(wait=True)
    return out
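Because `encrypt_chunk` returns exceptions instead of raising them, and `executor.map` yields results lazily, the caller has to inspect each item before using it. A minimal sketch of how the output might be consumed follows. The helper name `join_chunks` is hypothetical, and the example uses placeholder byte strings rather than real ciphertext.

```python
from typing import Iterable, Union


def join_chunks(results: Iterable[Union[bytes, Exception]]) -> bytes:
    """Concatenate encrypted chunks, raising on the first failed chunk."""
    parts = []
    for index, item in enumerate(results):
        if isinstance(item, Exception):
            # Surface the returned error, keeping the original as the cause
            raise RuntimeError(f"chunk {index} failed") from item
        parts.append(item)
    return b"".join(parts)


# Placeholder inputs standing in for the iterator returned by encrypt()
print(join_chunks([b"abc", b"def"]))
```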