Does PyNaCl release the Global Interpreter Lock? Will it be suitable to use multithreading for encryption using PyNaCl's SecretBox?

I want to encrypt relatively large amounts of data (~500 MB) using PyNaCl. For this, I divide it into chunks of around 5MB and encrypt them using ThreadPoolExecutor(). Is this ideal? I don't know if PyNaCl releases Python's GIL and whether it will actually encrypt the chunks in parallel resulting in performance gains.

Edit: To avoid confusion, let me clarify the experimental results. After testing the current implementation several times, I found that it was slightly faster than encrypting the entire data in one call, and much faster than encrypting the chunks in a simple for loop. However, I need hard evidence (a reference to documentation, or some kind of test) to show that the work is indeed running in parallel and that the GIL is not limiting performance.

This is my current implementation using ThreadPoolExecutor():

from concurrent.futures import ThreadPoolExecutor
from os import urandom
from typing import Tuple

from nacl.bindings import sodium_increment
from nacl.secret import SecretBox


def encrypt_chunk(args: Tuple[bytes, SecretBox, bytes, int]):
    chunk, box, nonce, macsize = args
    try:
        outchunk = box.encrypt(chunk, nonce).ciphertext
    except Exception as e:
        err = Exception("Error encrypting chunk")
        err.__cause__ = e
        return err
    if not len(outchunk) == len(chunk) + macsize:
        return Exception("Error encrypting chunk")
    return outchunk


def encrypt(
    data: bytes,
    key: bytes,
    nonce: bytes,
    chunksize: int,
    macsize: int,
):
    box = SecretBox(key)
    args = []
    total = len(data)
    i = 0
    while i < total:
        chunk = data[i : i + chunksize]
        nonce = sodium_increment(nonce)
        args.append((chunk, box, nonce, macsize,))
        i += chunksize
    executor = ThreadPoolExecutor(max_workers=4)
    out = executor.map(encrypt_chunk, args)
    executor.shutdown(wait=True)
    return out
kush
  • Wouldn't it be easiest to simply run it and see how fast it is? – zvone Aug 13 '23 at 08:25
  • @zvone The current implementation is slightly faster than direct encryption. Encrypting 500MB directly takes close to 0.5 seconds whereas this takes 0.3 seconds on average. – kush Aug 13 '23 at 09:57
  • Doesn't that answer your question? If it's faster with more than one thread, there appears to be some parallelism involved, doesn't it? What is your question? Did you expect it to be 100 times faster? – zvone Aug 13 '23 at 14:11
  • 2
    @zvone I know the performance is slightly improved. But my question was whether the GIL is released or not. I did not find anything regarding GIL in PyNaCl documentation. I am writing a document which requires everything I write to be fact checked. That's why I want to know what happens to the GIL when a thread uses PyNaCl to encrypt data. I cannot make claims about parallelism without showing any relevant code snippets or documentation. – kush Aug 13 '23 at 16:50

1 Answer

PyNaCl uses the C Foreign Function Interface for Python (CFFI) to provide bindings to the C library Libsodium. We can see that SecretBox is basically a thin wrapper class: its encrypt() method calls into PyNaCl's binding for the crypto_secretbox() function of the Libsodium library.

As per CFFI documentation:

[2] C function calls are done with the GIL released.

Since most of the functions from PyNaCl are bindings to the Libsodium library using CFFI, they will release the Global Interpreter Lock during the execution of the C function.

This should explain the performance improvements from multithreading.
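If a test is needed in addition to the documentation reference, one option is to time the same workload in a plain loop and in a thread pool. The following is only a minimal sketch: the helper name `time_sequential_vs_threaded` is hypothetical, and `hashlib.sha256` is used as a stand-in workload (the Python documentation states that hashlib releases the GIL when hashing data larger than 2047 bytes) so that the block runs without PyNaCl installed.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def time_sequential_vs_threaded(
    fn: Callable[[T], R],
    items: Sequence[T],
    workers: int = 4,
) -> Tuple[float, float, List[R], List[R]]:
    """Apply fn to every item in a plain loop, then in a thread pool.

    Returns (sequential_seconds, threaded_seconds,
             sequential_results, threaded_results).
    """
    start = time.perf_counter()
    sequential = [fn(item) for item in items]
    sequential_time = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map preserves input order, so results are comparable.
        threaded = list(executor.map(fn, items))
    threaded_time = time.perf_counter() - start

    return sequential_time, threaded_time, sequential, threaded


def sha256_digest(chunk: bytes) -> bytes:
    # Stand-in workload (assumption): replace with the PyNaCl call under test.
    return hashlib.sha256(chunk).digest()


if __name__ == "__main__":
    # Sixteen 5 MB chunks, similar in size to the chunks in the question.
    chunks = [bytes([i]) * (5 * 1024 * 1024) for i in range(16)]
    seq_t, thr_t, seq_out, thr_out = time_sequential_vs_threaded(
        sha256_digest, chunks
    )
    assert seq_out == thr_out  # both strategies must produce identical output
    print(f"sequential: {seq_t:.3f}s  threaded: {thr_t:.3f}s")
```

To apply this to PyNaCl, pass the question's `encrypt_chunk` function and its `args` list in place of `sha256_digest` and `chunks`. A threaded time clearly below the sequential time on a multi-core machine is consistent with the GIL being released during the C call; if the GIL were held, the two times would be expected to be roughly equal. Timings vary between runs and machines, so this supports the documentation reference rather than replacing it.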

kush