I am using Python threading to do some jobs at the same time. I leave the main thread to perform task_A, and create one thread to perform task_B at the same time. Below is the simplified version of the code I am working on:

import threading
import numpy as np

def task_B(inc):
    for elem in array:
        value = elem + inc

if __name__ == '__main__':

    array = np.random.rand(10)

    t1 = threading.Thread(target=task_B, args=(1,))  # args must be a tuple, hence the trailing comma
    t1.start()

    # task_A
    array_copy = list()
    for elem in array:
        array_copy.append(elem)

    t1.join()

I know the above code doesn't do anything meaningful; please think of it as a simplified example. As you can see, the variable array is only read, both in the main thread and in the newly created thread t1. Therefore, there should be no need to lock array in either thread, since neither of them modifies (writes to) it. However, when I timed the code, it seemed that Python threading automatically locks variables that are shared between threads, even when they are only read. Is there a way to make the threads run simultaneously without locking the read-only variables? I've found this code, but cannot figure out how to apply it to my situation.

SHM
  • what do you mean? of course it is being locked, you have one thread working on an array while the other is trying to read from it. one of them is modifying state the same time the other is trying to read state, you have a data race condition without locking here – gold_cy Jan 09 '22 at 13:37
  • You are discovering CPython's [GIL](https://en.m.wikipedia.org/wiki/Global_interpreter_lock) – azelcer Jan 09 '22 at 13:39
  • @gold_cy Both threads are accessing the variable but they are only reading it, not modifying it. (Main thread simply reads `elem`s in `array` and appends them to another list, and `t1` thread reads `elem`s in `array` and saves them to the `value` with `inc` added, not actually altering the `array` itself.) So I believe that there is no need for a lock. – SHM Jan 10 '22 at 01:00

1 Answer

You are correct that in this case "there is no need for a lock", but the CPython interpreter (which I assume you are using to run your Python code) is not that smart.
Python code always executes while holding the GIL (global interpreter lock), so the two threads take turns running instead of running in parallel. Their execution is still interleaved, whereas without threads it would be purely sequential.
That is why performance-critical code is often offloaded to other processes (using the multiprocessing library) or written in Cython (here is an example solving a problem similar to yours).
See this question for a few more details on why the GIL is there: Is there a way to release the GIL for pure functions using pure python?
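
The effect of the GIL can be observed with a small timing sketch. This is only an illustration, not a benchmark: `count_down`, `time_sequential` and `time_threaded` are hypothetical helper names, and the loop size is arbitrary. On a standard CPython build with the GIL enabled, the two timings are typically close to each other, because only one thread executes Python bytecode at any moment.

```python
import threading
import time


def count_down(n):
    """CPU-bound pure-Python loop; it needs the GIL for every step."""
    while n > 0:
        n -= 1


def time_sequential(n):
    """Run the loop twice, one call after the other, in the main thread."""
    start = time.perf_counter()
    count_down(n)
    count_down(n)
    return time.perf_counter() - start


def time_threaded(n):
    """Run the same two calls in two separate threads."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary size, chosen only for illustration
    print(f"sequential: {time_sequential(n):.3f} s")
    print(f"threaded:   {time_threaded(n):.3f} s")
```

Neither thread touches shared data here, so the similar timings come from the interpreter lock and not from any lock on a variable.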

There is hope that in the future (2022+) the GIL may be relaxed, but for now you are stuck with it, so you have to work around it.
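
One possible workaround, sketched here with the standard multiprocessing library, is to run task_B in a separate process, which has its own interpreter and its own GIL. This is a minimal sketch under assumptions, not a drop-in replacement for the code in the question: `task_b`, `task_a` and `run_both` are hypothetical names, and a plain list of floats stands in for the NumPy array.

```python
import multiprocessing as mp
import random


def task_b(array, inc):
    """Runs in the child process on its own (pickled) copy of `array`."""
    return [elem + inc for elem in array]


def task_a(array):
    """Runs in the main process while the child works on task_b."""
    return [elem for elem in array]


def run_both(array, inc):
    """Run task_b in a worker process and task_a in the main process."""
    with mp.Pool(processes=1) as pool:
        async_b = pool.apply_async(task_b, (array, inc))  # hand task_b to the worker
        result_a = task_a(array)                          # main process keeps working
        result_b = async_b.get()                          # wait for the worker's result
    return result_a, result_b


if __name__ == "__main__":
    # The guard is required on platforms that start workers by spawning.
    array = [random.random() for _ in range(10)]
    copy_a, shifted_b = run_both(array, 1)
    print(len(copy_a), len(shifted_b))
```

Note that the array is copied to the worker and the result is copied back, so this only pays off when the computation costs more than that transfer.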

Lenormju