
I'm trying to use all of the CPU, so I'm using the threading package.

But I get a similar time using one thread as with ten threads (on a 12-thread CPU).

I believe there is a limit in Python, but I'm not sure; in `top` I see only 133% CPU.

I've included the code, but I don't think it is a software defect.

import threading

import numpy as np
from skimage import exposure
from tqdm import tqdm


class normalizeTh(threading.Thread):
    def __init__(self, image, idx):
        self.image = image
        self.output = image
        self.idx = idx
        threading.Thread.__init__(self)

    def run(self):
        self.output = exposure.equalize_adapthist(self.image, clip_limit=0.03)


numThreads = 10


def normalizeImgTh(X):
    global numThreads
    idx = 0
    dest = np.empty(X.shape)
    ths = []
    for img in tqdm(X):
        # if all thread slots are in use, wait until the first one is free
        if len(ths) >= numThreads:
            ths[0].join()
            dest[ths[0].idx] = ths[0].output
            del ths[0]
        nTh = normalizeTh(img, idx)
        nTh.start()
        ths.append(nTh)
        idx += 1
        # collect results from and delete all finished threads... garbage out
        for i in range(len(ths), 0, -1):
            if not ths[i - 1].is_alive():
                dest[ths[i - 1].idx] = ths[i - 1].output
                del ths[i - 1]
    # wait for all pending threads
    while len(ths) > 0:
        ths[0].join()
        dest[ths[0].idx] = ths[0].output
        del ths[0]
    return dest


dest = normalizeImgTh(X_train)
Mquinteiro
  • Possible duplicate/related of [How many threads is too many?](https://stackoverflow.com/questions/481970/how-many-threads-is-too-many) – Taku Jul 01 '17 at 10:10
  • @abccd I appreciate the reference to that post, but I don't agree with your assessment. The threads have no mutual-exclusion areas, none waits for another, the CPU is idle, there are no memory problems, and there's no philosophy in my question. – Mquinteiro Jul 01 '17 at 10:19
  • Where does `exposure.equalize_adapthist` come from? Does it release (C)Python's Global Interpreter Lock? (If you don't know what the Global Interpreter Lock is, you need to find out.) – Mark Dickinson Jul 01 '17 at 10:58
  • Ah, it's from scikit-image. And it's written in pure Python, not Cython, so no, it doesn't release the GIL. – Mark Dickinson Jul 01 '17 at 11:03
  • Does this answer your question? [What is the global interpreter lock (GIL) in CPython?](https://stackoverflow.com/questions/1294382/what-is-the-global-interpreter-lock-gil-in-cpython) – Sneftel Oct 17 '20 at 20:55

2 Answers


The limit might have more to do with the hardware and your operating system settings than with Python. If you are using threads for CPU-bound tasks, I don't think Python is going to help, due to the Global Interpreter Lock.
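For CPU-bound work like this, running the calls in separate processes is the usual way around the GIL. For example, a rough sketch using concurrent.futures (the function names and the placeholder data are only for illustration; it assumes scikit-image is installed and X is a NumPy array of images):

import numpy as np
from concurrent.futures import ProcessPoolExecutor
from skimage import exposure


def normalize(img):
    # runs in a separate process, so the GIL does not serialize the work
    return exposure.equalize_adapthist(img, clip_limit=0.03)


def normalizeImgProc(X, workers=None):
    # workers=None lets the pool default to the number of CPUs
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return np.array(list(pool.map(normalize, X)))


if __name__ == "__main__":
    X_train = np.random.rand(8, 128, 128)  # placeholder data
    dest = normalizeImgProc(X_train)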

hspandher
  • Indeed, however the Global Interpreter Lock is really *Python specific* (and more precisely specific to the cpython implementation most people use); it is not hardware or OS specific (so changing hardware or OS won't improve anything) – Basile Starynkevitch Jul 01 '17 at 10:47
  • @hspandher I'm not sure what could be happening. If I replace threads with processes it runs many times faster! And it makes no sense; starting a new process has a lot of overhead, so it should be less efficient. – Mquinteiro Jul 01 '17 at 10:54
  • In Python only one thread executes at a time, irrespective of how many you started. If your tasks are a mix of CPU- and IO-bound, then it's fine. Otherwise using processes is your only option. Then again, you need to find the optimal number of processes for your computer. – hspandher Jul 01 '17 at 11:00
  • @hspandher: That's a bit of an oversimplification: it's possible for Python threads to release the GIL for periods where they don't have to do any Python-level object allocation / reference counting / other things that need the GIL, and many of skimage's functions (mostly those written in Cython) already [do this](https://github.com/scikit-image/scikit-image/pull/1519), so it's reasonable to use thread-level parallelization with those functions. Unfortunately for the OP, `exposure.equalize_adapthist` isn't one of them. – Mark Dickinson Jul 01 '17 at 11:09
  • @hspandher I can't believe it!! I believe you, but I can't believe that behavior. Can someone give me a link to read about it? – Mquinteiro Jul 01 '17 at 11:29
  • @MarkDickinson But actually what I'm doing is exactly the same as what they do in your [link](https://github.com/scikit-image/scikit-image/pull/1519) – Mquinteiro Jul 01 '17 at 11:33
  • @Mquinteiro: A Google search for "python global interpreter lock" will give you plenty of reading material. – Mark Dickinson Jul 01 '17 at 12:08
  • @Mquinteiro You can read about the Python global interpreter lock anywhere, say http://jessenoller.com/blog/2009/02/01/python-threads-and-the-global-interpreter-lock. If your tasks are IO-bound, you might want to look into asynchronous IO (asyncio) as well. – hspandher Jul 01 '17 at 12:19

> I'm trying to use all of the CPU, so I'm using the threading package.
>
> But I get a similar time using one thread as with ten threads (on a 12-thread CPU).

I'm aware that this question was posted over three years ago.

If you're using a standard distribution of Python, your system will only execute one Python thread at a time, including the main thread of your program, so adding more threads to your program or more cores to your system doesn't really get you anything when using the threading module in Python. You can research all of the pedantic details and ultracrepidarian opinions regarding the GIL / Global Interpreter Lock for more info on that.

What that means is that CPU-bound (computationally intensive) code doesn't benefit greatly from being factored into threads.
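A quick way to see this for yourself is a toy benchmark along these lines (the loop size and timings are arbitrary and will vary by machine):

import threading
import time

def burn():
    # pure-Python arithmetic: CPU-bound, holds the GIL the whole time
    total = 0
    for i in range(5_000_000):
        total += i

start = time.perf_counter()
for _ in range(4):
    burn()
print("serial:  %.2fs" % (time.perf_counter() - start))

start = time.perf_counter()
threads = [threading.Thread(target=burn) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("threads: %.2fs" % (time.perf_counter() - start))

On CPython both runs take roughly the same wall-clock time, because only one thread can execute Python bytecode at once.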

I/O-bound (waiting for file read/write, network read, or user I/O) code, however, benefits greatly from multithreading! So, start a thread for each network connection to your Python-based server.
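For instance, a minimal sketch of I/O-bound threading (the URLs here are placeholders):

from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

urls = ["https://example.com/page%d" % i for i in range(20)]

def fetch(url):
    # each thread spends most of its time waiting on the network,
    # and the GIL is released while it waits
    with urlopen(url) as resp:
        return url, len(resp.read())

with ThreadPoolExecutor(max_workers=10) as pool:
    for url, size in pool.map(fetch, urls):
        print(url, size)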

Threads can also be great for triggering/throwing/raising signals at set periods, or simply to block out the processing sections of your code more logically.
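As a small illustration of the periodic-trigger idea, threading.Timer can re-arm itself (the interval and count below are made up):

import threading

def heartbeat(interval, remaining):
    # fire once, then schedule the next tick until `remaining` runs out
    print("tick")
    if remaining > 1:
        threading.Timer(interval, heartbeat, args=(interval, remaining - 1)).start()

heartbeat(1.0, 5)  # roughly one tick per second, five times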

You probably want to use the multiprocessing module instead of threading.
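A minimal multiprocessing sketch for the normalization step from the question (assuming the same scikit-image call; the placeholder data is only there to make it runnable):

import multiprocessing as mp

import numpy as np
from skimage import exposure


def normalize(img):
    return exposure.equalize_adapthist(img, clip_limit=0.03)


if __name__ == "__main__":
    X_train = np.random.rand(8, 128, 128)  # placeholder data
    with mp.Pool(processes=mp.cpu_count()) as pool:
        dest = np.array(pool.map(normalize, X_train))

Each worker process has its own interpreter and its own GIL, so the normalization actually runs in parallel across cores.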

Ian Moote