2

I am new at Python and have been trying to use multithreading. There already exists an in-depth comment on Stackoverflow on the topic, but I still have some questions.

The goal of my program is to create and populate an array (although I guess that technically it would have to be called a "list" in Python) and have it sorted by a "divide and conquer" algorithm. Unfortunately, the terms "list" and "array" seem to be conflated by many a user, even though they are not the same. If "array" is used in my comment, please bear in mind that I have posted different code from various resources and have, for the sake of respecting the original author/s, not changed its content.

My code for populating the list count was quite simple

#!/usr/bin/env python3
count = []
i = 149
while i >= 0:
    count.append(i)
    print(i)
    i -= 1

After that I used this very handy guide on the topic of "divide and conquer" to create two lists for sorting, which were merged later on. My main concern is now how to use those lists correctly with multithreading.

In the earlier mentioned post it was argued that, basically, just a few lines of code were needed to use multithreading:

from multiprocessing.dummy import Pool as ThreadPool 
pool = ThreadPool(4)

as well as

results = pool.starmap(function, zip(list_a, list_b))

to pass multiple lists.

I have tried to adapt the code but failed. My function's parameters are def merge(count, l, m, r) (used to divide the list count into a left and a right part) and the two temporarily created lists are called L and R.

def merge(arr, l, m, r): 
    n1 = m - l + 1
    n2 = r- m 

    # create temp arrays 
    L = [0] * (n1) 
    R = [0] * (n2) 

But every time I run the program, it just responds with the following error message:

Traceback (most recent call last):
  File "./DaCcountdownTEST.py", line 71, in <module>
    results = pool.starmap(merge,zip(L,R))
NameError: name 'L' is not defined

I don't know the cause for my problem.

Any help is greatly appreciated!

martineau
  • 119,623
  • 25
  • 170
  • 301
MichaelP
  • 41
  • 10
  • It sounds like wherever the `results = pool.starmap(merge,zip(L,R))` statement is executing, there's no `L` variable defined. – martineau Nov 11 '18 at 02:12
  • I think there may be a number of problems with your code. But the immediate issue seems to be that `L` and `R` are defined within the scope of `merge`, but you're trying to use them in an outer scope where they're not defined. – tel Nov 11 '18 at 02:12
  • Thank you for your help. As both of you said, this is probably a scope-problem that I have missed. Since I'm new at python, I may have unwittingly messed up the indentation. Oh well. Considering the "number of problems", I beg to differ. The code used for populating the list works, I have tried it and it does what it's supposed to do. The rest of the code I copy-pasted from GeeksForGeeks, and it should also work. So I think that this is, as they say, a "layer 8 problem". So I'll have to get down and dirty and dig around in my code until I have found the source of my problem. – MichaelP Nov 11 '18 at 12:36
  • Thank you martineau for the edit, by the way. – MichaelP Nov 11 '18 at 12:54

1 Answers1

2

I'm not sure exactly what's going wrong with your code, but here's a complete working example of a multi-threaded version of the mergeSort code you linked to:

from multiprocessing.dummy import Pool as ThreadPool 

# Merges two subarrays of arr[]. 
# First subarray is arr[l..m] 
# Second subarray is arr[m+1..r] 
def merge(arr, l, m, r): 
    n1 = m - l + 1
    n2 = r- m 

    # create temp arrays 
    L = [0] * (n1) 
    R = [0] * (n2) 

    # Copy data to temp arrays L[] and R[] 
    for i in range(0 , n1): 
        L[i] = arr[l + i] 

    for j in range(0 , n2): 
        R[j] = arr[m + 1 + j] 

    # Merge the temp arrays back into arr[l..r] 
    i = 0     # Initial index of first subarray 
    j = 0     # Initial index of second subarray 
    k = l     # Initial index of merged subarray 

    while i < n1 and j < n2 : 
        if L[i] <= R[j]: 
            arr[k] = L[i] 
            i += 1
        else: 
            arr[k] = R[j] 
            j += 1
        k += 1

    # Copy the remaining elements of L[], if there 
    # are any 
    while i < n1: 
        arr[k] = L[i] 
        i += 1
        k += 1

    # Copy the remaining elements of R[], if there 
    # are any 
    while j < n2: 
        arr[k] = R[j] 
        j += 1
        k += 1

# l is for left index and r is right index of the 
# sub-array of arr to be sorted 
def mergeSort(arr,l=0,r=None):
    if r is None:
        r = len(arr) - 1

    if l < r: 
        # Same as (l+r)/2, but avoids overflow for 
        # large l and h 
        m = (l+(r-1))//2

        # Sort first and second halves
        pool = ThreadPool(2)
        pool.starmap(mergeSort, zip((arr, arr), (l, m+1), (m, r)))
        pool.close()
        pool.join()

        merge(arr, l, m, r)

Here's a short test of the code:

arr = np.random.randint(0,100,10)
print(arr)
mergeSort(arr)
print(arr)

which produces the following output:

[93 56 55 60  0 28 17 77 84  2]
[ 0  2 17 28 55 56 60 77 84 93]

Sadly, it does seem to be quite a bit slower than the single threaded version. However, this kind of slowdown is par for the course when it comes to multithreading compute-bound tasks in Python.

tel
  • 13,005
  • 2
  • 44
  • 62
  • Thank you for your answer. This is exactly the code I used. I copy-pasted it to see if it did what I wanted. But I got an error code. I'm quite sure that it must be my fault, probably something that I missed, eventually a scope-problem like user tel suggested. Considering it being slow, I guess that would be the GIL. I'll have to deal with that at some point later on, but for the moment I'm mainly interested in getting it up and running. – MichaelP Nov 11 '18 at 12:28
  • I have copy-and-pasted the code again and it works. So it really was me being clumsy. I probably should have gone to bed earlier. Thank you again for your help - now I'll have all the time that I want (and need) to deal with that pesky little GIL. – MichaelP Nov 11 '18 at 12:51