0

I wrote a singlethreaded prime-finding algorithm a while ago, and decided to try to optimise it using threading. Unfortunately, the multithreaded version is at best running 3/4 the speed of the singlethreaded one. After some research it seems that the 'threading.Lock' function is what slows it down so much. How can I improve the code?

The single-threaded code:

from math import sqrt, ceil
import time
import pickle

os.system("mode con: lines=8")
os.system("title Prime finder in Python")
primes = []


#Load primes from logprimes.data
with open("logprimes.data", "rb") as f:
    primes = pickle.load(f)
if len(primes) < 4:
    primes = [2, 3, 5, 7]

#Function to save results to logprimes.data
def SaveRes():
    with open("logprimes.data", "wb") as f:
        pickle.dump(primes, f)
    return("Saved " + str(len(primes)) + " primes.")

#Function to find out if a number is a prime
def is_prime(n):
    i = 0
    while primes[i] < ceil(sqrt(n)):
        if n % primes[i] == 0:
            return False
        i = i + 1
    return True

starttime = time.time()
currentcheck = int(primes[len(primes) - 1])
lastsaved = time.time()
lastshowed = time.time()
#Main loop
os.system("cls")
while True:
    try:
        currentcheck = currentcheck + 1
        if is_prime(currentcheck) == True:
            primes.append(currentcheck)

        if time.time() - lastshowed > 4:
            print("\n\n\nTotal time: " + str(time.time() - starttime))
            print("Currently checking: " + str(currentcheck))
            print("Total number of primes: " + str(len(primes)))
            print("Primes per sec: " + str(len(primes) / float(time.time() - starttime)))
            print("Checks per sec: " + str(currentcheck / float(time.time() - starttime)))
            print("Prime rarity: " + str(len(primes) / currentcheck) + "%")
            lastshowed = time.time()

        if time.time() - lastsaved > 300:
            os.system("cls")
            print(SaveRes())
            lastshowed = time.time() + 2
            lastsaved = time.time()

    except (KeyboardInterrupt, SystemExit):
        os.system("cls")
        print(SaveRes())
        lastsaved = time.time()
        lastshowed = time.time() + 2

The multi-threaded code:

from math import sqrt, ceil
import time
import pickle
import threading

#os.system("mode con: lines=8")
os.system("title Prime finder in Python")
running = True
primes = []
threads = []


#Load primes from logprimes.data
with open("logprimes.data", "rb") as f:
    primes = pickle.load(f)
if len(primes) < 4:
    primes = [2, 3, 5, 7]


#Function to save results to logprimes.data
def SaveRes():
    with open("logprimes.data", "wb") as f:
        pickle.dump(primes, f)
    return("Saved " + str(len(primes)) + " primes.")


#Function to find out if a number is a prime
def is_prime(n):
    while primes[-1] < ceil(sqrt(n)): #Wait until we have enough primes to correctly deem wether or not 'n' is a prime. I know, not the most optimal solution, but it works.
        pass
    i = 0
    while primes[i] < ceil(sqrt(n)):
        if n % primes[i] == 0:
            return False
        i = i + 1
    return True


def threadfunction(threadid, startnum = primes[-1] + 1):
    global primes, running
    currentcheck = startnum

    with lock:
        print(threadid)
    while running:
        if not currentcheck in primes: 
            if is_prime(currentcheck * (threadid + 1)):
                with lock:
                    primes.append(currentcheck)
        currentcheck += 1

    exit()



startsize = len(primes)
starttime = time.time()
starthighestprime = primes[-1]
#currentcheck = int(primes[len(primes) - 1])
lastsaved = time.time()
lastshowed = time.time()


#Create threads
lock = threading.Lock()
for i in range(os.cpu_count()):
    threads.append(
        threading.Thread(
            target=threadfunction, 
            args=(i,)
            )
        )
    threads[-1].start()


#Main loop
os.system("cls")
while running and __name__ == "__main__":
    try:

        time.sleep(1)
        if time.time() - lastshowed > 4:
            print("\n\n\nTotal time: " + str(time.time() - starttime))
            print("Last found prime: " + str(primes[-1]))
            print("Total number of primes: " + str(len(primes)))
            print("Primes per sec: " + str((len(primes) - startsize) / float(time.time() - starttime)))
            print("Checks per sec: " + str((primes[-1] - starthighestprime) / float(time.time() - starttime)))
            print("Prime rarity: " + str((len(primes) / primes[-1]) * 100) + "%")
            lastshowed = time.time()

        if time.time() - lastsaved > 300:
            os.system("cls")
            print(SaveRes())
            lastshowed = time.time() + 2
            lastsaved = time.time()

    except (KeyboardInterrupt, SystemExit):
        os.system("cls")
        print(SaveRes())
        lastsaved = time.time()
        lastshowed = time.time() + 2

exit()```
  • 2
    It seems like your problem is likely the [Global Interpreter Lock](https://stackoverflow.com/questions/1294382/what-is-the-global-interpreter-lock-gil-in-cpython) - in CPython, only one thread can execute Python bytecode at a time. So adding threads to an application like this will not help performance. – dano Jun 05 '20 at 14:37
  • 1
    Read [what-are-the-differences-between-the-threading-and-multiprocessing-modules](https://stackoverflow.com/questions/18114285) – stovfl Jun 05 '20 at 14:40
  • FYI: Your `threadfunction` function tests `currentcheck in primes` without locking the lock. That may be unrelated to your problem, but it probably is a Bad Thing. If you need to lock a data structure, then in most cases that means you need to lock it even in threads that are only just reading the structure. The problem is, data structures have... _structure_. If you allow one thread to _navigate_ that structure (e.g., to find out if `currentcheck` is in the list) while some other thread is concurrently modifying the list, the reader could follow a bad pointer and crash the program, or worse. – Ohm's Lawman Jun 05 '20 at 15:33

0 Answers0