
I was trying to help someone with their homework. The goal is to multiprocess this program and optimize it. Because the functions work with data that is constantly being modified in real time, he thought making the variables global would help, but that doesn't carry over to multiprocessing because of how multiprocessing works (each process gets its own copy of the globals).

The issue, though, is that I don't know much about shared-memory multiprocessing. I know how `Pool`, `map`, and so on create processes, but making those processes update and rely on the same information concurrently is trickier for me. I already know what I need to do and which optimizations I am going to write for the code; I just want to know how to go about fixing this specific problem, plus any suggestions therein.
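To show concretely why the global-variable approach fails (this is a minimal made-up demo, not the homework code): each worker process gets its own copy of the module's globals, so a write in a child never reaches the parent or its siblings.

```python
import multiprocessing as mp

counter = 0  # module-level global

def bump():
    # Runs in a child process; this rebinds only the child's own copy.
    global counter
    counter += 1

if __name__ == "__main__":
    p = mp.Process(target=bump)
    p.start()
    p.join()
    # The parent's global is untouched: processes do not share memory.
    print(counter)  # prints 0, not 1
```

This is exactly the behavior described above: the child really did increment its `counter`, but that increment happened in a separate address space.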

Edit for some clarification and tldr (also cut down the code a bit):

Summary of code: the algorithm works as follows:

  1. take a number n to factorize
  2. iterate through a list of prime numbers up to the sqrt of n
  3. for each prime, check whether it divides n
  4. if it does, count how many times and add that prime to a list of factors once per division
  5. save the total result of the division, then reset the range of primes to iterate through to be only up to the sqrt of that result

Step 5 is the hard part: how do I make it so that this information is updated across all the processes accurately?
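One standard way to make a step-5 update visible to every worker (a sketch with made-up names and numbers, not the actual code) is to keep the moving target in a `multiprocessing.Value`, which lives in shared memory, and guard writes with a `Lock`:

```python
import math
import multiprocessing as mp

def record_division(shared_target, lock, new_divisor):
    # Worker-side update: every process sees the same underlying Value.
    with lock:
        shared_target.value = math.sqrt(new_divisor)

if __name__ == "__main__":
    lock = mp.Lock()
    # 'd' = C double; this object is allocated in shared memory.
    upto_target = mp.Value('d', math.sqrt(1_000_000))   # starts at 1000.0
    p = mp.Process(target=record_division, args=(upto_target, lock, 10_000))
    p.start()
    p.join()
    print(upto_target.value)  # 100.0 -- the child's update is visible here
```

The same pattern would apply to `last_divisor` and `full_number_target`; the lock matters because several workers may find a divisor at nearly the same time.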

# PRIME FACTORIZATION
import time
import numpy as np
import math


def in_time_target_updater(new_divisor):
    global last_divisor
    global new_upto_target
    global full_number_target

    full_number_target = new_divisor
    last_divisor = new_divisor
    new_upto_target = math.sqrt(new_divisor)

def exponent_finder(num, input_target):
    """Finds how many times a prime divides a number. Each time the prime
    divides evenly, it appends the prime to the factors list. Once done, it
    returns the last divisor so the targets can be updated.
    """
    global factor_list
    other_divisor = input_target
    for i in range(max_range):
        local_target = input_target // (num ** (i + 1))
        if other_divisor % num == 0:
            # num still divides evenly: record it and keep dividing
            factor_list = np.append(factor_list, num)
            other_divisor = local_target
            continue
        else:
            # num no longer divides evenly: return what is left of the number
            return other_divisor

def phase1_list_iterator(list_or_queue):
    global found
    for value in list_or_queue:
        just_in_time_iteration(value)
        if found:
            break

def just_in_time_iteration(value):
    global found
    global factor_list

    if value > new_upto_target:
        # passed the sqrt of the remaining target: what is left must be prime
        found = True
        factor_list = np.append(factor_list, last_divisor)
        return found
    else:
        if full_number_target % value == 0:
            remaining_divisor = exponent_finder(value, full_number_target)
            in_time_target_updater(remaining_divisor)

def phase1():
    global factor_list
    phase1_list_iterator(prime_array)
    if found is True:
        if factor_list[-1] == 1:
            factor_list = factor_list[:-1]
        if factor_list[0] == 0:
            factor_list = np.array([num_to_factorize])
        return
    else:
        phase2()  # phase2 omitted from this cut-down snippet

def main_single_thread_comparative(input_):
    global num_to_factorize
    global max_range
    global og_target
    global new_upto_target
    global full_number_target
    global last_divisor
    global factor_list
    global prime_array
    global found
    global step_constant

    num_to_factorize = input_
    max_range = 200
    og_target = math.sqrt(num_to_factorize)
    new_upto_target = og_target
    full_number_target = num_to_factorize
    last_divisor = 0
    factor_list = np.empty(0, dtype=int)
    prime_array = np.load('prime_10000.npy')
    found = False
    step_constant = 2


    phase1()
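For the embarrassingly parallel part (step 3, testing many primes against the current target), a `Pool` can fan out the divisibility checks while each worker reads only a snapshot of the target. This is a sketch with made-up numbers, not the homework solution:

```python
import multiprocessing as mp

def divides(args):
    # Read-only check of one candidate prime against a snapshot of the target.
    target, prime = args
    return prime if target % prime == 0 else None

if __name__ == "__main__":
    target = 360  # 2**3 * 3**2 * 5
    primes = [2, 3, 5, 7, 11, 13, 17]
    with mp.Pool(4) as pool:
        results = pool.map(divides, [(target, p) for p in primes])
    hits = [p for p in results if p is not None]
    print(hits)  # [2, 3, 5]
```

The hard part remains step 5: once any worker finds a divisor, the other outstanding checks are against a stale target, which is why the shared target (or re-dispatching the remaining primes after each hit) is needed.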
asked by Ari Frid
  • Can you be clearer about what you are asking? This is a lot of lines of code; it would be good to post a snippet. – Luiz Fernando Lobo Jun 08 '22 at 21:58
  • I am not experienced enough to be sure, but apparently you can't use multiprocessing with shared variables; use multi-threading instead. [I tried the same and came to that conclusion.] [Similar ask](https://stackoverflow.com/a/38322893/13636459) – Alfredo Maussa Jun 08 '22 at 23:58
  • So the issue is the algorithm works as follows: 1. take a number n to factorize 2. iterate through a list of prime numbers up to the sqrt of n 3. for each prime, check whether it divides n 4. if it does, count how many times and add that prime to a list of factors 5. save the total result of the division, reset the range of primes to iterate through to be only up to the sqrt of that result. Step 5 is the hard part: how do I make it so that this information is updated across all the processes accurately? @LuizFernandoLobo – Ari Frid Jun 09 '22 at 01:58
  • The thing with multi-threading is that, from what I have seen in Python, it doesn't actually run on different threads/cores; it just takes advantage of idle time. Because this is designed to handle large numbers (i.e. 10**10 and up), it can get computationally intensive, and I'm worried that the help from threading will be limited, especially since I am helping my friend and he's getting graded on performance. But I stand to be corrected. @AlfredoMaussa – Ari Frid Jun 09 '22 at 02:04
  • Python famously has a "global interpreter lock" that makes it both easy to implement correct multi-threaded apps and hard to get much performance improvement. See for example https://realpython.com/python-gil/ – Gene Jun 09 '22 at 02:23
  • That's exactly why I want to do this with Multiprocessing and not threading, just don't know how because of the above. @Gene – Ari Frid Jun 09 '22 at 02:28
  • Then you'll need to run one Python shell per desired process and have them share values via https://docs.python.org/3/library/multiprocessing.shared_memory.html . The code above is not close to a solution. This is a very expensive way to get what could and should be done with threads in a language that supports them better. – Gene Jun 09 '22 at 03:40

0 Answers