
I wrote a program using threads, but since it is CPU intensive, threading really isn't giving any performance boost, so I started looking into multiprocessing. All my code up till now has been written assuming that two threads can work on the same global variables, but as far as I know that isn't possible with multiprocessing.

Is there any way to do something like this without rewriting my whole code?

I just need to do some calculations and change the values of a few variables in the second process.

A simple example of what I intend to achieve:

import multiprocessing

target = list()
queue = list()

def func(a, b, target):
    temp = list()
    for i in range(a):
        for j in range(b):
            temp.append('a' + str(i) + 'b' + str(j))
    target[:] = temp  # replace target's contents in place


def process_func():
    go = True
    while go:
        target_ = queue.pop(0)  # take the next work item off the queue
        func(target_[0], target_[1], target_[2])

if __name__ == '__main__':
    p = multiprocessing.Process(target=process_func)
    p.start()

A main function will be adding lists to the queue. I can't figure out how to do something like this. My main aim is to do this in a way that results in minimal changes to my existing code.

What I thought I could do was use the queues from the multiprocessing module, but I don't understand exactly how to implement them; the tutorials I could find would have required changing my code a lot.

PythonNoob
    I think you can solve this problem by using shared memory https://docs.python.org/3/library/multiprocessing.shared_memory.html – mokumus Jul 12 '21 at 12:36
  • Thank you! I'll look into it. No idea how I missed that... – PythonNoob Jul 12 '21 at 12:38
  • Just to know if I'm understanding this properly: if I add some variable to shared memory, I can access and modify it from different processes, right? – PythonNoob Jul 12 '21 at 12:43
  • Yes, that's the purpose of the shared memory. But read/write operations are not atomic, and I don't know whether Python internally handles the safety of the operations. – mokumus Jul 12 '21 at 12:48
  • Shared memory is a really low-level mechanism; you probably don't want to use it directly. Instead, find some library that builds on top of shared memory (or some other mechanism) to provide higher-level abstractions. – Jiří Baum Jul 12 '21 at 13:09
  • `shared_memory` basically gives you a `memoryview` of a memory-mapped file in RAM in each process, so you're working with datatypes similar to `bytes` or `bytearray`. multiprocessing managers use a server process and `pickle` to communicate updates to proxy objects (proxy list, proxy dict, etc.). When you change a proxy list (append, remove, modify), those changes are sent to all other processes (see the sketch after these comments). – Aaron Jul 12 '21 at 13:18
  • If you do want something that's similar to having shared variables, probably the way to go would be some sort of [data store](https://en.wikipedia.org/wiki/Data_store); whether relational, or a key-value store, or some other kind; whether RAM-only or writing to disk. These will take care of (or at least warn you about) a whole lot of issues with multiprocessing; don't reinvent the wheel unless you have to. – Jiří Baum Jul 12 '21 at 13:20
  • I don't think there are going to be any multiprocessing problems in my code... there's only one process writing data and one accessing it. I already have some basic error handling in place. Thank you for your help! – PythonNoob Jul 12 '21 at 16:55
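
As a rough illustration of the manager-based approach described in the comments, the question's func could be adapted like this. This is a minimal sketch, not the asker's actual code; the arguments 2 and 3 are placeholder values:

import multiprocessing

def func(a, b, target):
    temp = list()
    for i in range(a):
        for j in range(b):
            temp.append('a' + str(i) + 'b' + str(j))
    target[:] = temp  # in-place changes to a proxy list are visible to other processes

if __name__ == '__main__':
    with multiprocessing.Manager() as manager:
        target = manager.list()  # a proxy list backed by the manager's server process
        p = multiprocessing.Process(target=func, args=(2, 3, target))
        p.start()
        p.join()
        print(list(target))  # ['a0b0', 'a0b1', 'a0b2', 'a1b0', 'a1b1', 'a1b2']

The appeal here is exactly what the asker wants: func barely changes, because the proxy list can be mutated like an ordinary list.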

1 Answer


https://docs.python.org/3/library/multiprocessing.html#sharing-state-between-processes

You can read much about sharing state between processes at the link above. But How to use multiprocessing queue in Python? explains multiprocessing.Queue: process A and process B are both given the same queue object as an argument, and one puts items on it while the other consumes them.
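
Applied to the code in the question, that could look roughly like the sketch below. The None shutdown sentinel and the example arguments are assumptions on my part, just one common convention:

import multiprocessing

def func(a, b):
    # Same computation as in the question, returning the result
    # instead of rebinding a global.
    return ['a' + str(i) + 'b' + str(j) for i in range(a) for j in range(b)]

def process_func(queue):
    # Worker loop: queue.get() blocks until the main process puts an item;
    # None tells the worker to stop.
    while True:
        item = queue.get()
        if item is None:
            break
        print(func(item[0], item[1]))

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=process_func, args=(queue,))
    p.start()
    queue.put((2, 3))   # the main process adds work items, as in the question
    queue.put(None)     # signal the worker to exit
    p.join()

Note that queue.get() blocks when the queue is empty, which also removes the need for the list-based queue.pop(0) busy loop in the question.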

Aswath
  • I took a look at that question, but that doesn't let variables defined in the main process be modified by some other process. – PythonNoob Jul 12 '21 at 12:46
  • Can you put it together using queues? One queue to push items to be worked on to the workers, another queue to return the results back to the coordinator? (See the sketch after these comments.) – Jiří Baum Jul 12 '21 at 13:02
  • BTW, one advantage of building it out of queues rather than shared memory is that queues can be scaled up to a network of machines, if that turns out to be needed. Shared memory will always be tied to just one box. – Jiří Baum Jul 12 '21 at 13:12
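
A minimal sketch of the two-queue layout suggested in the comments; the names tasks and results and the None sentinel are illustrative, not anything the answer prescribes:

import multiprocessing

def worker(tasks, results):
    # Pull (a, b) pairs from the task queue and push the computed
    # lists onto the result queue; None shuts the worker down.
    while True:
        item = tasks.get()
        if item is None:
            break
        a, b = item
        results.put(['a' + str(i) + 'b' + str(j)
                     for i in range(a) for j in range(b)])

if __name__ == '__main__':
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(tasks, results))
    p.start()
    tasks.put((2, 2))
    print(results.get())   # the coordinator receives ['a0b0', 'a0b1', 'a1b0', 'a1b1']
    tasks.put(None)
    p.join()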