
We are trying to share data between two threads/processes, but cannot get it to work. We are looking for an easy (and elegant) way.

This is our current code. Goal: after the second thread/process is done, the listHolder in instance B must contain 2 items.

from multiprocessing import Process

class A:
   def __init__(self):
      self.name = "MyNameIsBlah"

class B:
   def __init__(self):
      # Contains a list of A objects. Is empty at first.
      self.listHolder = []

   def add(self, obj):
      self.listHolder.append(obj)

   def remove(self, obj):
      self.listHolder.remove(obj)

def process(list):
    # Create our second instance of A in the child process/thread.
    secondItem = A()
    # Append it to the list, so that we can access it outside the process/thread.
    list.append(secondItem)

# Create new instance of B which is the manager. Our listHolder is empty here. 
manager = B()

# Create new instance of A which is our first item
firstItem = A()

# Add our first item to the manager. Our listHolder now contains one item.
manager.add(firstItem)

# Start a new separate process.
p = Process(target=process, args=(manager.listHolder,))

# Now start the process
p.start()

# We now want to access our second item here from the listHolder, which was created in the separate process/thread.

print len(manager.listHolder)  # prints 1
print manager.listHolder[1]    # IndexError
  • Expected output: 2 A instances in listHolder.
  • Got output: 1 A instance in listHolder.

How can we access our objects in the manager from a separate process/thread, so that two functions can run simultaneously without blocking the main thread?

Currently we are trying to accomplish this with processes, but if threads can accomplish this goal in an easier way, that is not a problem. Python 2.7 is used.

Update 1:

@James Mills replied suggesting ".join()". However, this blocks the main thread until the second process is done. I tried it, but the process used in this example never stops execution (`while True`): it acts as a timer, which must be able to iterate over a list and remove objects from the list.

Does anyone have a suggestion for how to accomplish this and fix the current cPickle error?
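For reference, a minimal, self-contained sketch of the `multiprocessing.Manager` approach suggested in the comments (not our real code; the worker here finishes on its own, and the names are simplified):

```python
from multiprocessing import Process, Manager

class A(object):
    def __init__(self):
        self.name = "MyNameIsBlah"

def process(shared_list):
    # Runs in the child process; the Manager proxy forwards the
    # append back to the parent process.
    shared_list.append(A())

def demo():
    manager = Manager()
    holder = manager.list()   # shared, proxy-backed list
    holder.append(A())        # first item, added in the parent
    p = Process(target=process, args=(holder,))
    p.start()
    p.join()                  # wait so the child's append is visible
    return len(holder)

if __name__ == '__main__':
    print(demo())  # 2
```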

Martijn Nosyncerror
  • What about using [Queues](https://docs.python.org/2/library/queue.html)? – Laur Ivan May 23 '14 at 14:17
  • Unrelated: I'd suggest using new-style classes by inheriting from `object`, e.g., `A(object)`, and not using reserved keywords, like `list`, as variable names. – Midnighter May 23 '14 at 14:21
  • You can use the ``multiprocessing.Manager`` here to manage data and attributes across processes implicitly, which in turn uses Queues. – James Mills May 23 '14 at 14:21
  • @Midnighter Not actual code, just a mockup, but thanks for the tip. – Martijn Nosyncerror May 23 '14 at 14:25
  • Trying the suggestions out, will post results. – Martijn Nosyncerror May 23 '14 at 14:26
  • Note that there's a subtle bug in your program that will cause issues if you use `multiprocessing.Manager`. Right now, it's possible that your main module will complete prior to your subprocess calling `list.append(secondItem)`. So you may get an `IndexError` trying to access `manager.listHolder[1]`, because your subprocess hasn't appended to `listHolder` yet. You can fix it by adding a call to `p.join()` after `p.start()`. – dano May 23 '14 at 14:39
  • @MartijnNosyncerror Be aware, you might run into race condition problems. After you call `p.start()`, your next line is checking `listHolder`'s length. But it might happen that you do this check sooner than the started process manages to complete the task running in the other process. – Jan Vlcinsky May 23 '14 at 14:39

2 Answers


If James Mills' answer doesn't work for you, here's a writeup of how to use queues to explicitly send data back and forth between a worker process and the main process:

#!/usr/bin/env python

import logging, multiprocessing, sys


def myproc(arg):
    return arg*2

def worker(inqueue, outqueue):
    logger = multiprocessing.get_logger()
    logger.info('start')
    while True:
        job = inqueue.get()
        if job is None:  # sentinel: no more work, tell the consumer and exit
            outqueue.put(None)
            break
        logger.info('got %s', job)
        outqueue.put( myproc(job) )

def beancounter(inqueue):
    while True:
        result = inqueue.get()
        if result is None:  # sentinel from the worker: we're done
            break
        print 'done:', result

def main():
    logger = multiprocessing.log_to_stderr(
            level=logging.INFO,
    )
    logger.info('setup')

    data_queue = multiprocessing.Queue()
    out_queue = multiprocessing.Queue()

    for num in range(5):
        data_queue.put(num)
    data_queue.put(None)  # sentinel, so the joins below can finish

    worker_p = multiprocessing.Process(
        target=worker, args=(data_queue, out_queue), 
        name='worker',
    )
    worker_p.start()

    bean_p = multiprocessing.Process(
        target=beancounter, args=(out_queue,),
        name='beancounter',
        )
    bean_p.start()

    worker_p.join()
    bean_p.join()
    logger.info('done')


if __name__=='__main__':
    main()

from: Django multiprocessing and empty queue after put

Another example, using a multiprocessing `Manager` to handle the data, is here:

http://johntellsall.blogspot.com/2014/05/code-multiprocessing-producerconsumer.html

johntellsall
  • The difference here is that using a ``Queue`` is more like synchronizing messages between your processes rather than synchronizing state (*some objects*). :) – James Mills May 26 '14 at 22:10

One of the simplest ways of sharing state between processes is to use the `multiprocessing.Manager` class to synchronize data between processes (which internally uses a Queue):

Example:

from multiprocessing import Process, Manager

def f(d, l):
    d[1] = '1'
    d['2'] = 2
    d[0.25] = None
    l.reverse()

if __name__ == '__main__':
    manager = Manager()

    d = manager.dict()
    l = manager.list(range(10))

    p = Process(target=f, args=(d, l))
    p.start()
    p.join()

    print d
    print l

Output:

bash-4.3$ python -i foo.py 
{0.25: None, 1: '1', '2': 2}
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> 
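Since the question's real worker never exits (`while True`), `p.join()` would block forever. One variation (a sketch with hypothetical names; `ticker` stands in for the question's timer loop) is to mark the process as a daemon and poll the shared list instead of joining:

```python
from multiprocessing import Process, Manager
import time

def ticker(shared):
    # Hypothetical stand-in for the question's timer loop; a real one
    # might run forever, which is fine for a daemon process.
    for i in range(3):
        shared.append(i)
        time.sleep(0.01)

def demo():
    manager = Manager()
    shared = manager.list()
    p = Process(target=ticker, args=(shared,))
    p.daemon = True           # terminated automatically with the parent
    p.start()
    # Poll instead of join(), so the main thread stays free to do other work.
    while len(shared) < 3:
        time.sleep(0.01)
    return list(shared)

if __name__ == '__main__':
    print(demo())  # [0, 1, 2]
```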

Note: Please be careful with the types of objects you are sharing and attaching to your Process classes, as you may end up with pickling issues. See: Python multiprocessing pickling error
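A quick way to check whether pickling is the problem is to try `pickle.dumps` directly on the object being passed (a sketch; `multiprocessing` relies on pickling when shipping arguments and Manager data between processes):

```python
import pickle

def is_picklable(obj):
    # Returns True if the object survives a round-trip through pickle.dumps.
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

print(is_picklable({'a': 1}))     # True: plain data pickles fine
print(is_picklable(lambda x: x))  # False: function objects such as lambdas
                                  # raise the kind of PicklingError seen above
```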

James Mills
  • Renamed `listHolder` to `lHolder` as suggested, because of reserved keywords. We now have changed `class B` with the following: http://pastebin.com/CP6UP5DW Which results in: `cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed` – Martijn Nosyncerror May 23 '14 at 15:03
  • I feel you should simplify your code a bit and get the simplest thing working first. Your pickling error is a separate problem. See: http://stackoverflow.com/questions/8804830/python-multiprocessing-pickling-error – James Mills May 23 '14 at 15:25