
I have a class that loads all the resources my application needs (mostly images) into memory.

Several threads then need to access these resources through this class. I don't want every instance to reload all the resources, so I thought I would use the Singleton pattern. I did it like this:

import threading

class DataContainer(object):
    _instance = None
    _lock = threading.Lock()
    _initialised = True

    def __new__(cls, *args, **kwargs):
        with cls._lock:
            if not cls._instance:
                cls._initialised = False
                # object.__new__ takes no extra arguments, so pass only cls
                cls._instance = object.__new__(cls)
        return cls._instance

    def __init__(self, map_name = None):

        # instance has already been created
        if self._initialised:
            return

        self._initialised = True

        # load images

This works fine as long as I am not using multiple threads. With multiple threads, however, every thread gets a different instance: with 4 threads, each one creates its own. I want all threads to use the same instance of this class, so the resources are loaded into memory only once.

I also tried to do this in the same module where the class is defined, but outside the class definition:

def getDataContainer():
    global dataContainer
    return dataContainer

dataContainer = DataContainer()

but every thread still has its own instance.

I am new to Python. If this is the wrong approach, please let me know. I appreciate any help.

Kallz
user2078645
  • Your use of a Singleton seems appropriate. Please post the rest of the code of this class. I cannot see anything wrong with it so far. – Javier Feb 23 '14 at 20:30
  • For multithreaded code you probably want to pass the same instance of the object to each thread. Failing that, you might need to register your instance globally. – Will Feb 23 '14 at 20:56
  • @Will That is exactly what I am trying to do ;) Can you point me in the right direction on how to do that? – user2078645 Feb 24 '14 at 20:29
  • Pass it in the constructor of each thread class, but make sure to "lock" anything that might change. See the Python threading module for details. It's a bit too complicated to fit into one comment. – Will Feb 25 '14 at 13:41
  • Try doing all initialization in `__new__` while holding the lock. – Janne Karila Feb 26 '14 at 07:57
  • Thanks for the advice, I will test it as soon as I can and report back – user2078645 Feb 27 '14 at 18:38
  • @JanneKarila I tried your approach. Doing the initialization in the `__new__` method doesn't seem to do the trick. No matter whether I do dc = DataContainer() or use getDataContainer(), every thread still has its own instance with a unique ID. – user2078645 Mar 02 '14 at 19:24
  • @Will I am using Parallel Python, so I don't create threads myself, but pass the tasks to threads like job_server.submit( ... ). I can't find a way to pass an object through the submit method. – user2078645 Mar 02 '14 at 19:25
  • So you are actually using multiple processes not threads. Please clarify the question. – Janne Karila Mar 02 '14 at 19:29
  • @JanneKarila, parallel python is more like celery. It handles the thread pools and just gives you an API to make remote batch calls. – Will Mar 03 '14 at 13:40
  • @user2078645 surely you pass your object as one of the args to the remote function? – Will Mar 03 '14 at 13:41
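
The suggestion in the comments to do all initialization in `__new__` while holding the lock could be sketched roughly as follows. This is only an illustration: the `images` attribute and the `collect_ids` helper are made-up names, and the "loading" is a placeholder. It only guarantees one instance per process; separate processes (as with Parallel Python) would each still build their own.

```python
import threading


class DataContainer(object):
    _instance = None
    _lock = threading.Lock()

    def __new__(cls, map_name=None):
        # Creation and initialization both happen while the lock is held,
        # so no thread can ever see a half-initialized instance.
        with cls._lock:
            if cls._instance is None:
                instance = super(DataContainer, cls).__new__(cls)
                instance.map_name = map_name
                # Placeholder for the real resource loading (hypothetical).
                instance.images = {"example": "beer.jpg"}
                cls._instance = instance
        return cls._instance


def collect_ids(n_threads=4):
    """Hypothetical helper: each thread records id(DataContainer())."""
    ids = []

    def work():
        ids.append(id(DataContainer()))

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids
```

If the threads really do live in one process, `collect_ids()` should return the same id for every thread.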

1 Answer

To expand on @Will's comment, if a "shared object" is created by the parent, then passed in to each thread, all threads will share the same object.

(With processes, see multiprocessing.Manager, which directly supports sharing state between processes, including modifications to it.)

import threading
import time


class SharedObj(object):
    image = 'beer.jpg'


class DoWork(threading.Thread):
    def __init__(self, shared, *args, **kwargs):
        super(DoWork, self).__init__(*args, **kwargs)
        # every thread keeps a reference to the same object
        self.shared = shared

    def run(self):
        print(threading.current_thread(), 'start')
        time.sleep(1)
        print('shared', self.shared.image, id(self.shared))
        print(threading.current_thread(), 'done')


# created once by the parent, then passed in to each thread
myshared = SharedObj()
threads = [
    DoWork(shared=myshared, name='a'),
    DoWork(shared=myshared, name='b'),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('DONE')

Output (the two threads print at the same time, so some of their lines are interleaved):

<DoWork(a, started 140381090318080)> start
<DoWork(b, started 140381006067456)> start
shared beer.jpg shared140381110335440
 <DoWork(b, started 140381006067456)> done
beer.jpg 140381110335440
<DoWork(a, started 140381090318080)> done
DONE

Note that the thread IDs are different, but they both use the same SharedObj instance, at memory address ending in 440.
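
For the multi-process case, a minimal sketch using a manager might look like the following. The names `worker` and `run_demo` and the dictionary contents are invented for illustration. The sketch assumes the POSIX-only "fork" start method so that it runs without an `if __name__ == "__main__":` guard; with "spawn", the process-starting code would need to sit under such a guard. Note that the data lives in the manager's server process and every access goes through a proxy, so this shares state but is not the same as shared memory.

```python
import multiprocessing as mp

# Assumption: "fork" start method (POSIX only), chosen to keep the sketch short.
ctx = mp.get_context("fork")


def worker(resources, results, name):
    # Read the shared dict through its proxy and report what was seen.
    results[name] = resources["image"]


def run_demo():
    # The manager starts a server process that owns the shared dicts.
    with ctx.Manager() as manager:
        resources = manager.dict({"image": "beer.jpg"})
        results = manager.dict()
        procs = [
            ctx.Process(target=worker, args=(resources, results, name))
            for name in ("a", "b")
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy into a plain dict before the manager shuts down.
        return results.copy()
```

Calling `run_demo()` returns a plain dict mapping each worker name to the image name it read from the shared dict.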

johntellsall