
I have a piece of code that constantly creates new instances of the class Car. Car keeps a list of its own instances, so whenever I want information about the current instances I can easily get it, as in the code below:

from multiprocessing import Process
import time
class Car:

    car_list = list()
    def __init__(self, id, model):
        self.id = id
        self.model = model
        Car.car_list.append(self)

    @classmethod
    def get_current_instances(cls):
        return Car.car_list


class Interface:

    def print_current_system(self):
        while True:
            print(len(Car.get_current_instances()))
            time.sleep(1)



if __name__ == "__main__":

    interface = Interface()
    model = ["Toyota", "BMW"]

    [Car(i, model[i]) for i in range(len(model))]

    print_process = Process(target=interface.print_current_system)
    print_process.start()

    Car(2345, "Tesla")
    print("from main process " + str(len(Car.get_current_instances()))) 

This code is simplified for the purpose of the question, but the problem remains the same. I am invoking the function print_current_system from a new process. This function constantly looks at all the current instances of Car and prints the number of cars.

When I start this process and then, later on, add some new instances of Car, these instances are hidden from the child process, while they are perfectly visible to the main process. I am pretty sure I need to use something like a Queue or a Pipe, but I am not sure how. This is the output of the above code:

2
from main process 3
2
2
2
2
2
bcsta
  • Memory is not shared between processes; you need to synchronize it, through a queue for example. – Netwave Nov 29 '18 at 14:06
  • @Netwave but how will that work? Do I have to put all the instances of Car in the queue? Plus, when I invoke queue.get(), that decreases the size of the queue by one, right? I am not seeing how this would work. – bcsta Nov 29 '18 at 14:13
  • Yes, instead of pushing to a list you need to push to a queue that is shared by the processes you have, so all of them can use it. – Netwave Nov 29 '18 at 14:14

2 Answers


Background: Because the CPython interpreter is guarded by the GIL (global interpreter lock), threads cannot run Python bytecode in parallel. To get real parallelism, you have to use separate processes, as you are doing in your example. Because these are different processes, with different interpreters and different namespaces, you will not be able to access normal data in one process from a different process. When you create the new process, the Python interpreter forks itself (or re-launches itself, depending on the platform) and each process gets its own copy of all Python objects, so Car.car_list is now two different objects, one in each process. So when one process adds to that list, it is adding to a different list than the one the other process is reading.
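
For instance, here is a minimal standalone sketch (not from the question, just an illustration of this point): the child process appends to its own copy of the list, and the parent never sees the change.

from multiprocessing import Process

shared = []   # each process ends up with its own copy of this list

def worker():
    shared.append("added in child")   # modifies only the child's copy
    print("child sees:", shared)      # ['added in child']

if __name__ == "__main__":
    p = Process(target=worker)
    p.start()
    p.join()
    print("parent sees:", shared)     # [] -- the child's append never reaches the parent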

Answer: your hunch was correct. You will want to use one of the data structures in the multiprocessing module. These data structures are specially written to be "thread safe" (or, more accurately, "process safe" in this context) and to marshal the shared data between the two processes behind the scenes.

Example: you could use a global queue in which the "producer" process adds items and the "consumer" process removes them and adds them to its own copy of the list.

from multiprocessing import Queue

class Car:

    global_queue = Queue()
    _car_list = [] # This member will be up-to-date in the producer
                   # process. In that process, access it directly.
                   # In the consumer process, call get_car_list instead.
                   # This can be wrapped in an interface which knows
                   # which process it is in, so the calling code does
                   # not have to keep track.

    def __init__(self, id, model):
        self.id = id
        self.model = model
        self.global_queue.put(self)
        self._car_list.append(self)

    @classmethod
    def get_car_list(cls):
        """ Get the car list for the consumer process

            Note: do not call this from the producer process
        """
        # Before returning the car list, pull all pending cars off the queue
        # while cls.global_queue.qsize() > 0:
        # qsize is not implemented on some unix systems
        while not cls.global_queue.empty():
            cls._car_list.append(cls.global_queue.get())
        return cls._car_list

Note: with the above code, you can only have one consumer of the data. If the other processes call the get_car_list method, they will remove the pending cars from the queue and the consumer process won't receive them. If you need to have multiple consumer processes, you will need to take a different approach.
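
For completeness, here is one way the class above could be wired back into the original example. This is only a sketch under a couple of assumptions: it relies on the fork start method (the default on Linux) so that the child process inherits Car.global_queue, and it starts the consumer before any Car is created so the child does not inherit cars both in _car_list and on the queue.

from multiprocessing import Process
import time

def print_current_system():
    while True:
        # Consumer side: drain pending cars from the queue, then count.
        print(len(Car.get_car_list()))
        time.sleep(1)

if __name__ == "__main__":
    print_process = Process(target=print_current_system)
    print_process.start()

    for i, model in enumerate(["Toyota", "BMW"]):
        Car(i, model)

    Car(2345, "Tesla")
    # Producer side: read the producer's own list directly.
    print("from main process " + str(len(Car._car_list)))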

JimPri
  • I only need one consumer, in its own process. However, I do occasionally need to get the list of cars from the main process as well. Maybe that is not an issue, since in that case I don't need the queue and can just get the car list from a different class method, right? – bcsta Nov 29 '18 at 14:27
  • Yes, you could get the car list from either process. I will update my example to make this clear. – JimPri Nov 29 '18 at 14:28
  • Oh and I think you meant to write Car.global_queue.put(self), right? instead of self.global_queue.put(self). – bcsta Nov 29 '18 at 14:28
  • Both are valid since global_queue is a class-level member. Using 'self' is more maintainable if the name of the class changes (similar to how I use 'cls' in the get_car_list method), but you might find using "Car._car_list" is more readable. It's a style choice. – JimPri Nov 29 '18 at 14:33
  • I tried what you suggested, but it is giving an error: return self._maxsize - self._sem._semlock._get_value() NotImplementedError. The error line is 'while cls.global_queue.qsize() > 0:'. Is that the right way of getting the queue size? https://stackoverflow.com/questions/41952413/get-length-of-manager-queue-in-pythons-multiprocessing-library – bcsta Nov 29 '18 at 14:38
  • That is indeed the correct usage. It seems you are hitting this bug, which is present on mac OS: https://github.com/vterron/lemon/issues/11 – JimPri Nov 29 '18 at 14:44
  • This is well documented here: https://docs.python.org/3/library/multiprocessing.html It looks like you can use "not global_queue.empty()" instead. – JimPri Nov 29 '18 at 14:46
  • One last question: what if I have multiple producers? It would still work, right? – bcsta Nov 29 '18 at 15:05
  • It will work from the consumer's perspective. Each producer will only have a list of the cars that it produced, not a global list. – JimPri Nov 29 '18 at 15:12

If all you want is to keep a count of how many cars you have, you can use a shared memory object such as Value.

You can achieve what you want with just a few changes to your code:

from multiprocessing import Process, Value
import time

class Car:

    car_list = list()
    car_quantity = Value('i', 0)     # Use a shared memory object here.

    def __init__(self, id, model):
        self.id = id
        self.model = model
        Car.car_list.append(self)
        Car.car_quantity.value += 1  # Update quantity

    @classmethod
    def get_current_instances(cls):
        return Car.car_list


class Interface:

    def print_current_system(self):
        while True:
            print(Car.car_quantity.value)  # Just print the value of the shared memory object (Value).
            time.sleep(1)



if __name__ == "__main__":

    interface = Interface()
    model = ["Toyota", "BMW"]

    [Car(i, model[i]) for i in range(len(model))]

    print_process = Process(target=interface.print_current_system)
    print_process.start()

    time.sleep(3)   # Added here so you can see the
                    # output changing from 2 to 3.

    Car(2345, "Tesla")
    print("from main process " + str(len(Car.get_current_instances()))) 

Output:

2
2
2
from main process 3
3
3
3
3
3
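
One caveat worth adding (an aside, not part of the original answer): Car.car_quantity.value += 1 is a read-modify-write, which is fine here because only the main process ever creates cars. If several processes created Cars concurrently, the increment should be done while holding the Value's lock, roughly like this:

from multiprocessing import Value

car_quantity = Value('i', 0)

def register_car():
    # get_lock() returns the lock that a Value carries by default,
    # making the read-modify-write atomic across processes.
    with car_quantity.get_lock():
        car_quantity.value += 1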
Raydel Miranda