15

I'm working on a Django project and trying to improve computation speed in the backend.

The task is a CPU-bound conversion process.

Here's my environment:

  • Python 3.6.1
  • Django 1.10
  • PostgreSQL 9.6

I'm stuck on the following error when I try to parallelize a computing API with Python's multiprocessing library:

  File "D:\\project\apps\converter\models\convert_manager.py", line 1, in <module>
    from apps.conversion.models import Conversion
  File "D:\\project\apps\conversion\models.py", line 5, in <module>
    class Conversion(models.Model):
  File "C:\\virtenv\lib\site-packages\django\db\models\base.py", line 105, in __new__
    app_config = apps.get_containing_app_config(module)
  File "C:\\virtenv\ib\site-packages\django\apps\registry.py", line 237, in get_containing_app_config
    self.check_apps_ready()
  File "C:\\lib\site-packages\django\apps\registry.py", line 124, in check_apps_ready
    raise AppRegistryNotReady("Apps aren't loaded yet.")

It looks like each process imports the Conversion model, which is defined like this:

from django.db import models


class Conversion(models.Model):
    conversion_name = models.CharField(max_length=63)
    conversion_user = models.CharField(max_length=31)
    conversion_description = models.TextField(blank=True)
    ...

Below is the sample function I want to parallelize. Each iteration is independent, but it reads from or inserts into the SQL database.

class ConversionJob():
    ...

    def run(self, p_list):
        list_merge_result = []
        for p in p_list:
            list_result = self.computing_api(p)
            list_merge_result.extend(list_result)

What I'm trying to do is:

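from multiprocessing import Pool


class ConversionJob():
    ...

    def run(self, p_list):
        list_merge_result = []

        # Pool's keyword argument is `processes`, not `process`
        with Pool(processes=4) as pool:
            # map() returns one result list per input; merge them as the loop version did
            for list_result in pool.map(self.computing_api, p_list):
                list_merge_result.extend(list_result)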

computing_api() tries to fetch the current conversion's info, which was completed and saved to the database before this API call, but this causes the error.

My questions are:

  • Why does importing the Conversion model cause the "Apps aren't loaded yet" error? I have googled many articles, but none actually solved my problem.
  • I can see each SpawnPoolWorker-x process being created and trying to boot Django again (why?); every worker stops at the same error.
  • The computing API will access the database, and I haven't worked out how to handle that yet (share DB connections, or create a new connection in each process?).

codebrew
  • Don't do this. Use something like Celery to run offline jobs. – Daniel Roseman Oct 24 '17 at 11:16
  • May I ask what problems arise if I use multiprocessing? I tried some simple math computations in Django and they worked, but when I call other modules, it fails and stops with this error. – codebrew Oct 24 '17 at 13:12
  • I know this is unusual for a web application (synchronous execution), but suppose I only want to install the app on a local machine (127.0.0.1) and am not going to host it as a web server. Is there any way to speed up the computing task with multiprocessing? – codebrew Oct 24 '17 at 17:12

2 Answers

25

For others who might stumble upon this in the future:

If you encounter this issue while running Python 3.8 and using the multiprocessing package, chances are that the subprocesses are being 'spawned' instead of 'forked'. Python 3.8 changed the default process start method on macOS from 'fork' to 'spawn'. This is a known issue with Django.

To get around it:

import multiprocessing as mp
mp.set_start_method('fork')
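
Note that 'fork' does not exist on Windows, so calling set_start_method('fork') there raises an error. A spawned worker starts a fresh interpreter and re-imports the main module, which is why each worker appears to boot Django again. A minimal sketch of choosing the start method defensively, assuming a hypothetical helper make_context() and a toy task square() (neither is from the original post):

```python
import multiprocessing as mp


def make_context(preferred="fork"):
    """Return a multiprocessing context using `preferred` when available.

    Falls back to 'spawn', which every supported platform provides.
    `make_context` is an illustrative helper, not part of multiprocessing.
    """
    method = preferred if preferred in mp.get_all_start_methods() else "spawn"
    return mp.get_context(method)


def square(x):
    # Toy CPU-bound task; must live at module level so workers can import it.
    return x * x


if __name__ == "__main__":
    ctx = make_context()
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))  # prints [1, 4, 9]
```

Using get_context() instead of set_start_method() keeps the choice local to this pool rather than changing the process-wide default.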
Oscar Chen
  • The default depends on the operating system. According to the docs, it is only on MacOS that the default changed with 3.8 and fork is considered unsafe there. "On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725." – Paul Prescod Apr 16 '21 at 18:16
  • What am I missing here? The doc I'm looking at says the default start method is spawn on both Windows and Mac OS: https://docs.python.org/3.8/library/multiprocessing.html – Oscar Chen Apr 17 '21 at 03:36
  • I guess there are two things. 1. There are 3 main operating systems. So the comment "default process start method is 'spawn'." only applies to 2 of the 3. 2. The default on Windows was ALWAYS spawn. So the statement "This is a change with Python 3.8 where the default process start method is changed from 'fork' to 'spawn'" only applies to 1 of the 3 operating systems. – Paul Prescod Apr 19 '21 at 16:25
8

This post solved the problem:

Django upgrading to 1.9 error “AppRegistryNotReady: Apps aren't loaded yet.”

I had found this answer before, but it didn't solve my problem at the time.

After repeated testing, I found that I have to add this code before importing another model; otherwise the child process fails to boot and raises the error.

import django
django.setup()
from another.app import models
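
One way to apply the same idea with a Pool is to run django.setup() in a worker initializer, so it executes once per child process before any task arrives. This is only a sketch under assumptions: the settings path "myproject.settings", the helper names init_worker, convert_one and run, and the query inside convert_one are all hypothetical; only the Conversion import path comes from the traceback in the question.

```python
import os
from multiprocessing import Pool


def init_worker(settings_module="myproject.settings"):
    """Runs once in every worker process before it receives tasks."""
    # Hypothetical settings path; replace it with the project's real one.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    import django

    django.setup()  # populate the app registry in this process


def convert_one(p):
    # Import models only after django.setup() has run in this process.
    from apps.conversion.models import Conversion

    # Placeholder work: the real computing step would go here.
    return [Conversion.objects.filter(conversion_name=p).count()]


def run(p_list, processes=4):
    merged = []
    with Pool(processes=processes, initializer=init_worker) as pool:
        # map() yields one result list per input item; flatten as we go.
        for result in pool.map(convert_one, p_list):
            merged.extend(result)
    return merged
```

Because each worker calls django.setup() itself, each one also opens its own database connection on first use instead of sharing the parent's.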
codebrew