It all began last night when I was writing a script that requires 8 or so packages, including pygame.mixer, which takes a few seconds to import on my computer.

This meant that before the script even started, I had to wait 10 or so seconds for all the imports to load. Because I want the script to be as fast as possible, could I start running the script while the imports load in the background, with something like this:

import threading


def import_modules():
    import tkinter as tk
    from pygame import mixer
    import json
    import webbrowser
    print('imports finished')

a = threading.Thread(target=import_modules)
a.start()
for i in range(10000):
    print('Getting Modules')

So my question is:

Is this considered bad practice and will it cause problems?

If so are there alternatives I could use?

Or is it OK to do this?

Xantium
  • While you can perform imports in another thread, it probably won't achieve what you want to achieve. – user2357112 Oct 11 '17 at 23:06
  • @user2357112 Why? It does seem to speed up the imports. – Xantium Oct 11 '17 at 23:07
  • What do you mean by "start running the script"? If the script needs the "8 or so packages", how can it start running before those have been imported? – user4815162342 Oct 11 '17 at 23:08
  • @user4815162342 That's the thing. I could get the packages I need initially by importing them the normal way, then get the others as the script runs, and by the time one of those packages is needed it will have already loaded. – Xantium Oct 11 '17 at 23:12

2 Answers

If you are using CPython, this might not yield as much improvement as you'd expect.

CPython has a Global Interpreter Lock ("GIL") that ensures that only one thread at a time can be executing Python bytecode.

So whenever the import thread is executing Python code, the other thread is not running. The GIL is released by a thread when it is e.g. waiting on I/O. So there will be some time savings because of that.
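To make that trade-off concrete, here is a minimal sketch of importing in a worker thread while the main thread carries on, then blocking only at the point where a module is first needed. `BackgroundImporter` is a hypothetical helper written for this illustration, not a standard-library class, and `json` and `webbrowser` merely stand in for slow packages.

```python
import importlib
import threading


class BackgroundImporter:
    """Hypothetical helper: import modules by name in a worker thread."""

    def __init__(self, names):
        self._names = list(names)
        self._modules = {}
        # daemon=True so a pending import never keeps the process alive at exit
        self._thread = threading.Thread(target=self._load, daemon=True)
        self._thread.start()

    def _load(self):
        # Runs in the worker thread; the GIL is shared with the main thread,
        # so only the I/O part of each import overlaps with other work.
        for name in self._names:
            self._modules[name] = importlib.import_module(name)

    def get(self, name):
        # Block until the worker has finished, then hand out the module.
        # Raises KeyError if the import failed or the name was never requested.
        self._thread.join()
        return self._modules[name]


loader = BackgroundImporter(["json", "webbrowser"])
# ... startup work that does not need those modules goes here ...
json = loader.get("json")
print(json.dumps({"ready": True}))
```

Storing the modules on the helper sidesteps the problem that an `import` statement inside a function only binds a local name.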

There is a difference of opinion as to whether tkinter is truly thread-safe. It is still considered wise to run the tkinter main loop in the original thread, and to not invoke tkinter calls from other threads, because that can lead to crashes.

The GIL can also cause problems for GUI programs. If you are using a second thread for a long-running calculation, the user interface might become less responsive. There are at least two possible solutions. The first is to split the long-running calculation up into small pieces, each of which is executed by an after method. The second is to run the calculation in a different process.
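The first of those two solutions could look roughly like the sketch below. `run_in_chunks` and `demo` are hypothetical names made up for this illustration; the only tkinter feature relied on is the `after(ms, callback)` method that every widget has.

```python
def run_in_chunks(widget, items, process, chunk_size=1000, on_done=None):
    """Process `items` one chunk at a time, returning to the event loop
    between chunks so the GUI stays responsive.

    `widget` is any object with a tkinter-style after(ms, callback) method.
    """
    iterator = iter(items)

    def step():
        for _ in range(chunk_size):
            try:
                item = next(iterator)
            except StopIteration:
                # Nothing left to do: report completion and stop rescheduling.
                if on_done is not None:
                    on_done()
                return
            process(item)
        # More work remains: let pending GUI events run, then continue.
        widget.after(1, step)

    widget.after(1, step)


def demo():
    """Hypothetical usage: sum a large range without freezing the window."""
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="working...")
    label.pack()
    total = [0]

    def add(n):
        total[0] += n

    run_in_chunks(root, range(1_000_000), add, chunk_size=10_000,
                  on_done=lambda: label.config(text=f"sum = {total[0]}"))
    root.mainloop()
```

Calling `demo()` opens a small window; the chunk size is a tuning knob, where smaller chunks give a smoother interface at the cost of more scheduling overhead.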


Follow-up questions from the comments:

is there anything else to speed up execution time?

The first thing you must do is measure what exactly causes the problem. Then you can look into the problem areas and try to improve them.

Take module load times, for example. Run your app under a profiler to see how long each module takes to load and why.
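As a rough starting point before reaching for a full profiler, a sketch like the following times individual imports. `time_import` is a hypothetical helper name, and the two standard-library modules are only placeholders for the packages your script really uses.

```python
import importlib
import time


def time_import(name):
    """Return the seconds spent importing the module called `name`.

    A module that is already in sys.modules returns almost instantly,
    so run this in a fresh interpreter for meaningful numbers.
    """
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start


for name in ("json", "webbrowser"):
    print(f"{name}: {time_import(name):.4f} s")
```

On Python 3.7 and later, running the script as `python -X importtime yourscript.py` prints a per-module breakdown of import times without any code changes.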

If pygame.mixer takes too long to load, you could use your platform's native mixer. UNIX-like operating systems generally have a /dev/mixer device, while ms-windows has different APIs for it. Using those definitely won't take 10 seconds. There is a cost associated with this: you will lose portability between operating systems.

What are the alternatives?

Using multiple cores is a common tactic to try and speed things up. Currently on CPython, the only general way to get code to run in parallel on multiple cores is with multiprocessing or concurrent.futures.

However, whether this tactic can work depends on the nature of your problem.

If your problem involves doing the same calculations over a huge set of data, that is relatively easy to parallelize. In that case you can expect a maximal speedup roughly equivalent to the number of cores you use.

It could be that your problem consists of multiple steps, each of which depends on the result of a previous step. Such problems are serial in nature and are much harder to execute in parallel.

Another way to possibly speed things up is to use a different Python implementation, like PyPy. Or you could use Cython together with type hints to convert performance-critical parts to compiled C code.

Roland Smith
  • Beyond the GIL, there is a [global lock protecting module imports as well](https://stackoverflow.com/q/12389526/364696), so threading imports is doubly pointless. – ShadowRanger Oct 11 '17 at 23:25
  • @ShadowRanger Is this still the case in current Python 3 versions? Somewhere between 3.1 and 3.6 the mention of import in threaded code seems to have disappeared from the threading chapter of the Python documentation. – Roland Smith Oct 11 '17 at 23:32
  • OK, this seems like a bad idea with Tkinter. What about the other (non-GUI) modules, will they be affected? What are the alternatives (if any)? Everything except the standard library calls could be `pyc` files. Is there anything else to speed up execution time? – Xantium Oct 11 '17 at 23:37
  • @RolandSmith: Ah, yes, forgot that [in 3.3 they introduced per module import locks](https://docs.python.org/3/whatsnew/3.3.html#a-finer-grained-import-lock). So just blocked by the GIL; the import locks would only be a problem if you imported the same module in multiple threads (which can be easier to do than you would think, since most imports trigger imports of dependent modules). – ShadowRanger Oct 12 '17 at 00:42
  • @RolandSmith I'm sorry, I think I might have been unclear. I mean it takes about 10 seconds to load **all** 8 modules. `pygame.mixer` does take the longest though. I will get measurements for it as soon as I can. – Xantium Oct 12 '17 at 10:07
  • Right, I have measured, but my readings are strange: 27.904929006649144, 24.359547216813752, 2.2433278654516133, 2.302499898412886, 2.1275044430994545, 2.0166949328348793 and 1.8994527128036118. I have not made an error; the result certainly was 27.904929006649144 and 24.359547216813752. It suddenly goes down afterwards (as you can see). Any ideas what is causing this? – Xantium Oct 12 '17 at 21:13
  • @Simon That is probably an effect of your operating system's cache warming up. Most operating systems today use free memory as a disk cache, because disk is much slower than RAM. After a couple of tries, all your module files will be in that disk cache, where they can be accessed more quickly. :-) If you were to reboot your PC, you would probably see the same thing again. The operating system has other influences as well. The general consensus seems to be that disk operations on ms-windows are significantly slower than on e.g. Linux. – Roland Smith Oct 13 '17 at 02:41
  • OK that sounds plausible. My computer is quite old and I do run Windows. Thank you (for answering and your patience when faced with a slow learner) I will look at the answer you provided and work out a solution from there. ; ) – Xantium Oct 13 '17 at 08:34

I understand this is an old thread, but I was looking for a way to minimize the loading time of my application, and I wanted the user to see the GUI and interact with it while other modules are being imported in the background.

I read some answers suggesting lazy import techniques, which I found complicated (for me). Then I stumbled on the suggestion here to use threading to import modules in the background. I gave it a shot and found that it is a brilliant idea that fits my needs.

Below is the code for an example GUI application using PySimpleGUI. It asks the user to enter a URL and opens it in the default browser window. The only module required for that is webbrowser, so this job can be done while the other modules are loading.

I added comments to the code to explain most of its parts. I hope it helps someone. Tested on Python 3.6, Windows 10.

Please note: this is just dummy code as a showcase.

# import essentials first
import PySimpleGUI as sg
import time, threading

# module-level names that will refer to the imported modules; declaring them
# global inside importer() keeps the imports from being bound only locally
pg = None
js = None
wb = None

progress = 0  # for our progress bar

def importer():
    # we simulate time-consuming modules with time.sleep()
    global progress
    progress = 10
    start = time.time()
    global pg, js, wb
    import pygame as pg
    time.sleep(3)
    print(f'done importing pygame in {time.time()-start} seconds')
    progress = 40

    start = time.time()
    import webbrowser as wb
    time.sleep(2)
    print(f'done importing webbrowser in {time.time()-start} seconds')
    progress = 70

    start = time.time()
    import json as js
    time.sleep(10)
    print(f'done importing json in {time.time()-start} seconds')

    progress = 100
    print('imports finished')

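# start our importer in a separate thread; daemon=True lets the program
# exit right away if the window is closed before the imports finish
threading.Thread(target=importer, daemon=True).start()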

# main app 
def main():
    # window layout
    layout = [[sg.Text('Enter url:', size=(15,1)), sg.Input(default_text='https://google.com', size=(31, 1), key='url')],
            [sg.Text('Loading modules:', size=(15,1), key='status'), 
            sg.ProgressBar(max_value=100, orientation='horizontal', size=(20,10), key='progress')],
            [sg.Button('Open url', disabled=True, key='open_url'), sg.Button('joysticks', disabled=True, key='joysticks'), sg.Cancel()]]
    window = sg.Window('test application for lazy imports', layout=layout) # our window

    while True:  # main application loop
        event, values = window.Read(timeout=10)  # non blocking read from our gui
        if event in [None, 'Cancel']:
            window.Close()
            break 

        elif event == 'open_url':
            wb.open(values['url'])

        elif event == 'joysticks':
            # show the number of joysticks currently connected
            pg.init()
            n = pg.joystick.get_count()  # Get count of joysticks
            sg.Popup(f'number of joysticks currently connected to computer = {n}')

        # the "Open url" button is disabled by default and is enabled once the webbrowser module has been imported
        if wb:
            window.Element('open_url').Update(disabled= False)

        if pg:
            window.Element('joysticks').Update(disabled= False)

        # progress bar
        window.Element('progress').UpdateBar(progress)
        if progress >= 100:
            window.Element('status').Update('Loading completed', background_color='green')

main()
Mahmoud Elshahat