
I am trying to write a Python script that downloads and unzips hundreds of files from an AWS server. As I understand it, both steps are I/O-bound, so I would like to use multithreading to cut the processing time.
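A minimal sketch of that plan, assuming each archive is a plain .zip file reachable through an ordinary URL that needs no AWS credentials. The names `download_and_unzip` and `download_all` and the default of 8 workers are hypothetical, chosen only for illustration:

```python
import io
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen


def download_and_unzip(url, dest_dir):
    """Download one zip archive into memory and extract it under dest_dir."""
    with urlopen(url) as response:
        payload = response.read()
    target = Path(dest_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        archive.extractall(target)
        return archive.namelist()


def download_all(urls, dest_dir, max_workers=8):
    """Process every URL on a thread pool; results keep the order of urls."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Threads overlap the time spent waiting on the network.
        return list(executor.map(lambda url: download_and_unzip(url, dest_dir), urls))
```

Calling `download_all(urls, "extracted")` with a list of archive URLs returns one list of extracted member names per archive. Reading each archive fully into memory is a simplification that suits small files; very large archives would probably be better streamed to disk first.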

Since I am new to Python, I've been reading guides like this one and that one on multithreading and multiprocessing.

Both of the above links suggest code to import methods from the subprocess library, but I am running into trouble completing these imports. The second link above suggests the following code to illustrate multithreading:

from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed pool with the same API

from urllib.request import urlopen

def run_tasks(function, args, pool, chunk_size=None):
    results = pool.map(function, args, chunk_size)
    return results

def work(n):
    # f-string so that {n} is actually substituted into the URL
    with urlopen(f"https://www.google.com/#{n}") as f:
        contents = f.read(32)
    return contents

if __name__ == '__main__':
    numbers = list(range(1, 100))

    # Run the task using a thread pool
    t_p = ThreadPool()
    result = run_tasks(work, numbers, t_p)
    print(result)

    # Stop accepting work, then wait for the worker threads to exit
    t_p.close()
    t_p.join()

When I tried running this script, I got the following error with traceback:

PS C:\Users\USERNAME> & "C:/Users/USERNAME/AppData/Local/Continuum/anaconda3/python.exe" "h:/Post-Processing/API Query/Python Test/subprocess_test/subprocess.py"
Traceback (most recent call last):
  File "h:/Post-Processing/API Query/Python Test/subprocess_test/subprocess.py", line 38, in <module>
    t_p = ThreadPool()
  File "C:\Users\USERNAME\AppData\Local\Continuum\anaconda3\lib\multiprocessing\dummy\__init__.py", line 123, in Pool
    from ..pool import ThreadPool
  File "C:\Users\USERNAME\AppData\Local\Continuum\anaconda3\lib\multiprocessing\pool.py", line 26, in <module>
    from . import util
  File "C:\Users\USERNAME\AppData\Local\Continuum\anaconda3\lib\multiprocessing\util.py", line 17, in <module>
    from subprocess import _args_from_interpreter_flags
ImportError: cannot import name '_args_from_interpreter_flags' from 'subprocess' (h:\PSO Post-Processing\API Query\Python Test\subprocess_test\subprocess.py)

I found this SO thread, in which the answer suggests adding

from subprocess import _args_from_interpreter_flags

to the list of imports. However, when I added this line, the import error simply moved into my own script:

Traceback (most recent call last):
  File "h:/Post-Processing/API Query/Python Test/subprocess_test/subprocess.py", line 20, in <module>
    from subprocess import _args_from_interpreter_flags
  File "h:\Post-Processing\API Query\Python Test\subprocess_test\subprocess.py", line 20, in <module>
    from subprocess import _args_from_interpreter_flags
ImportError: cannot import name '_args_from_interpreter_flags' from 'subprocess' (h:\PSO Post-Processing\API Query\Python Test\subprocess_test\subprocess.py)

I now suspect that something is wrong with my Python installation, but I am not sure how to troubleshoot it.

I am running Windows 10 on a work computer and using Visual Studio Code as my editor. According to Visual Studio Code, I'm running Python 3.7.6 64-bit ('Continuum': virtualenv). I found that I have subprocess.py installed at

"C:\Users\USER\AppData\Local\Continuum\anaconda3\Lib\subprocess.py"

and this subprocess.py file indeed has a segment with

def _args_from_interpreter_flags():
    """Return a list of command-line arguments reproducing the current
    settings in sys.flags, sys.warnoptions and sys._xoptions."""
    flag_opt_map = {
        'debug': 'd',
        # 'inspect': 'i',
        # 'interactive': 'i',
        'dont_write_bytecode': 'B',
        'no_site': 'S',
        'verbose': 'v',
        'bytes_warning': 'b',
        'quiet': 'q',
        # -O is handled in _optim_args_from_interpreter_flags()
    }
    args = _optim_args_from_interpreter_flags()
    for flag, opt in flag_opt_map.items():
        v = getattr(sys.flags, flag)
        if v > 0:
            args.append('-' + opt * v)

    if sys.flags.isolated:
        args.append('-I')
    else:
        if sys.flags.ignore_environment:
            args.append('-E')
        if sys.flags.no_user_site:
            args.append('-s')

    # -W options
    warnopts = sys.warnoptions[:]
    bytes_warning = sys.flags.bytes_warning
    xoptions = getattr(sys, '_xoptions', {})
    dev_mode = ('dev' in xoptions)

    if bytes_warning > 1:
        warnopts.remove("error::BytesWarning")
    elif bytes_warning:
        warnopts.remove("default::BytesWarning")
    if dev_mode:
        warnopts.remove('default')
    for opt in warnopts:
        args.append('-W' + opt)

    # -X options
    if dev_mode:
        args.extend(('-X', 'dev'))
    for opt in ('faulthandler', 'tracemalloc', 'importtime',
                'showalloccount', 'showrefcount', 'utf8'):
        if opt in xoptions:
            value = xoptions[opt]
            if value is True:
                arg = opt
            else:
                arg = '%s=%s' % (opt, value)
            args.extend(('-X', arg))

    return args

Given all this information, I am sure that I'm missing a simple detail that's stopping the threading code from working. I appreciate any help you can give.

Thank you!!

ODK
  • It looks like you have a file named `subprocess.py` in the current directory, and that one is being imported instead of the "real" one. – John Gordon Dec 06 '20 at 01:42
  • Oh, thank you! I named the script I was running subprocess.py without realizing that the name itself was causing these errors! Now the script works. Thank you, John Gordon! For my own knowledge: when I `import` something, does Python look in the script's directory first, and only then in the library folders if that search fails? – ODK Dec 06 '20 at 03:47
  • Yes, the script's own directory is always first in the search path. – John Gordon Dec 06 '20 at 04:54
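
The diagnosis in the comments above can be checked mechanically. The following is a rough sketch, not an official tool: `find_shadowing_scripts` is a hypothetical helper that lists the .py files in a directory whose names collide with a module in the standard-library folder. It assumes a regular CPython layout and does not detect clashes with built-in modules such as `sys`, which have no file on disk.

```python
import sysconfig
from pathlib import Path


def find_shadowing_scripts(directory):
    """List .py files in directory whose names match a standard-library module."""
    stdlib_dir = Path(sysconfig.get_paths()["stdlib"])
    shadowing = []
    for script in sorted(Path(directory).glob("*.py")):
        name = script.stem
        # A stdlib module is either a single file or a package directory.
        is_module = (stdlib_dir / f"{name}.py").is_file()
        is_package = (stdlib_dir / name / "__init__.py").is_file()
        if is_module or is_package:
            shadowing.append(script.name)
    return shadowing
```

Run against the folder from the question, this would be expected to report `subprocess.py`. For a single module, printing `subprocess.__file__` after the import shows which file Python actually loaded.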
