
I am new to Azure Machine Learning and have been struggling with importing modules into my run script. I am using the AzureML SDK for Python. I think I somehow have to append the script location to PYTHONPATH, but have been unable to do so.

To illustrate the problem, assume I have the following project directory:

project/
   src/
      utilities.py
      test.py
   run.py
   requirements.txt

I want to run test.py on a compute instance on AzureML and I submit the run via run.py. A simple version of run.py looks as follows:

from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
from azureml.core.compute import ComputeInstance
ws = Workspace.get(...) # my credentials here
exp = Experiment(workspace=ws, name='<experiment-name>') # experiment that receives the run
env = Environment.from_pip_requirements(name='test-env', file_path='requirements.txt')
instance = ComputeInstance(ws, '<instance-name>')
config = ScriptRunConfig(source_directory='./src', script='test.py', environment=env, compute_target=instance)
run = exp.submit(config)
run.wait_for_completion()

Now, test.py imports functions from utilities.py, e.g.:

from src.utilities import test_func
test_func()

Then, when I submit a run, I get the error:

Traceback (most recent call last):
  File "src/test.py", line 13, in <module>
    from src.utilities import test_func
ModuleNotFoundError: No module named 'src.utilities'; 'src' is not a package

This looks like a standard error where the directory is not appended to the Python path. I tried two things to get rid of it:

  1. include an __init__.py file in src. This didn't work and I would also for various reasons prefer not to use __init__.py files anyways.
  2. fiddle with the environment_variables passed to AzureML, like so: env.environment_variables={'PYTHONPATH': f'./src:${{PYTHONPATH}}'}. That didn't really work either, and I assume it is simply not the correct way to extend PYTHONPATH.
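
As a side note on attempt 2, a small sketch of what that f-string actually produces may help. The doubled braces are escapes, so Python passes the text `${PYTHONPATH}` through literally. Whether AzureML later expands that placeholder on the compute target is not confirmed anywhere in this thread, so treat that part as an open assumption. The helper `prepend_path` is hypothetical and only illustrates building the value explicitly.

```python
import os

# Doubled braces in an f-string are escapes, so nothing is interpolated here.
literal = f'./src:${{PYTHONPATH}}'
print(literal)  # ./src:${PYTHONPATH}


def prepend_path(entry, current, sep=os.pathsep):
    """Return `entry` placed in front of `current`, skipping an empty `current`."""
    return f'{entry}{sep}{current}' if current else entry


# Resolved on the submitting machine; the value on the remote target may differ.
resolved = prepend_path('./src', os.environ.get('PYTHONPATH', ''))
print(resolved)
```

Note that a relative entry such as `./src` only resolves correctly if the working directory of the remote run is the one you expect.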

I would greatly appreciate any suggestions on extending PYTHONPATH or any other ways to import modules when running a script in AzureML.

1 Answer


The source directory set in ScriptRunConfig is automatically added to the PYTHONPATH, so you need to remove the "src" prefix from the import line:

from utilities import test_func

Hope that helps

  • Yes, that is one solution. However, because I am also using the module locally in a variety of ways, I'd prefer to stay with absolute imports using src.utilities. I solved my problem in the meantime by prepending the parent of the src directory to the Python path using sys via `sys.path.insert(0,os.path.split(os.path.dirname(os.path.realpath(__file__)))[0])`. It works, although I would love to know if there is a more elegant way. – user5211657 May 19 '22 at 19:18