I am trying to set up an ML pipeline on Azure ML using the Python SDK. I have scripted the creation of a custom environment from a Dockerfile as follows:

from azureml.core import Environment
from azureml.core.environment import ImageBuildDetails
from other_modules import workspace, env_name, dockerfile

custom_env: Environment = Environment.from_dockerfile(name=env_name, dockerfile=dockerfile)

custom_env.register(workspace=workspace)

build: ImageBuildDetails = custom_env.build(workspace=workspace)

build.wait_for_completion()

However, the ImageBuildDetails object that the build method returns invariably times out while executing the last wait_for_completion() line, ... likely due to network constraints that I cannot change.

So, how can I possibly check the build status via the SDK in a way that doesn't exclusively depend on the returned ImageBuildDetails object?

Rafael Sanchez

1 Answer

My first suggestion would be to use:

build.wait_for_completion(show_output=True)

This streams the build log, which lets you debug the build instead of assuming a network issue. Images can take quite a long time to build, and in my experience of creating environments, the cause is very likely to be a problem in your Dockerfile.
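
If the blocking call keeps timing out on your side, one option is to poll the status yourself, with your own timeout, and to tolerate dropped connections between checks. Below is a minimal sketch of that idea. Note that poll_until_done is a hypothetical helper, not part of the SDK, and the terminal state names "Succeeded" and "Failed" are an assumption you should check against what your SDK version actually reports.

```python
import time
from typing import Callable, Iterable

# Assumed terminal state names; verify them against your SDK version.
TERMINAL_STATES = ("Succeeded", "Failed")


def poll_until_done(
    get_status: Callable[[], str],
    timeout_s: float = 3600.0,
    interval_s: float = 30.0,
    terminal_states: Iterable[str] = TERMINAL_STATES,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() until it returns a terminal state or the timeout expires."""
    terminal = set(terminal_states)
    deadline = time.monotonic() + timeout_s
    last_error = None
    while True:
        try:
            status = get_status()
            if status in terminal:
                return status
        except Exception as exc:
            # A failed status request (e.g. a dropped connection) should not
            # abort the wait; remember it and try again on the next round.
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no terminal status within {timeout_s} s"
            ) from last_error
        sleep(interval_s)
```

With the build object from the question this could be called as poll_until_done(lambda: build.status), assuming the status property of ImageBuildDetails is available in your SDK version and queries the service on each access.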

A good alternative is to build your Docker image locally and, optionally, push it to the container registry associated with the workspace:

from azureml.core import Environment
myenv = Environment(name="myenv")
registered_env = myenv.register(workspace)
registered_env.build_local(workspace, useDocker=True, pushImageToWorkspaceAcr=True)

Another option, often preferable, is to create the environment object from a conda specification YAML file:

Environment.from_conda_specification(name, file_path)

https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-environments#use-conda-dependencies-or-pip-requirements-files

This returns an Environment object. Once it has been registered, you can verify that it exists in the workspace:

for name,env in ws.environments.items():
    print("Name {} \t version {}".format(name,env.version))

restored_environment = Environment.get(workspace=ws,name="myenv",version="1")

print("Attributes of restored environment")
print(restored_environment)

nferreira78