
When I run the pipeline in Azure DevOps, the job named "Training the model" fails. It runs the following command:

az ml run submit-script -g azure-devops-rg-1 -w mlops-wsp-1 -e cantypedetect --ct mlstudclust2 -d conda_requirements.yml -c train_data -t ../metadata/run.json ./train_test.py

train_test.py trains an XGBoost model on a specific dataset.

Pipeline information:

The agent is ubuntu-latest.

The earlier jobs set the Python version (3.11.*), install the libraries, add the ML CLI extension, create the workspace in ML Studio, create the compute cluster, upload the data to the default datastore, and create the metadata and models folders for outputs. All of those jobs work fine; only the "Training the model" job fails.

Important: I do not use the urlencode function anywhere in train_test.py.

desertnaut
uma_8331

1 Answer

There seems to be some kind of bug in either azure-cli, azureml-sdk, or both within the ADO pipelines agent.

I set the "Use Python version" task to Python 3.8 and followed it with a Bash task that reinstalls those packages:

python --version

# remove buggy packages
sudo apt-get remove -y azure-cli
sudo apt-get remove -y azureml-sdk

# upgrade tools
python -m pip install --upgrade pip setuptools wheel

# reinstall packages
python -m pip install azure-cli
python -m pip install azureml-sdk
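
To confirm that the reinstall took effect, a check along these lines could be appended to the same Bash task. This is only a sketch: `check_tool` is a hypothetical helper name, not part of azure-cli or azureml-sdk, and it assumes `python` on the PATH is the interpreter selected by the "Use Python version" task.

```shell
# Hypothetical helper: print where a command resolves on PATH, or report that it is missing.
check_tool() {
  tool_name="$1"
  tool_path="$(command -v "$tool_name" 2>/dev/null)" || {
    echo "$tool_name: not found"
    return 1
  }
  echo "$tool_name: $tool_path"
}

# Show which python and az the later steps will pick up.
check_tool python || true
check_tool az || true

# List the pip-installed versions of the two packages, if present.
python -m pip show azure-cli azureml-sdk 2>/dev/null | grep -E '^(Name|Version):' || true
```

If `az` still resolves to a system location such as /usr/bin rather than the selected Python's bin directory, the agent's preinstalled copy is probably still the one being used.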

Then my Azure CLI step that runs my train script finally works:

az ml run submit-script # ...

Reading these issues helped me get to this workaround:

SamyIshak