
I'm using astro-cli to run Airflow. My project folder structure (astro-cli project root) is as follows:

dags(dir)
include(dir)
test.yaml
airflow_settings.yaml
Dockerfile

When I run this command from my Airflow DAG

cmd_dataprep = 'az ml job create -f test.yaml --resource-group DefaultResourceGroup-EUS2 --workspace-name test-airflow'

I'm getting this error:

[2023-01-15, 14:29:57 UTC] {subprocess.py:75} INFO - Running command: ['/bin/bash', '-c', 'az ml job create -f test.yaml --resource-group DefaultResourceGroup-EUS2 --workspace-name test-airflow']
[2023-01-15, 14:29:57 UTC] {subprocess.py:86} INFO - Output:
[2023-01-15, 14:29:59 UTC] {subprocess.py:93} INFO - ERROR:
[2023-01-15, 14:29:59 UTC] {subprocess.py:93} INFO - 
[2023-01-15, 14:29:59 UTC] {subprocess.py:93} INFO - Error: The yaml file you provided does not match the prescribed schema for General yaml files and/or has the following issues:
[2023-01-15, 14:29:59 UTC] {subprocess.py:93} INFO - 
[2023-01-15, 14:29:59 UTC] {subprocess.py:93} INFO - 1) One or more files or folders do not exist.

[2023-01-15, 14:29:59 UTC] {subprocess.py:93} INFO - No such file or directory: test.yaml

What am I missing?

Edit: here is the DAG task:

task_dataprep = BashOperator(
    task_id='run_dataprep',
    bash_command=cmd_dataprep,
    dag=dag)

1 Answer


It's recommended to provide the absolute path to your files:

cmd_dataprep = 'az ml job create -f /path/to/test.yaml --resource-group DefaultResourceGroup-EUS2 --workspace-name test-airflow'

You also need to ensure that the file is accessible from all Airflow workers if you are using the Celery or Kubernetes executor.
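For example, here is a minimal sketch of the task with an absolute path, assuming test.yaml is moved into the project's include/ directory; the Astro Runtime image copies the project into /usr/local/airflow inside the container, so the container path below is an assumption based on that default layout:

# Sketch only: the /usr/local/airflow/include path assumes the default
# Astro Runtime project layout; adjust it to wherever test.yaml lands
# in your container image.
from airflow.operators.bash import BashOperator

cmd_dataprep = (
    "az ml job create -f /usr/local/airflow/include/test.yaml "
    "--resource-group DefaultResourceGroup-EUS2 --workspace-name test-airflow"
)

task_dataprep = BashOperator(
    task_id="run_dataprep",
    bash_command=cmd_dataprep,
    dag=dag,
)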

Hussein Awala
  • Adding to that: by default the BashOperator will execute its commands in a temporary directory (something like `/tmp/airflowtmp51vqpgmm`). So you might also need to set the `cwd` parameter of the BashOperator to your root directory. – TJaniF Jan 16 '23 at 21:28
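Building on that comment, a minimal sketch using the BashOperator's `cwd` parameter so a relative path resolves from the project directory instead of the operator's temporary working directory; the /usr/local/airflow path is again an assumption based on the default Astro Runtime layout:

# Sketch only: cwd points the command's working directory at the
# project root in the container (assumed to be /usr/local/airflow),
# so 'test.yaml' can be resolved relatively.
from airflow.operators.bash import BashOperator

task_dataprep = BashOperator(
    task_id="run_dataprep",
    bash_command=cmd_dataprep,
    cwd="/usr/local/airflow",  # directory that contains test.yaml in the container
    dag=dag,
)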