When I use BigQueryInsertJobOperator with the configuration set to perform a dry run on a faulty .sql file or a hardcoded query, the task succeeds even though it should fail. With dryRun set to false in the configuration, the same error correctly fails the task. Below is the code I used for testing in Composer (Airflow):

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

default_args = {
    'depends_on_past': False,
}


dag = DAG(dag_id='bq_script_tester',
          default_args=default_args,
          schedule_interval='@once',
          start_date=datetime(2021, 1, 1),
          tags=['bq_script_caller']
          )

with dag:
    job_id = BigQueryInsertJobOperator(
        task_id="bq_validator",
        configuration={
            # Deliberately faulty query: the dry run is expected to reject it.
            "query": {
                "query": "INSERT INTO `bigquery-public-data.stackoverflow.stackoverflow_posts` values('this is cool');",
                "useLegacySql": False,
            },
            # dryRun sits at the top level of the job configuration, next to "query".
            "dryRun": True
        },
        location="US"
    )

How can a BigQuery query be validated using the dryRun option in Composer? Is there an alternative approach in Composer that achieves the same functionality? The alternative should be an operator that accepts SQL scripts containing procedures as well as simple SQL, with support for templating.

Airflow version: 2.1.4
Composer version: 1.17.7
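
For reference, the validation described above can also be sketched outside the operator, directly with the google-cloud-bigquery client. This is only a sketch under assumptions: the helper name dry_run_query is hypothetical, the client is assumed to pick up the Composer environment's default credentials, and an invalid query is expected to surface as an exception from client.query when dry_run=True. Whether this fits scripts containing procedures has not been verified here.

```python
def dry_run_query(sql, client=None, job_config=None):
    """Dry-run `sql` and return the estimated bytes processed.

    Hypothetical helper: any exception raised by `client.query` (for
    example, a rejected query) is allowed to propagate to the caller.
    """
    if client is None or job_config is None:
        # Imported lazily so the helper can be exercised with stand-ins.
        from google.cloud import bigquery

        client = client or bigquery.Client()
        job_config = job_config or bigquery.QueryJobConfig(
            dry_run=True,           # validate only, do not execute
            use_query_cache=False,  # do not let cached results hide problems
            use_legacy_sql=False,
        )

    # With dry_run=True the returned job is already finished; there is
    # nothing to wait for, so .result() is not called.
    query_job = client.query(sql, job_config=job_config)
    return query_job.total_bytes_processed
```

If this behaves as assumed, the function could be called from a PythonOperator, so that the propagated exception fails the task; the SQL text would then have to be rendered or passed in by the DAG author.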
Suga Raj
  • What are you trying to get using dryRun? What is your Composer image version? And your Composer location? Have you checked [jobStatistics2](https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#jobstatistics2)? It has some processing statistics similar to what dryRun should return. – Jose Gutierrez Paliza May 05 '22 at 18:38
  • @JoseGutierrezPaliza I edited my question to add the version details. I am trying to validate a BigQuery query from Composer before the query is actually executed in the BigQuery environment. You can safely think of this as a validation API for BigQuery. – Suga Raj May 06 '22 at 09:12
  • Can you put `"dryrun"` below `"useLegacySql"` and inside the `"query"` brackets? – Jose Gutierrez Paliza May 06 '22 at 15:33
  • @JoseGutierrezPaliza With `"dryRun"` inside the `"query"` block, an actual run is performed instead of a dry run. – Suga Raj May 09 '22 at 08:55
