1

While browsing I just came across Dataflow SQL. Is it any different from beamSQL?

miles212

1 Answer

3

Beam SQL is a feature of Apache Beam that lets you run SQL queries directly inside your pipeline.

Beam SQL offers two SQL dialects: Beam Calcite SQL and Beam ZetaSQL. The advantage of ZetaSQL is that it's very similar to BigQuery's syntax, so it's useful in pipelines that read from or write to BigQuery.
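To make that concrete, here is a minimal sketch of a SQL query running inside a Beam pipeline with the Python SDK's `SqlTransform`. The `Order` type, the sample rows, and the query are made up for illustration. As far as I know, `SqlTransform` in Python is a cross-language transform, so it needs a Java runtime available when the pipeline actually runs.

```python
import typing

# Hypothetical query in the default (Calcite) dialect.
# PCOLLECTION is the name Beam SQL gives to a single input collection.
QUERY = """
SELECT customer, SUM(amount) AS total
FROM PCOLLECTION
GROUP BY customer
"""


class Order(typing.NamedTuple):
    """Hypothetical row type; Beam SQL needs inputs that carry a schema."""
    customer: str
    amount: float


def run(query=QUERY, dialect=None):
    # Beam is imported here so this module can be loaded without it installed.
    import apache_beam as beam
    from apache_beam.transforms.sql import SqlTransform

    # Registering RowCoder lets Beam infer a schema from the NamedTuple.
    beam.coders.registry.register_coder(Order, beam.coders.RowCoder)

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | beam.Create([
                Order("alice", 10.0),
                Order("bob", 5.0),
                Order("alice", 2.5),
            ]).with_output_types(Order)
            # dialect=None keeps the default; "zetasql" selects ZetaSQL.
            | SqlTransform(query, dialect=dialect)
            | beam.Map(print)
        )
```

Calling `run()` executes the query on the local runner, and `run(dialect="zetasql")` should switch the same pipeline to the ZetaSQL dialect.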

Dataflow SQL is a feature of Dataflow that lets you create pipelines directly from a BigQuery query. According to the documentation, it supports the ZetaSQL syntax (BigQuery's syntax).

To create a new Dataflow job from the BigQuery console, do the following:

  1. Go to the BigQuery console.
  2. Just under the Query editor, click More and then Query settings.
  3. Under the first option, select Cloud Dataflow engine.

After that, you can click Create Cloud Dataflow job, and your query will become a Dataflow job.
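If you would rather script this than click through the console, the same kind of job can, as far as I know, be submitted with the `gcloud dataflow sql query` command. Below is a rough sketch that only builds the command line. The project, dataset, table, and job names are hypothetical, and the flag names are my assumption of the CLI's interface, so check them against `gcloud dataflow sql query --help` before relying on them.

```python
import shlex


def dataflow_sql_command(query, job_name, region, dataset, table):
    """Build (but do not run) a gcloud command that submits a Dataflow SQL job.

    The flag names are assumed from the gcloud CLI; verify them with
    `gcloud dataflow sql query --help`.
    """
    return [
        "gcloud", "dataflow", "sql", "query", query,
        f"--job-name={job_name}",
        f"--region={region}",
        # Where the query results are written (a BigQuery table here).
        f"--bigquery-dataset={dataset}",
        f"--bigquery-table={table}",
    ]


# Hypothetical names, for illustration only.
cmd = dataflow_sql_command(
    query="SELECT name, score FROM bigquery.table.`my-project`.my_dataset.scores",
    job_name="my-dataflow-sql-job",
    region="us-central1",
    dataset="my_dataset",
    table="scores_copy",
)

# Print a shell-safe version to review before running it yourself.
print(shlex.join(cmd))
```

Printing the command instead of executing it lets you inspect the quoting first. Passing `cmd` to `subprocess.run` would submit the job, assuming gcloud is installed and authenticated.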

I hope it helps

rmesteves
  • What is a use case for Dataflow SQL when there is BigQuery SQL? Which would be more cost-effective? I am not exploring using BigQuery Flex Slots to save on query costs. – Anant Bhandarkar Jun 28 '20 at 16:16