How can I connect Spark to Google's BigQuery?

I imagine that one could use Spark's JDBC functionality to communicate with BigQuery.

But the only JDBC driver I found, starschema, is old.

If the answer involves JDBC, what should the url parameter look like?

From Spark Docs:

  rdd.toDF.write.format("jdbc").options(Map(
    "url" -> "jdbc:postgresql:dbserver",
    "dbtable" -> "schema.tablename"
  )).save()
BAR
  • have you checked Simba? just asking – Mikhail Berlyant Oct 03 '15 at 00:34
  • @MikhailBerlyant from their website: "Simba’s Apache Hive Drivers efficiently transform an application’s SQL query into the equivalent form in HiveQL." Doesn't this replace what Spark already does? – BAR Oct 03 '15 at 02:06

1 Answer

You can use the BigQuery connector for Hadoop (which also works for Spark): https://cloud.google.com/hadoop/bigquery-connector

If you use Google Cloud Dataproc (https://cloud.google.com/dataproc/) to deploy your Spark cluster, the BigQuery connector (as well as the GCS connector) is deployed and configured for you automatically.

You can also add the connector to an existing Spark deployment, whether it runs on Google Cloud or anywhere else. If your cluster is not deployed on Google Cloud, you'll have to configure the authentication yourself (using service-account "keyfile" authentication).

[Added] The answer to this other question (Dataproc + BigQuery examples - any available?) provides an example of using BigQuery from Spark.
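
To give a rough idea of what this looks like without JDBC, here is a minimal sketch of reading a BigQuery table into an RDD through the connector's Hadoop InputFormat. There is no url parameter; the connector is configured through the Hadoop configuration instead. It assumes the connector jar is on the classpath and credentials are already set up (as on Dataproc). The project ID and bucket name are hypothetical placeholders, and the public sample table is only an example input.

```scala
import com.google.cloud.hadoop.io.bigquery.{BigQueryConfiguration, GsonBigQueryInputFormat}
import com.google.gson.JsonObject
import org.apache.hadoop.io.LongWritable
import org.apache.spark.{SparkConf, SparkContext}

object BigQueryReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("bigquery-read-sketch"))

    // Hypothetical placeholders: replace with your own project and bucket.
    val projectId  = "my-project-id"
    val tempBucket = "my-temp-bucket" // GCS bucket the connector uses for temporary export data
    val inputTable = "publicdata:samples.shakespeare" // example table, project:dataset.table

    // The connector is driven by Hadoop configuration keys, not a JDBC url.
    val conf = sc.hadoopConfiguration
    conf.set(BigQueryConfiguration.PROJECT_ID_KEY, projectId)
    conf.set(BigQueryConfiguration.GCS_BUCKET_KEY, tempBucket)
    BigQueryConfiguration.configureBigQueryInput(conf, inputTable)

    // Each record is a (row offset, JSON row) pair.
    val rows = sc.newAPIHadoopRDD(
      conf,
      classOf[GsonBigQueryInputFormat],
      classOf[LongWritable],
      classOf[JsonObject])

    // Convert rows to strings on the executors before collecting a few to the driver.
    rows.map { case (_, row) => row.toString }.take(5).foreach(println)

    sc.stop()
  }
}
```

The rows are mapped to plain strings before `take` so that only serializable values are sent back to the driver; from there they could be parsed into a DataFrame if needed.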

  • What should the url be? Are there any other relevant Spark config options? The Hadoop BigQuery connector shows only Hadoop related examples - nothing of Spark. – BAR Oct 03 '15 at 21:36
  • @William Vambenepe - have you got an example of Spark<->BigQuery using Dataproc by any chance? – Graham Polley Oct 06 '15 at 09:50
  • I've updated my answer to provide a link to another question which has been answered with a code example of using BigQuery from Spark. – William Vambenepe Oct 10 '15 at 05:06