
I am running PySpark in local mode, and I need to connect to BigQuery. I have found this: https://cloud.google.com/dataproc/docs/tutorials/bigquery-connector-spark-example but it focuses on Dataproc, while my Spark is set up on a local machine.

Could someone please help me understand, at a high level and in points, what exactly I need to set up the connection and query the data into DataFrames?

Thank you

osfor

1 Answer


Posting this as a community wiki.

As per this SO post, you can connect PySpark to BigQuery without using Dataproc by running:

spark.read.format("bigquery") \
    .option("credentialsFile", "</path/to/key/file>") \
    .option("table", "<table>") \
    .load()
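A fuller local-mode sketch of the steps involved, assuming the connector jar is pulled in via `spark.jars.packages` (the connector version and the table/path names below are placeholders, not values from the question; pick a release of `spark-bigquery-with-dependencies` that matches your Spark and Scala versions):

```python
from pyspark.sql import SparkSession

# 1. Build a local-mode SparkSession and have Spark fetch the BigQuery
#    connector from Maven. The version "0.32.2" is an assumption; check
#    the spark-bigquery-connector releases for one matching your setup.
spark = (
    SparkSession.builder
    .appName("local-bigquery")
    .master("local[*]")
    .config(
        "spark.jars.packages",
        "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.32.2",
    )
    .getOrCreate()
)

# 2. Read a table into a DataFrame. credentialsFile points at a service
#    account key JSON downloaded from the GCP console; both the path and
#    the table name here are hypothetical placeholders.
df = (
    spark.read.format("bigquery")
    .option("credentialsFile", "/path/to/key.json")
    .option("table", "my-project.my_dataset.my_table")
    .load()
)

df.printSchema()
```

Alternatively, instead of downloading a jar manually, you can pass the same coordinate on the command line, e.g. `spark-submit --packages com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.32.2 job.py`, and Spark resolves it from Maven Central.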
Joevanie
  • what about the connectors? From which website would you download the jar? Could you share a specific link? – osfor Jul 19 '23 at 05:27