13

I would like to develop an app engine application that directly stream data into a BigQuery table.

According to Google's documentation there is a simple way to stream data into bigquery:

Here is the sample code snippet on how streaming insert should be coded:

body = {"rows":[
{"json": {"column_name":7.7,}}
]}

response = bigquery.tabledata().insertAll(
   projectId=PROJECT_ID,
   datasetId=DATASET_ID,
   tableId=TABLE_ID,
   body=body).execute()

Although I've downloaded the client api I didn't find any reference to a "bigquery" module/object referenced in the above Google's example.

Where is the the bigquery object (from snippet) should be located?

Can anyone show a more complete way to use this snippet (with the right imports)?

I've Been searching for that a lot and found documentation confusing and partial.

James
  • 153
  • 1
  • 7

2 Answers2

11

Minimal working (as long as you fill in the right ids for your project) example:

import httplib2
from apiclient import discovery
from oauth2client import appengine

_SCOPE = 'https://www.googleapis.com/auth/bigquery'

# Change the following 3 values:
PROJECT_ID = 'your_project'
DATASET_ID = 'your_dataset'
TABLE_ID = 'TestTable'


body = {"rows":[
    {"json": {"Col1":7,}}
]}

credentials = appengine.AppAssertionCredentials(scope=_SCOPE)
http = credentials.authorize(httplib2.Http())

bigquery = discovery.build('bigquery', 'v2', http=http)
response = bigquery.tabledata().insertAll(
   projectId=PROJECT_ID,
   datasetId=DATASET_ID,
   tableId=TABLE_ID,
   body=body).execute()

print response

As Jordan says: "Note that this uses the appengine robot to authenticate with BigQuery, so you'll to add the robot account to the ACL of the dataset. Note that if you also want to use the robot to run queries, not just stream, you need the robot to be a member of the project 'team' so that it is authorized to run jobs."

Community
  • 1
  • 1
Felipe Hoffa
  • 54,922
  • 16
  • 151
  • 325
  • Thanks Fh. and @Jordan for the quick help – James Feb 26 '14 at 23:08
  • `appengine` doesn't seem to exist in the current (2.0.1) version of `oauth2client` but `from oauth2client.contrib import appengine` did the trick. – tx802 Apr 18 '16 at 15:17
  • Felipe and @Jordan Tigani Exactly which Member in the IAM page do you mean by 'appengine robot' ? There seem to be several members potentially associated with App Engine. Here are the ones I'm seeing: appspot.gserviceaccount.com (App Engine default service account), gae-api-prod.google.com.iam.gserviceaccount.com (App Engine Flexible Environment Service Agent), container-engine-robot.iam.gserviceaccount.com (Kubernetes Engine Service Agent), compute-system.iam.gserviceaccount.com (Compute Engine Service Agent), or developer.gserviceaccount.com (Compute Engine default service account) – KevinTydlacka Apr 19 '19 at 17:19
3

Here is a working code example from an appengine app that streams records to a BigQuery table. It is open source at code.google.com:

http://code.google.com/p/bigquery-e2e/source/browse/sensors/cloud/src/main.py#124

To find out where the bigquery object comes from, see http://code.google.com/p/bigquery-e2e/source/browse/sensors/cloud/src/config.py

Note that this uses the appengine robot to authenticate with BigQuery, so you'll to add the robot account to the ACL of the dataset.

Note that if you also want to use the robot to run queries, not just stream, you need to robot to be a member of the project 'team' so that it is authorized to run jobs.

Jordan Tigani
  • 26,089
  • 4
  • 60
  • 63