Question
How do I get a simple 'bigrquery' function using public data to work? I just want to get ANY 'bigrquery' function to return data from Google Cloud Platform Big Query.
Background
Example followed:
I started by looking at this example from bigrquery as a starting point (I already have the package installed):
library(bigrquery)
billing <- bq_test_project() # replace this with your project ID
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"
tb <- bq_project_query(billing, sql)
bq_table_download(tb, n_max = 10)
#> First chunk includes all requested rows.
#> # A tibble: 10 x 4
#> year month day weight_pounds
#> <int> <int> <int> <dbl>
#> 1 1969 1 20 7.00
#> 2 1969 1 27 7.69
#> 3 1969 6 19 6.75
#> 4 1969 5 30 6.19
#> 5 1969 11 9 7.87
#> 6 1969 5 25 7.06
#> 7 1969 7 25 7.94
#> 8 1969 9 11 7.06
#> 9 1969 7 13 6.00
#> 10 1969 9 27 8.13
Project ID
I know that billing
needs to be a project ID associated with my account on GCP, so I navigated per the instructions under "Billing project" from one of my own test projects to get project ID:
Attempt to follow example
So here's what I did:
library(bigrquery)
bq_auth(email = "my-google-account@gmail.com") # Not a step listed,
# but I saw this elsewhere and thought it might be necessary.
billing <- "tidy-XXXXX" # my project ID from the screenshot above
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"
tb <- bq_project_query(billing, sql) # Cannot get past this point without Error
bq_table_download(tb, n_max = 10)
Running bq_auth()
caused my browser to activate and have me verify myself in the browser once, but not on subsequent runs (which is the expected behavior, I think).
Running bq_project_query()
gets me this:
> tb <- bq_project_query(billing, sql)
Error in `signal_reason()`:
! Request had insufficient authentication scopes. [insufficientPermissions]
Run `rlang::last_error()` to see where the error occurred.
> last_error()
<error/bigrquery_insufficientPermissions>
Error in `signal_reason()`:
! Request had insufficient authentication scopes. [insufficientPermissions]
---
Backtrace:
1. bigrquery::bq_project_query(billing, sql)
2. bigrquery::bq_perform_query(...)
3. bigrquery:::bq_post(url, body = bq_body(body, ...), query = list(fields = "jobReference"))
4. bigrquery:::process_request(req)
5. bigrquery:::bq_check_response(status, type, content)
6. bigrquery:::signal_reason(json$error$errors[[1L]]$reason, json$error$message)
The SQL query is good as it works in the GCP console. Under "Authentication and authorization" I see a reference to cache inside a folder ~/.R/gargle/gargle-oauth/
, but I'm on Windows and not finding that file/folder, but under "Details", it seems I shouldn't need to worry about email or bq_auth()
at all. I don't have or don't think I need "API credentials" because I'm not writing an application -- I'm just using this in RStudio interactively (well, .R scripts -- but that's the same, right?). I'm not trying to use a service account (should I? I thought my user account would be sufficient).
I'd like to see what is happening in the backtrace in bigrquery:::bq_check_response(status, type, content)
, but I don't know how to step into that particular code (I can step through code I write myself -- but not a package function?)
I actually have a work use 'bigrquery', but unless I can get this figured out, I'll never get that working. I feel like I'm missing something obvious, because I don't see this issue popping up in the SO questions that I've already searched.
UPDATE 6/22/22
Maybe progress? I feel that this token should not be NULL
below, but still not sure how to fix it. First, I
library(bigrquery)
options(error = recover) # ADDED TO ALLOW BROWSING OF CALL FRAMES
bq_auth(email = "my-google-account@gmail.com")
billing <- "tidy-XXXXXX"
# Error comes on next line:
tb <- bq_project_query(x = billing,
"SELECT count(*) FROM publicdata.samples.natality"
)
The error I get is the same as earlier in the post, but options(error = recover)
lets me access the frames:
> library(bigrquery)
> options(error = recover)
> bq_auth(email = "my-google-account@gmail.com")
> billing <- "tidy-XXXXXX"
> tb <- bq_project_query(
+ x = billing,
+ "SELECT count(*) FROM publicdata.samples.natality"
+ )
Error in `signal_reason()`:
! Request had insufficient authentication scopes. [insufficientPermissions]
Run `rlang::last_error()` to see where the error occurred.
Enter a frame number, or 0 to exit
1: source("C:/SO_QuestionScript.R")
2: withVisible(eval(ei, envir))
3: eval(ei, envir)
4: eval(ei, envir)
5: GoogleCloudPlatformQuery_Stack_Overflow_Question.R#9: bq_project_query(x = billing, "SELECT count(*) FROM publicdata.samples.natality")
6: bq_perform_query(query, billing = x, destination_table = destination_table, ...)
7: bq_post(url, body = bq_body(body, ...), query = list(fields = "jobReference"))
8: process_request(req)
9: bq_check_response(status, type, content)
10: signal_reason(json$error$errors[[1]]$reason, json$error$message)
11: rlang::abort(message, class = paste0("bigrquery_", reason))
12: signal_abort(cnd, .file)
Selection: 7
Called from: eval(substitute(browser(skipCalls = skip), list(skip = 7 - which)),
envir = sys.frame(which))
Browse[1]> dput(token)
structure(list(method = NULL, url = NULL, headers = NULL, fields = NULL,
options = list(), auth_token = <environment>, output = NULL), class = "request")
Browse[1]>
If I dig a little more on the internals of bq_post()
, I can see that the req
variable is giving me the status code 403, which is my problem:
Browse[1]> bq_post
function (url, body, ..., query = NULL, token = bq_token())
{
json <- jsonlite::toJSON(body, pretty = TRUE, auto_unbox = TRUE)
req <- POST(paste0(base_url, url), body = json, httr::user_agent(bq_ua()),
token, add_headers(`Content-Type` = "application/json"),
..., query = prepare_bq_query(query))
invisible(process_request(req))
}
Browse[1]> req$status_code
[1] 403
Browse[1]> req$request
<request>
POST https://www.googleapis.com/bigquery/v2/projects/tidy-orbit-173301/jobs/?fields=jobReference
Output: write_memory
Options:
* post: TRUE
* postfieldsize: 177
* postfields: as.raw(c(0x7b, 0x0a, 0x20, 0x20, 0x22, 0x63, 0x6f, 0x6e, 0x66, 0x69, 0x67, 0x75, 0x72, 0x61, 0x74, 0x69, 0x6f, 0x6e, 0x22, 0x3a, 0x20, 0x7b, 0x0a, 0x20, 0x20, 0x20, 0x20, 0x22, 0x71, 0x75, 0x65, 0x72, 0x79, 0x22, 0x3a, 0x20, 0x7b, 0x0a, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x22, 0x71, 0x75, 0x65, 0x72, 0x79, 0x22, 0x3a, 0x20, 0x22, 0x53, 0x45, 0x4c, 0x45, 0x43, 0x54, 0x20, 0x63, 0x6f, 0x75, 0x6e, 0x74, 0x28, 0x2a, 0x29, 0x20, 0x46, 0x52, 0x4f, 0x4d, 0x20, 0x70, 0x75, 0x62, 0x6c, 0x69, 0x63, 0x64,
0x61, 0x74, 0x61, 0x2e, 0x73, 0x61, 0x6d, 0x70, 0x6c, 0x65, 0x73, 0x2e, 0x6e, 0x61, 0x74, 0x61, 0x6c, 0x69, 0x74, 0x79, 0x22, 0x2c, 0x0a, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x22, 0x75, 0x73, 0x65, 0x4c, 0x65, 0x67, 0x61, 0x63, 0x79, 0x53, 0x71, 0x6c, 0x22, 0x3a, 0x20, 0x66, 0x61, 0x6c, 0x73, 0x65, 0x2c, 0x0a, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x22, 0x70, 0x72, 0x69, 0x6f, 0x72, 0x69, 0x74, 0x79, 0x22, 0x3a, 0x20, 0x22, 0x49, 0x4e, 0x54, 0x45, 0x52, 0x41, 0x43, 0x54, 0x49, 0x56, 0x45, 0x22, 0x0a,
0x20, 0x20, 0x20, 0x20, 0x7d, 0x0a, 0x20, 0x20, 0x7d, 0x0a, 0x7d))
* useragent: bigrquery/1.4.0 (GPN:RStudio; ) gargle/1.2.0 httr/1.4.2
Headers:
* Accept: application/json, text/xml, application/xml, */*
* Content-Type: application/json
* Authorization: Bearer ya29.a0ARrdaM_Z0AIQ7950HcZ87goJJd8NorjH8m5JJ3BTTjj3hwAxH_PrXlTfWKjbYLXx-OfCiIk2BwAtN5trJh7FHf--[MORE STRING YOU DON'T NEED]
Auth token: Gargle2.0
The Authorization: Bearer
looks like I have a token, but maybe it's not set up correctly?