I am trying to schedule a Cloud Function in GCP that reads from and writes to BigQuery, but every execution of the Cloud Scheduler job fails. My function (which, by the way, is validated and active in Cloud Functions) is given below:
```python
def geopos_test(request):
    # Imports deduplicated; most are only needed for the transformations
    # planned between the read and the write.
    import collections
    import datetime
    import glob
    import json
    import math
    import os
    import pathlib
    import sys
    from datetime import date, timedelta
    from operator import attrgetter

    import matplotlib as mpl
    import matplotlib.colors as colors
    import matplotlib.pyplot as plt
    import matplotlib.ticker as ticker
    import numpy as np
    import pandas as pd
    import pandas_gbq
    import requests
    import seaborn as sns
    from flatten_json import flatten
    from google.cloud import bigquery
    from matplotlib.colors import ListedColormap, LinearSegmentedColormap
    from matplotlib.lines import Line2D
    from mpl_toolkits.axes_grid1 import make_axes_locatable
    from pandas import json_normalize
    from scipy import stats

    try:
        collectionsAbc = collections.abc
    except AttributeError:
        collectionsAbc = collections

    # Read the source table into a dataframe.
    client = bigquery.Client()
    project = "<ProjectId>"
    dataset_id = "<DataSet>"
    dataset_ref = bigquery.DatasetReference(project, dataset_id)
    table_ref = dataset_ref.table("sectional_accuracy")
    table = client.get_table(table_ref)
    sectional_accuracy = client.list_rows(table).to_dataframe()

    # Placeholder transformations.
    sectional_accuracy = sectional_accuracy.drop_duplicates()
    sectional_accuracy = sectional_accuracy.sort_values(["Store"])

    # Write the result to the target table.
    job_config = bigquery.LoadJobConfig(
        schema=[
            bigquery.SchemaField("Store", bigquery.enums.SqlTypeNames.STRING),
            bigquery.SchemaField("storeid", bigquery.enums.SqlTypeNames.STRING),
            bigquery.SchemaField("storeIdstr", bigquery.enums.SqlTypeNames.STRING),
            bigquery.SchemaField("Date", bigquery.enums.SqlTypeNames.TIMESTAMP),
            bigquery.SchemaField("Sections", bigquery.enums.SqlTypeNames.STRING),
            bigquery.SchemaField("Percentage", bigquery.enums.SqlTypeNames.FLOAT),
            bigquery.SchemaField("devicePercentage", bigquery.enums.SqlTypeNames.FLOAT),
            bigquery.SchemaField("distance", bigquery.enums.SqlTypeNames.STRING),
        ],
    )
    ntable_id = "<ProjectId>.<DataSet>.test"
    job = client.load_table_from_dataframe(sectional_accuracy, ntable_id, job_config=job_config)
    job.result()  # wait for the load job to complete
    return "ok"  # HTTP functions must return a response
```
This function only reads data from one table and writes it to a new one; the idea is to do a load of transformations between the reading and the writing.

The function is associated with the App Engine default service account, for which I am the owner, and to which I have added (probably overkill) the Cloud Run Invoker, Cloud Functions Invoker, and Cloud Scheduler Job Runner roles.
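To double-check those bindings, I can list the project-level IAM policy for that service account (a sketch using the `google-api-python-client` library; the project ID and service-account email are placeholders):

```python
from googleapiclient import discovery

PROJECT_ID = "<ProjectId>"
MEMBER = "serviceAccount:<ProjectId>@appspot.gserviceaccount.com"  # App Engine default SA

# Fetch the project-level IAM policy and list the roles bound to that member.
crm = discovery.build("cloudresourcemanager", "v1")
policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()
roles = [b["role"] for b in policy.get("bindings", []) if MEMBER in b.get("members", [])]
print(roles)  # expecting roles/run.invoker, roles/cloudfunctions.invoker, roles/cloudscheduler.jobRunner
```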
Now, for the Cloud Scheduler: I have defined the job with an HTTP target using the POST method, the function's URL, and an OIDC auth token with the same service account as that used by the function. As for the HTTP headers, I have `User-Agent` with the value `Google-Cloud-Scheduler`. Note that I have no other headers, as I am uncertain of what they should be.
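For reference, the equivalent job definition through the Python client would look roughly like this (a sketch using the `google-cloud-scheduler` library; the region, schedule, function URL, and service-account email are placeholders):

```python
from google.cloud import scheduler_v1

client = scheduler_v1.CloudSchedulerClient()
parent = client.common_location_path("<ProjectId>", "<Region>")

job = scheduler_v1.Job(
    name=f"{parent}/jobs/geopos-test-job",
    schedule="0 * * * *",  # placeholder: every hour
    time_zone="Etc/UTC",
    http_target=scheduler_v1.HttpTarget(
        uri="<FunctionUrl>",
        http_method=scheduler_v1.HttpMethod.POST,
        headers={"User-Agent": "Google-Cloud-Scheduler"},
        oidc_token=scheduler_v1.OidcToken(
            service_account_email="<ProjectId>@appspot.gserviceaccount.com",
            audience="<FunctionUrl>",  # should match the function's URL
        ),
    ),
)

client.create_job(parent=parent, job=job)
```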
Yet it fails every single time with a `PERMISSION_DENIED` message in the log.
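One way to test the function outside the scheduler is to call it directly with an OIDC identity token, which should succeed if the function's permissions are right (a sketch using `google-auth` and the caller's application-default credentials; `<FunctionUrl>` is a placeholder):

```python
import requests
import google.auth.transport.requests
import google.oauth2.id_token

FUNCTION_URL = "<FunctionUrl>"  # placeholder

# Mint an identity token with the function URL as audience and invoke it directly.
auth_req = google.auth.transport.requests.Request()
token = google.oauth2.id_token.fetch_id_token(auth_req, FUNCTION_URL)
resp = requests.post(FUNCTION_URL, headers={"Authorization": f"Bearer {token}"}, json={})
print(resp.status_code, resp.text)
```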
What have I tried:

- Changed `geopos_test(request)` to `geopos_test(event, context)` (the two trigger signatures are sketched after this list)
- Changed the HTTP header to (`Content-Type`, `application/octet-stream`) or (`Content-Type`, `application/json`)
- Changed the service account
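For context on that first point, my understanding is that the two signatures belong to different trigger types, so an HTTP-targeted scheduler job should hit the `(request)` form (a minimal sketch; the function names are hypothetical):

```python
# HTTP trigger: the entry point receives a Flask request object.
def http_entry_point(request):
    return "ok"

# Background trigger (e.g. Pub/Sub): the entry point receives event data and context.
def background_entry_point(event, context):
    pass
```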
What I haven't tried is giving some value in the body, since I do not know what it could be.

I am now out of ideas; any help would be appreciated.
Update: error message:

```
{
  httpRequest: {1}
  insertId: "********"
  jsonPayload: {
    @type: "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"
    jobName: "******************"
    status: "PERMISSION_DENIED"
    targetType: "HTTP"
    url: "*************"
  }
  logName: "*************/logs/cloudscheduler.googleapis.com%2Fexecutions"
  receiveTimestamp: "2022-10-24T10:10:52.337822391Z"
  resource: {2}
  severity: "ERROR"
  timestamp: "2022-10-24T10:10:52.337822391Z"
}
```