I've started to work with remote functions: https://cloud.google.com/bigquery/docs/reference/standard-sql/remote-functions

I've been able to set up a Cloud Function and call it from BigQuery. However, no more than 60 instances of the function are ever active at the same time, even though the maximum number of instances is set to 3000.

Changing max_batching_rows or the number of rows the function is called on does not seem to affect this low instance count.
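To put the cap in perspective, here is a rough back-of-the-envelope sketch. It assumes each instance handles one request at a time and that each row takes about 10 s (the delay used in the test function in this question). The helper name `estimate_wall_time` and the 30,000-row figure are made up for illustration; they are not taken from my actual workload.

```python
import math


def estimate_wall_time(n_rows, batch_rows, instances, seconds_per_row=10.0):
    """Rough upper bound on wall-clock seconds, one request per instance at a time."""
    # Number of HTTP requests needed if every batch is full.
    n_requests = math.ceil(n_rows / batch_rows)
    # Requests are served in "waves" of at most `instances` concurrent requests.
    waves = math.ceil(n_requests / instances)
    # Rows inside a batch are processed one after another by the function.
    return waves * batch_rows * seconds_per_row


# Hypothetical load of 30,000 rows with max_batching_rows=1.
for instances in (60, 3000):
    seconds = estimate_wall_time(n_rows=30_000, batch_rows=1, instances=instances)
    print(f"{instances} instances: about {seconds:.0f} s")
```

Under these assumptions the same load takes about 5000 s at 60 instances and about 100 s at 3000, which is why the cap matters for my use case.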

Configuration of the cloud function:

[screenshot of the Cloud Function configuration]

Graph showing the small number of instances active:

The variations over time are due to successive tests with different loads.

[graph of active instances over time]

Code of the cloud function:

A 10 s delay has been added for each call to match the time my real processing will take.

import json
import time
import uuid


def add_fake_user(request):
    request_json = request.get_json(silent=True)
    replies = []
    calls = request_json['calls']
    call_id = str(uuid.uuid4())
    for call in calls:
        time.sleep(10)
        userno = call[0]
        corp = call[1]
        replies.append({
            'username': f'user_{userno}',
            'email': f'user_{userno}@{corp}.com',
            'n_call': len(calls),
            'call_id': call_id
        })
    return json.dumps({
        # each reply is a STRING (JSON not currently supported)
        'replies': [json.dumps(reply) for reply in replies]
    })
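For anyone who wants to reproduce the request and reply shapes without deploying anything, here is a minimal local sketch. `FakeRequest`, `call_locally` and `echo_user` are hypothetical helpers written for this example: `echo_user` returns the same kind of reply as the function above but without the sleep, and the fake request only carries the `calls` field, whereas real requests from BigQuery include additional metadata.

```python
import json


class FakeRequest:
    """Hypothetical stand-in for the request object passed to the function."""

    def __init__(self, payload):
        self._payload = payload

    def get_json(self, silent=False):
        return self._payload


def call_locally(handler, rows):
    """Send rows to a handler in the 'calls' shape and decode its replies."""
    body = handler(FakeRequest({'calls': [list(row) for row in rows]}))
    replies = json.loads(body)['replies']
    # Each reply is itself a JSON string, so it is decoded a second time.
    return [json.loads(reply) for reply in replies]


def echo_user(request):
    """Same kind of reply as add_fake_user, minus the 10 s sleep (sketch only)."""
    calls = request.get_json(silent=True)['calls']
    return json.dumps({
        'replies': [
            json.dumps({
                'username': f'user_{userno}',
                'email': f'user_{userno}@{corp}.com',
            })
            for userno, corp in calls
        ]
    })


result = call_locally(echo_user, [(1, 'acme'), (2, 'globex')])
print(result)
```

The reply list must contain one entry per element of `calls`, in the same order, which is what the sketch preserves.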

Configuration of the remote function:

CREATE OR REPLACE FUNCTION `PROJECT_NAME`.trash.add_fake_user(user_id int64, corp_id STRING)  RETURNS STRING
REMOTE WITH CONNECTION `PROJECT_NAME.eu.gcf-conn` OPTIONS (endpoint = 'my_url', max_batching_rows=1)
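As I understand it, max_batching_rows is an upper limit on the number of rows per HTTP request, so with a value of 1 every row becomes its own request. The sketch below only illustrates that limit; `make_batches` is a hypothetical helper, and the way BigQuery actually groups rows is internal and may produce smaller batches than the maximum.

```python
def make_batches(rows, max_batching_rows):
    """Split rows into payloads holding at most max_batching_rows calls each."""
    if max_batching_rows < 1:
        raise ValueError('max_batching_rows must be at least 1')
    return [
        {'calls': [list(row) for row in rows[start:start + max_batching_rows]]}
        for start in range(0, len(rows), max_batching_rows)
    ]


# Five hypothetical (user_id, corp_id) rows.
rows = [(n, 'acme') for n in range(5)]
print(len(make_batches(rows, 1)))  # one request per row
print(len(make_batches(rows, 2)))  # fewer, larger requests
```

With a limit of 1 the number of requests equals the number of rows, so I expected the number of active instances to follow the row count up to the configured maximum.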

Query calling the remote function:

SELECT
  `PROJECT_NAME`.trash.add_fake_user(var1, var2) AS foo
FROM
  base

I've created an issue on Google's issue tracker: https://issuetracker.google.com/issues/235252503

RYegavian
  • The solution is in preview and is probably limited in terms of performance for observability reasons. Just an assumption. – guillaume blaquiere May 18 '22 at 20:53
  • I second this. I think it just came to Public Preview for the US at the moment, so it might still have some extra limits for other regions – Mikhail Berlyant May 18 '22 at 20:56
  • Thanks to both of you for the comments. I guess I'll regularly test the performance to see whether scaling improves in the coming week. I'm also wondering whether my assumption is correct: do you think it is fair to assume that remote functions should be able to activate 3k instances in the future? – RYegavian May 24 '22 at 15:15
