
In my Cloud Function, I need to fetch a file from Cloud Storage and send it to an API through an HTTP POST request. I tried the following code:

from google.cloud import storage
import requests

storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)
source_blob_name = "/compressed_data/file_to_send.7z"
blob = bucket.blob(source_blob_name)

url = UPLOADER_BACKEND_URL
files = {'upload_file': blob}
values = {'id': '1', 'ouid': OUID}
r = requests.post(url, files=files, data=values)

It gave an error saying:

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker_v2.py", ...
  line 90, in encode_multipart_formdata
    body.write(data)
TypeError: a bytes-like object is required, not 'Blob'

If this code was to run on an actual VM, the following would work:

url = UPLOADER_BACKEND_URL
files = {'upload_file': open('/tmp/file_to_send.7z','rb')}
values = {'id': '1', 'name': 'John'}
r = requests.post(url, files=files, data=values)

So the question is: in a Cloud Function, how can I load a file from Cloud Storage so that it behaves like the object returned by Python's open(filename, 'rb')?

I know that I can do blob.download_to_file() and then open() the file, but I'm wondering if there is a quicker way.

Puteri
Kid_Learning_C

2 Answers


In your Cloud Functions code, you pass only the Blob reference (bucket name + file path) to the API call, not the Blob's content.

You can indeed download the file locally into the in-memory file system (the /tmp directory) and then handle this temporary file like any other file. Don't forget to delete it after the upload!
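A minimal sketch of that approach, assuming google-cloud-storage and requests are installed; the bucket name, blob path, URL and OUID are placeholders taken from the question:

```python
import os

def post_blob_via_tmp(bucket_name, blob_name, url, ouid):
    # Imports kept inside the function so the sketch stays self-contained.
    import requests
    from google.cloud import storage

    local_path = os.path.join("/tmp", os.path.basename(blob_name))
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.download_to_filename(local_path)  # write the object into /tmp
    try:
        with open(local_path, "rb") as f:
            # Now this is a real file object, exactly what requests expects
            return requests.post(url, files={"upload_file": f},
                                 data={"id": "1", "ouid": ouid})
    finally:
        os.remove(local_path)  # free the in-memory /tmp file system
```

The try/finally guarantees the temporary file is removed even if the POST fails, which matters because /tmp counts against the function's memory.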

You can also try the gcsfs library, which lets you handle Cloud Storage files in an idiomatic, file-like way. I have never tried it when calling an API, but it should work.
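With gcsfs (assumed installed), the upload could be sketched like this; gcsfs addresses objects as "bucket/path" strings and its open() returns a file-like object:

```python
def post_blob_via_gcsfs(bucket_name, blob_name, url, ouid):
    # Imports kept inside the function so the sketch stays self-contained.
    import gcsfs
    import requests

    fs = gcsfs.GCSFileSystem()
    # Strip any leading "/" so the path joins cleanly onto the bucket name
    gcs_path = "{}/{}".format(bucket_name, blob_name.lstrip("/"))
    with fs.open(gcs_path, "rb") as f:
        # f streams straight from Cloud Storage; nothing touches /tmp
        return requests.post(url, files={"upload_file": f},
                             data={"id": "1", "ouid": ouid})
```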

guillaume blaquiere
  • Thank you for your answer. I'm a bit confused when you say "don't forget to delete it after the upload". I thought that when the Cloud Function finishes executing, the local files would be deleted automatically. Why is it necessary to delete the files in `/tmp/`? – Kid_Learning_C Oct 20 '20 at 10:50
  • The `/tmp` directory is an in-memory file system; its content disappears when the instance is unloaded. When a function runs, it runs on an instance. At the end of the function, the instance is kept running in case a new request comes in; if none does, the instance is unloaded after a while. But if requests keep arriving, the same instance is reused, and cleaning the `/tmp` directory is required to prevent an out-of-memory crash. This also means that if you have global variables (like a DB connection) you can reuse them between requests, until the instance is unloaded. – guillaume blaquiere Oct 20 '20 at 12:25

I tried using

with blob.open("rb") as f:

which gives you a readable file-like object, just like open(filename, 'rb'), without writing anything to disk.

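Put together with the upload from the question, a sketch could look like this (assuming google-cloud-storage and requests are installed; the names are placeholders from the question):

```python
def post_blob_stream(bucket_name, blob_name, url, ouid):
    # Imports kept inside the function so the sketch stays self-contained.
    import requests
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    # blob.open("rb") returns a file-like reader over the object's bytes,
    # so requests can stream it as multipart form data
    with blob.open("rb") as f:
        return requests.post(url, files={"upload_file": f},
                             data={"id": "1", "ouid": ouid})
```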
More information:

How to transfer data from Google Cloud Storage to SFTP using Python without writing to the file system

Titu