
I am facing the error below while uploading a file to a GCP bucket. A few records exceed the limit (10000000 bytes) allowed by GCP. I have a working Python script that uploads the file to GCP using the upload blob functionality. I need to enhance this script to identify and capture the records exceeding this limit.

Error: google.cloud.pubsub_v1.publisher.exceptions.MessageTooLargeError: The message being published would produce too large a publish request that would exceed the maximum allowed size on the backend (10000000 bytes).

CRP

1 Answer


Are these normal files that you are converting to blobs and uploading? If so, you could check the size of each file before uploading, using these techniques, with something like:

import os

for filepath in filepaths:
    # Only upload files below the 10000000-byte limit.
    if os.path.getsize(filepath) < 10000000:
        upload(filepath)

Edit:

Finding the size of individual records would be harder. The second result indicates you can find the memory size of objects, but you may want to pad your limit to avoid the error.
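
A minimal sketch of that in-memory check, assuming the records are Python strings; the 9000000-byte cutoff is an arbitrary padded limit, not a value from the error message:

import sys

PADDED_LIMIT = 9000000  # padded below the 10000000-byte backend limit

def record_too_large(record):
    # sys.getsizeof reports the in-memory size of the Python object,
    # which only approximates the bytes actually sent to the backend,
    # hence the padded limit.
    return sys.getsizeof(record) > PADDED_LIMIT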

Alternatively, you could save the record to a temporary file and then get the size of that file: a slower but more bulletproof way of finding the record size.
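
A rough sketch of that temporary-file approach, assuming each record is a string; the helper name is made up for illustration:

import os
import tempfile

def record_size_on_disk(record):
    # Write the record to a temporary file and measure that file,
    # which is closer to the number of bytes actually uploaded
    # than the in-memory size of the object.
    with tempfile.NamedTemporaryFile(mode="w", delete=False) as tmp:
        tmp.write(record)
        path = tmp.name
    try:
        return os.path.getsize(path)
    finally:
        os.remove(path)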

HagenSR
  • Thanks, but this is based on the file size. I want to capture the records inside the file that exceed that size, and still be able to upload the file with the other, good records. – CRP Aug 31 '21 at 20:19
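
If the input is a newline-delimited text file, one way to do that is to check each record before uploading; a sketch under that assumption (the capture file path and the record-per-line layout are placeholders, not confirmed by the question):

MAX_RECORD_BYTES = 10000000

def split_oversized_records(filepath, capture_path):
    # Keep records under the limit and write the oversized ones to
    # capture_path, so the remaining records can still be uploaded.
    good_records = []
    with open(filepath, "r") as src, open(capture_path, "w") as captured:
        for line in src:
            # Compare the encoded byte length of the record, not the
            # in-memory size of the Python string.
            if len(line.encode("utf-8")) > MAX_RECORD_BYTES:
                captured.write(line)
            else:
                good_records.append(line)
    return good_records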