
I am trying to move files older than three months from one bucket to another using Python code. I tested the code with the test function option in Cloud Functions; it doesn't throw any error and the status message is shown as 'ok'.

In the source bucket, my files are located at gs://source_bucket_name/history/subfolder1/filename, gs://source_bucket_name/history/subfolder2/filename, etc. From there I want to move them to the destination bucket. I tried to achieve this with the Python code below, but it's not working.

from google.cloud import storage
import zipfile
import datetime
import os

def move_files(request, context):
    source_bucket_name = 'source_bucket_name'
    destination_bucket_name = 'destination_bucket_name'
    client = storage.Client()

    source_bucket = client.get_bucket(source_bucket_name)
    destination_bucket = client.get_bucket(destination_bucket_name)

    # Use a timezone-aware cutoff, since blob.updated is timezone-aware (UTC)
    three_month_file = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=90)
    
    prefixes = ['history/*/DATA_*', 'history/*/CTRL_*']

    for prefix in prefixes:
        for blob in source_bucket.list_blobs(prefix=prefix):
            blob_updated = blob.updated
            if blob_updated < three_month_file:
                #Create temporary local directory to store the zipped file
                temp_dir = 'temp'
                os.makedirs(temp_dir,exist_ok=True)

                #Download the file to the temporary local directory
                blob.download_to_filename(os.path.join(temp_dir,blob.name))
               

                #Zip the file
                zip_Name = os.path.join(temp_dir,f"{blob.name}.zip")
                with zipfile.ZipFile(zip_Name,'w') as zipfile_obj:
                    zipfile_obj.write(os.path.join(temp_dir,blob.name),blob.name)
                  

                #Move the zipped file to destination bucket
                destination_blob = destination_bucket.blob(f"{blob.name}.zip")
                destination_blob.upload_from_filename(zip_Name)

                #Delete the original file and zipped file
                os.remove(os.path.join(temp_dir,blob.name))
                os.remove(zip_Name)

                #Delete the original from the source bucket
                blob.delete()
            #Remove the temporary local directory
    os.rmdir(temp_dir)
    print("successful")

I have no idea why it's failing. Can anyone help me resolve this issue?

Learner

1 Answer


As @John Hanley suggested, Cloud Storage supports bucket-to-bucket object copies, which are more efficient than downloading the files and uploading them again.

For example:

for prefix in prefixes:
    for blob in source_bucket.list_blobs(prefix=prefix):
        blob_updated = blob.updated
        if blob_updated < three_month_file:
            destination_blob = destination_bucket.blob(blob.name)

            # Copy the object to the destination bucket
            source_bucket.copy_blob(blob, destination_bucket, destination_blob.name)

            # Delete the original object from the source bucket
            blob.delete()

print("Successful")

This performs the copy directly within Cloud Storage, bucket to bucket, which can be more efficient and faster than downloading and re-uploading when moving objects between buckets.
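One detail the age check above depends on: in the google-cloud-storage client, `blob.updated` is a timezone-aware (UTC) datetime, and Python refuses to compare an aware datetime with a naive one. A minimal sketch of building a comparable cutoff, using only the standard library; the helper names `cutoff_utc` and `is_older_than` are hypothetical and not part of any library:

```python
import datetime


def cutoff_utc(days=90, now=None):
    """Return a timezone-aware UTC datetime `days` before `now` (hypothetical helper)."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return now - datetime.timedelta(days=days)


def is_older_than(updated, cutoff):
    """True if an aware `updated` timestamp is strictly before the aware `cutoff`."""
    return updated < cutoff


# Comparing a naive datetime with an aware one raises TypeError.
aware = datetime.datetime.now(datetime.timezone.utc)
naive = datetime.datetime.now()
try:
    naive < aware
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True
```

With this, the check in the loop would become `if is_older_than(blob.updated, cutoff_utc(90)):`.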

For further information and references, you can explore the following links:

Python copy blob

How to move or copy Azure Blob from one container to another


Updated answer

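Yes, the issue is likely your use of the asterisk: the prefix argument of list_blobs is matched as a literal string, so `*` is not expanded as a wildcard and no object names start with 'history/*/DATA_'. You can try using the delimiter parameter. Or visit this link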
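As an alternative to the delimiter approach, here is a sketch that lists everything under the literal prefix `history/` and applies the wildcard patterns from the question on the client side with the standard-library `fnmatch` module. The function names `matches_any` and `move_old_blobs` are hypothetical; `source_bucket` and `destination_bucket` are assumed to be the bucket objects from the question. Note that in `fnmatch` a `*` also matches `/`, so `history/*/DATA_*` would match files in deeper subfolders as well.

```python
import datetime
import fnmatch

# Wildcard patterns from the question, applied client-side instead of as a prefix.
PATTERNS = ["history/*/DATA_*", "history/*/CTRL_*"]


def matches_any(name, patterns=PATTERNS):
    """True if the object name matches at least one glob pattern (case-sensitive)."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def move_old_blobs(source_bucket, destination_bucket, days=90):
    """Copy matching objects older than `days` to the destination, then delete them.

    Returns the list of object names that were moved.
    """
    cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=days)
    moved = []
    # Materialize the listing first so deletions do not interleave with pagination.
    for blob in list(source_bucket.list_blobs(prefix="history/")):
        if matches_any(blob.name) and blob.updated < cutoff:
            # Server-side copy, as in the answer above; no local download.
            source_bucket.copy_blob(blob, destination_bucket, blob.name)
            blob.delete()
            moved.append(blob.name)
    return moved
```

Inside the Cloud Function this would be called as `move_old_blobs(source_bucket, destination_bucket)` after the two `client.get_bucket(...)` calls.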

Julia
  • My code is not entering the for loop - `for blob in source_bucket.list_blobs(prefix=prefix):`. Is that because of the `*` used in the prefixes? – Learner Aug 03 '23 at 08:14
  • Hello @Learner, I updated the answer. – Julia Aug 03 '23 at 16:15