
I am trying to delete a folder in GCS and all of its content (including sub-directories) with the Python library. I understand that GCS doesn't really have folders (only name prefixes), but how can I do that?

I tested this code:

from google.cloud import storage

def delete_blob(bucket_name, blob_name):
    """Deletes a blob from the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)

    blob.delete()

delete_blob('mybucket', 'top_folder/sub_folder/test.txt')
delete_blob('mybucket', 'top_folder/sub_folder/')

The first call to delete_blob worked but the second one did not. How can I delete a folder recursively?

kee

3 Answers


To delete everything starting with a certain prefix (for example, a directory name), you can iterate over a list:

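from google.cloud import storage

bucket_name = 'mybucket'  # placeholder bucket name
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
# The trailing slash keeps sibling prefixes such as 'some/directory2/' from matching.
blobs = bucket.list_blobs(prefix='some/directory/')
for blob in blobs:
    blob.delete()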
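
One detail worth illustrating: the prefix is matched as a plain string against object names, so a "folder" name without a trailing slash also matches its siblings. The sketch below needs no GCS access; `as_folder_prefix` and the sample object names are made up for illustration, and `str.startswith` stands in for the server-side prefix filter.

```python
def as_folder_prefix(folder_name):
    """Return folder_name with exactly one trailing slash (hypothetical helper)."""
    return folder_name.rstrip('/') + '/'


# Made-up object names, standing in for what a bucket listing might contain.
names = [
    'some/directory/a.txt',
    'some/directory/sub/b.txt',
    'some/directory2/c.txt',
]

# str.startswith mimics the prefix filter applied when listing objects.
bare = [n for n in names if n.startswith('some/directory')]
slashed = [n for n in names if n.startswith(as_folder_prefix('some/directory'))]

print(bare)     # all three names, including 'some/directory2/c.txt'
print(slashed)  # only the two names under 'some/directory/'
```

Passing the normalized value as `prefix=` to `list_blobs` should therefore limit the deletion to the intended "folder".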

Note that for very large buckets with millions or billions of objects, this may not be a very fast process. For that, you'll want to do something more complex, such as deleting in multiple threads or using lifecycle configuration rules to arrange for the objects to be deleted.
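
As a rough sketch of the multi-threaded option, the deletes can be handed to a thread pool. Everything here is an assumption rather than an official recipe: `delete_prefix_parallel` and `max_workers` are hypothetical names, `bucket` only needs to behave like a `google.cloud.storage` bucket (`list_blobs(prefix=...)` yielding objects with `.name` and `.delete()`), and whether sharing one client across threads suits your library version is worth checking.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def delete_prefix_parallel(bucket, prefix, max_workers=8):
    """Delete every object under `prefix`, issuing deletes from a thread pool.

    Returns (deleted_names, failures), where failures maps name -> exception.
    """
    deleted, failures = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One delete task per listed object; the listing itself stays sequential.
        futures = {
            pool.submit(blob.delete): blob.name
            for blob in bucket.list_blobs(prefix=prefix)
        }
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()
                deleted.append(name)
            except Exception as exc:  # keep going and report failures at the end
                failures[name] = exc
    return deleted, failures


# Hypothetical usage against a real bucket (requires credentials):
#   from google.cloud import storage
#   bucket = storage.Client().get_bucket('mybucket')
#   deleted, failures = delete_prefix_parallel(bucket, 'top_folder/sub_folder/')
```

This version queues one task per object up front, so for truly huge prefixes it would probably need to submit work in batches to keep memory bounded.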

jterrace
Brandon Yarbrough
  • My initial hunch was to list everything and then iterate and delete the objects matching the given path, but glad to know this option exists in the API (java-storage API in my case). Thank you :) – Sudip Bhandari Apr 08 '20 at 17:24
  • You could also use the `bucket.delete_blobs` method to delete all listed files with one network round trip, O(1), instead of O(n) separate `delete` calls. – user482594 Jun 23 '21 at 21:37
  • `bucket.delete_blobs` deletes one by one. https://googleapis.dev/python/storage/latest/buckets.html#google.cloud.storage.bucket.Bucket.delete_blobs @user482594 – Jey Aug 09 '21 at 08:27

Now it can be done by:

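@classmethod
def delete_folder(cls, bucket_name, folder_name):
    """Delete every object under folder_name."""
    # cls.storage_client is assumed to be a storage.Client() held on the enclosing class.
    bucket = cls.storage_client.get_bucket(bucket_name)
    blobs = list(bucket.list_blobs(prefix=folder_name))
    # delete_blobs issues one delete per blob, so the operation is not atomic.
    bucket.delete_blobs(blobs)
    print(f"Folder {folder_name} deleted.")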
BryanF

deleteFiles might be what you are looking for, or delete_blobs in Python. Assuming the two behave the same way, the Node docs do a better job of describing the behavior, namely:

This is not an atomic request. A delete attempt will be made for each file individually. Any one can fail, in which case only a portion of the files you intended to be deleted would have.

Operations are performed in parallel, up to 10 at once.
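
Because the request is not atomic, it can help to record which deletes did not go through. A minimal Python sketch, assuming `Bucket.delete_blobs` accepts an `on_error` callback that receives the blob whenever a delete raises NotFound (that is how the Python docs describe it; other exceptions still propagate). The function name and return value are illustrative only.

```python
def delete_folder(bucket, folder_name):
    """Delete all objects under folder_name via Bucket.delete_blobs.

    `bucket` is expected to be a google.cloud.storage Bucket (or look like one).
    Returns the names of objects that were already gone when their delete ran.
    """
    # Normalize to a single trailing slash so sibling prefixes do not match.
    prefix = folder_name.rstrip('/') + '/'
    blobs = list(bucket.list_blobs(prefix=prefix))
    missing = []
    # on_error is invoked with the blob for each delete that raises NotFound;
    # any other exception is raised and stops the loop part-way through.
    bucket.delete_blobs(blobs, on_error=lambda blob: missing.append(blob.name))
    return missing


# Hypothetical usage against a real bucket (requires credentials):
#   from google.cloud import storage
#   bucket = storage.Client().get_bucket('mybucket')
#   missing = delete_folder(bucket, 'top_folder/sub_folder')
```

Without `on_error`, the first missing object would raise and leave the remaining objects in place, which matches the "only a portion of the files" caveat quoted above.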