
I want to count the number of files in a GCS bucket that has a folder named myfilesKeeper. Assume the project name is PrName and the bucket name is TestFiles. How can I read the number of files present in that bucket using Python 3?

2 Answers


You can use the .list_blobs() method on your storage client to count the number of files in a bucket:

from google.cloud import storage

storage_client = storage.Client()
blobs_list = storage_client.list_blobs(bucket_or_name='name_of_your_bucket')

print(sum(1 for _ in blobs_list))

.list_blobs() returns an iterator, so this answer basically loops over the iterator and counts the elements.

If you only want to count the files within a certain folder in your bucket, you can use the prefix keyword:

blobs_list = storage_client.list_blobs(
    bucket_or_name='name_of_your_bucket',
    prefix='name_of_your_folder',
)
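
As a rough sketch, here is the same call applied to the names from the question (assuming the bucket is TestFiles and the folder prefix is myfilesKeeper/; the trailing slash is an assumption so that only objects inside that folder are matched):

from google.cloud import storage

storage_client = storage.Client()
# Count only the objects whose names start with the folder prefix.
blobs_list = storage_client.list_blobs(
    bucket_or_name='TestFiles',
    prefix='myfilesKeeper/',
)
print(sum(1 for _ in blobs_list))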
Sander van den Oord

There is no API that gives you the total number of files in Cloud Storage. You have to count the files one by one!

But there is a smarter way to speed up the count: count them per page. In the Cloud Storage objects list API, you can set a maxResults value per page. Set it to 1000 (the maximum).

When you receive the API result, check whether the response contains a nextPageToken attribute.

  • If so, add 1000 to your counter and immediately request the next page with a new API call that passes the nextPageToken
  • If not, you are on the last page. Count the items elements; in Python, len(result['items']) works well. Add that to the counter. (A sketch of this approach follows below.)
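
Here is a minimal sketch of that page-based counting, calling the Cloud Storage JSON API directly. The bucket name TestFiles and folder prefix myfilesKeeper/ are taken from the question, and the shortcut of adding 1000 per page assumes that every page carrying a nextPageToken is in fact full:

import google.auth
from google.auth.transport.requests import AuthorizedSession

# Authenticate with Application Default Credentials.
credentials, _ = google.auth.default()
session = AuthorizedSession(credentials)

url = "https://storage.googleapis.com/storage/v1/b/TestFiles/o"
params = {
    "maxResults": 1000,                     # maximum page size
    "prefix": "myfilesKeeper/",             # count only this folder
    "fields": "nextPageToken,items/name",   # keep responses small
}

count = 0
while True:
    result = session.get(url, params=params).json()
    if "nextPageToken" in result:
        # A full page: add maxResults and ask for the next page.
        count += 1000
        params["pageToken"] = result["nextPageToken"]
    else:
        # Last page: count the remaining items.
        count += len(result.get("items", []))
        break

print(count)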
guillaume blaquiere