2

I need to get the number of files in a GCS bucket. I don't want to use list_blobs to read them one by one and increment a counter. Is there some metadata we can query instead?

I need to download all the files in the bucket and process them. I now want to do this using threads, so I need to split the files into groups somehow. The idea was to use list_blobs with an offset and a size, but to do that I need to know the total number of files.

Any idea?

Thanks

Ido Barash

3 Answers

1

There's no way to do a single metadata query to get the count. You could run a command like:

gsutil ls gs://my-bucket/** | wc -l

but note that this command makes a number of bucket listing requests behind the scenes. That can take a long time if the bucket is large, and you are charged based on the number of operations it makes.

Mike Schwartz
1

I know the original question asked to avoid .list_blobs() for counting the files in a bucket, but I didn't find a different way, so I'm posting this here for reference. It does work:

from google.cloud import storage

storage_client = storage.Client()
blobs_list = storage_client.list_blobs(bucket_or_name='name_of_your_bucket')

print(sum(1 for _ in blobs_list))

.list_blobs() returns an iterator, so this answer basically loops over the iterator and counts the elements.
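As a hedged variation on the same counting loop, the sketch below wraps it in a helper and asks the API to return only object names through the `fields` argument of `list_blobs`, which should make each listing response smaller. The helper name `count_blobs` and its parameters are hypothetical, and the number of listing requests is unchanged; this is not a metadata query.

```python
def count_blobs(client, bucket_name, prefix=None):
    """Count objects by iterating the listing.

    `client` is assumed to be a google.cloud.storage.Client
    (or anything exposing a compatible list_blobs method).
    """
    blobs = client.list_blobs(
        bucket_name,
        prefix=prefix,
        # Request only object names plus the paging token,
        # so each listing page carries less data.
        fields="items(name),nextPageToken",
    )
    # The listing is still walked page by page; we just count the items.
    return sum(1 for _ in blobs)
```

With the client from the snippet above, this would be called as `count_blobs(storage_client, 'name_of_your_bucket')`.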

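If you only want to count the files within a certain folder in your bucket, you can use the prefix keyword. The prefix is matched against the start of each object name, so end it with a slash if you don't want to pick up names such as 'name_of_your_folder_2' as well:

blobs_list = storage_client.list_blobs(
    bucket_or_name='name_of_your_bucket',
    prefix='name_of_your_folder/',  # trailing slash limits the match to this folder
)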

FYI: this question suggests a different method to solve this:
How can I get number of files from gs bucket using python
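For the goal described in the question (downloading and processing every file with threads), knowing the total up front may not be necessary: the listing iterator can be fed straight into a thread pool. The following is only a minimal sketch under that assumption. The names `process_bucket`, `process`, and `max_workers` are hypothetical, `client` is assumed to be a `storage.Client()`, and sharing one client across threads is assumed to be acceptable for your workload.

```python
from concurrent.futures import ThreadPoolExecutor


def process_bucket(client, bucket_name, process, max_workers=8):
    """Download every object and apply `process` to its bytes.

    Returns a dict mapping object name to the result of `process`.
    """

    def handle(blob):
        # Each worker thread downloads one object and processes it.
        return blob.name, process(blob.download_as_bytes())

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map pulls blobs from the listing iterator and spreads
        # them over the workers; no file count is needed beforehand.
        return dict(pool.map(handle, client.list_blobs(bucket_name)))
```

For example, `process_bucket(storage_client, 'name_of_your_bucket', len)` would return the size in bytes of each object, keyed by name. For very large objects or buckets, holding every result in memory may not be appropriate.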

Sander van den Oord
0

For those looking for a command-line answer, you can use:

gsutil du gs://pub | wc -l

Answering here as this was the first link I got when I searched.

References:
https://stackoverflow.com/a/18986955/6733421
https://cloud.google.com/storage/docs/gsutil/commands/du

Sai Chander