
List top level folders in GCP GCS from Cloud Function bucket API?

I have a GCS bucket that has objects like...

myfile.pdf
myimg.png
folder001/stuff/<some files or deep folders>
folder002/<some files or deep folders>
.
.
.
someOtherFolderName00n/<some files or deep folders>

... and just want to get the list of top level folders folder001, ..., someOtherFolderName00n.

I have a snippet of code in GCP's Cloud Functions using the Bucket API that looks like...

const admin = require('firebase-admin');
admin.initializeApp();
const sourceBucket = admin.storage().bucket("test_source_001");
exports.my_function = async (event, context) => {
    // get top level bucket folders
    const [sourceFiles] = await sourceBucket.getFiles({
        prefix: '',
        delimiter: '/'
    });

    // extract name property from each object
    const sourceFileNames = sourceFiles.map((file) => file.name);

    console.log(sourceFileNames);
};

... but this ends up listing everything in the bucket rather than just the top-level directories (including the top-level files, which don't even have a trailing '/'), so I get a list like

myfile.pdf
myimg.png
folder001/stuff/
folder001/stuff/file1
...
folder001/stuff/fileN
folder002/file1
...
folder002/fileN
...
someOtherFolderName00n/file1
...
someOtherFolderName00n/fileN

I think I could just do something like...

const s = new Set();
for (const f of sourceFileNames) {
    // skip top-level files, whose names contain no '/'
    if (f.includes('/')) {
        s.add(f.split('/')[0]);
    }
}

... but is there any way to just have the getFiles query return top level folders in the first place? (New to using GCP and Cloud Functions, so wonder if I'm just missing something simple here).

lampShadesDrifter
  • The folders are in `apiResponse.prefixes`. You will need to extend your code: `bucket.getFiles({autoPaginate: false, delimiter: '/'}, function(err, files, nextQuery, apiResponse) {})` – John Hanley Nov 29 '22 at 05:41
  • @JohnHanley Sorry, but I don't really understand this answer you gave, I've never used GCF before and there is nothing in the docs (https://cloud.google.com/nodejs/docs/reference/storage/latest/storage/getfilescallback) about how these callbacks work. – lampShadesDrifter Dec 09 '22 at 06:53
  • I did not post an answer, just a comment so that you know what to look into or search for in the documentation. – John Hanley Dec 09 '22 at 07:06

1 Answer

You can specify a prefix for the required path in the options passed to getFiles.

Prefixes and delimiters can be used to emulate directory listings. Prefixes can be used to filter objects starting with a prefix. The delimiter argument can be used to restrict the results to only the objects in the given "directory". Without the delimiter, the entire tree under the prefix is returned.
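
To make the prefix/delimiter behaviour described above concrete, here is a minimal local sketch that emulates it over a plain array of object names. It does not call GCS at all; `emulateListing` is a hypothetical helper written for illustration, and it only approximates how the service groups names.

```javascript
// Hypothetical helper (not part of any GCS library): emulates how a
// prefix and delimiter split a flat list of object names into
// "items" (objects directly under the prefix) and "prefixes"
// (the emulated sub-directories).
function emulateListing(names, prefix = '', delimiter = '') {
    const items = [];
    const prefixes = new Set();
    for (const name of names) {
        // objects outside the prefix are filtered out entirely
        if (!name.startsWith(prefix)) continue;
        const rest = name.slice(prefix.length);
        const cut = delimiter ? rest.indexOf(delimiter) : -1;
        if (cut === -1) {
            // no delimiter after the prefix: a direct child object
            items.push(name);
        } else {
            // roll everything deeper up into one "directory" prefix
            prefixes.add(prefix + rest.slice(0, cut + delimiter.length));
        }
    }
    return { items, prefixes: [...prefixes] };
}

const names = [
    'myfile.pdf',
    'myimg.png',
    'folder001/stuff/file1',
    'folder002/file1',
];
console.log(emulateListing(names, '', '/'));
// items: ['myfile.pdf', 'myimg.png'], prefixes: ['folder001/', 'folder002/']
```

Note that in this model the "folders" show up under `prefixes`, not among the returned objects, which is why mapping over the returned files alone does not yield them.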

If your folder names share a common stem, you can try narrowing the listing by setting the prefix to that stem, like this:

const [sourceFiles] = await sourceBucket.getFiles({
    prefix: 'folder',
    delimiter: '/'
});

For more information refer to this document on how to specify prefixes
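
When the folder names have no common stem, the approach from John Hanley's comment on the question may be a better fit: read the `prefixes` field of the raw API response instead of the returned files. The following is a sketch only, assuming `bucket` is a `@google-cloud/storage` Bucket (such as `admin.storage().bucket("test_source_001")` from the question) and that `getFiles` with `autoPaginate: false` resolves to `[files, nextQuery, apiResponse]`. The function name `listTopLevelFolders` is made up for illustration.

```javascript
// Sketch: collect top-level "folder" names from a bucket listing.
// Assumption: `bucket` behaves like a @google-cloud/storage Bucket,
// e.g. admin.storage().bucket("test_source_001").
async function listTopLevelFolders(bucket) {
    const folders = new Set();
    // autoPaginate: false exposes the raw API response for each page
    let query = { prefix: '', delimiter: '/', autoPaginate: false };
    while (query) {
        const [, nextQuery, apiResponse] = await bucket.getFiles(query);
        const prefixes = (apiResponse && apiResponse.prefixes) || [];
        for (const p of prefixes) {
            // prefixes look like 'folder001/'; drop the trailing slash
            folders.add(p.replace(/\/$/, ''));
        }
        // nextQuery is assumed to be empty once there are no more pages
        query = nextQuery;
    }
    return [...folders];
}

// Possible usage inside the Cloud Function from the question:
//   const folders = await listTopLevelFolders(sourceBucket);
//   console.log(folders);
```

Looping on `nextQuery` is there because a single page may not contain every prefix in a large bucket.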

Sathi Aiswarya
  • Note that the example in the question is just a simplified example. In reality, the top level folders have many different names without any unique pattern that distinguishes them from files (which are also given generic names only for the purposes of illustrating the problem in the question). I've updated the question a bit to try to make this clearer. – lampShadesDrifter Dec 02 '22 at 20:37
  • Check the examples in these Stack Overflow threads: [thread1](https://stackoverflow.com/a/58903254/18265638) & [thread2](https://stackoverflow.com/questions/59526251/how-do-i-list-all-the-top-level-folders-in-given-gcs-bucket/69628207#69628207) – Sathi Aiswarya Dec 05 '22 at 09:07