Is there a way to automatically create folders in GCS buckets based on a list of filenames?

Question

I have a list of filenames that correspond to the labels of the images I want to be able to predict. I would like to automatically create a folder for each label so that I can move the correct files to the correct folder later.

To facilitate this, especially given the huge number of possible folders, I want to create those folders based on the identifier.

For instance, I have the following list of labels: 4354634 354545 4335435 112121 4865633 ....

The goal is to run through the list and create a folder in a bucket for each label:

gs://some-bucket-name/4354634/

gs://some-bucket-name/354545/

gs://some-bucket-name/4335435/

gs://some-bucket-name/112121/

gs://some-bucket-name/4865633/

gs://some-bucket-name/.../

I tried the following code, but it only printed output in the notebook and did not create any folders:

def sku_to_bucket(label_id):
    bucket = client.get_bucket('some-bucket')
    d = str(label_id) + '/'
    d = bucket.blob(d)

import pandas as pd
loop_sub = pd.read_csv("loopfile.csv")

for label_id in loop_sub.iterrows() :
    sku_to_bucket(label_id)
    print(str(label_id))

output below

(0, label_id 63453654635, Name: 0, dtype = int64)

The expected result is a folder structure based on the label_ids in a Google Cloud Storage bucket.

score 2 · Accepted Answer · answered Oct 10 '19 at 10:58

There is no notion of a "directory" in Cloud Storage: a blob/object is always a file, and a "folder" is only implied by a '/' in the object name. A workaround is to upload a dummy object whose name ends in '/', which is then shown as a folder. Please check the following related question link.
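To illustrate the point about object names, here is a small pure-Python sketch (no Cloud Storage calls) of how a flat list of names can be viewed as folders. It assumes a standard flat-namespace bucket; the function name and the sample file names are made up for the example.

```python
from collections import defaultdict


def group_by_top_level_prefix(object_names):
    """Group flat object names by the text before the first '/'."""
    folders = defaultdict(list)
    for name in object_names:
        prefix, sep, rest = name.partition("/")
        if sep:  # names without '/' sit at the bucket root, not in a "folder"
            folders[prefix + "/"].append(rest)
    return dict(folders)


# Hypothetical object names: one placeholder object and a few image files.
names = ["4354634/", "4354634/img_001.jpg", "354545/img_002.jpg", "readme.txt"]
grouped = group_by_top_level_prefix(names)
print(grouped)
```

Note that "354545/" shows up as a folder here even though no placeholder object exists for it, so the dummy object is mainly useful when the folder should be visible before any file is moved into it.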

This is the code that I used:

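import pandas as pd
from google.cloud import storage

storage_client = storage.Client()

# names=['val'] assumes the CSV has no header row
loop_sub = pd.read_csv("loopfile.csv", names=['val'])
bucket = storage_client.get_bucket('some-bucket-name')
file = 'temp'  # content written into each dummy object

for label_id in loop_sub['val']:
    # an object whose name ends in '/' is displayed as a folder
    blob = bucket.blob(str(label_id) + '/')
    blob.upload_from_string(file)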
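As for why the code in the question created nothing: bucket.blob() only builds a local handle and sends no request until something is uploaded, and iterrows() yields (index, row) tuples rather than the label values. The loop above can also be wrapped in a small helper, sketched below. The helper name is hypothetical, and it takes any object exposing blob(name).upload_from_string(data), which is an assumption made so the logic can be tried without credentials.

```python
def create_folder_placeholders(bucket, label_ids):
    """Upload one empty object named '<label_id>/' per label and return the names."""
    created = []
    for label_id in label_ids:
        name = f"{label_id}/"
        # The upload call is what actually creates the object in the bucket;
        # bucket.blob(name) on its own does not.
        bucket.blob(name).upload_from_string("")
        created.append(name)
    return created


# Usage sketch (needs google-cloud-storage and credentials, so it is left commented out):
# from google.cloud import storage
# bucket = storage.Client().get_bucket("some-bucket-name")
# create_folder_placeholders(bucket, loop_sub["val"])
```

Iterating over the column (loop_sub["val"]) instead of loop_sub.iterrows() passes the plain label values, so each object name is just the identifier followed by '/'.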
