
I have a list of filenames that correspond to the labels of the images I want to predict. I would like to create one folder per label automatically, so that I can later move each file into the correct folder.

Because the number of possible folders is huge, I want to create those folders in a bucket based on the label identifier.

For instance, I have the following list of labels: 4354634, 354545, 4335435, 112121, 4865633, ....

The goal is to run through the list and create a folder in a bucket for each label:

gs://some-bucket-name/4354634/

gs://some-bucket-name/354545/

gs://some-bucket-name/4335435/

gs://some-bucket-name/112121/

gs://some-bucket-name/4865633/

gs://some-bucket-name/.../

I tried the following code, but it only printed output in the notebook and did not create any folders:

def sku_to_bucket(label_id):
    bucket = client.get_bucket('some-bucket')
    d = str(label_id) + '/'
    d = bucket.blob(d)

import pandas as pd
loop_sub = pd.read_csv("loopfile.csv")

for label_id in loop_sub.iterrows() :
    sku_to_bucket(label_id)
    print(str(label_id))

Output:


(0, label_id 63453654635, Name: 0, dtype = int64)

The expected result is a folder structure in a Google Cloud Storage bucket, with one folder per label_id.

smikkels

1 Answer


Cloud Storage has no notion of a "directory": a blob/object is always a file. A workaround is to upload a dummy object whose name ends in '/', which the console then displays as a folder. Please check the following related question link.
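To make the flat namespace concrete, here is a minimal sketch in plain Python with no Cloud Storage calls. It derives the "folders" a browser-style view would show from a list of object names. The function name top_level_folders and the sample names img_001.jpg and readme.txt are made up for illustration.

```python
def top_level_folders(object_names):
    """Return the top-level 'folders' implied by flat object names.

    A name such as '354545/img_001.jpg' is a single object; the part
    before the first '/' is only a prefix that tools display as a folder.
    """
    folders = set()
    for name in object_names:
        if "/" in name:
            # Keep the first path segment, with its trailing slash.
            folders.add(name.split("/", 1)[0] + "/")
    return sorted(folders)


# Hypothetical object names: a placeholder, a nested file, a root-level file.
names = ["4354634/", "354545/img_001.jpg", "readme.txt"]
print(top_level_folders(names))
```

Note that the placeholder object '4354634/' and the ordinary object '354545/img_001.jpg' both produce a visible folder, which is why uploading a placeholder is optional once real files are copied under the prefix.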

This is the code that I used:

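import pandas as pd
from google.cloud import storage

# Assumes default credentials are configured for the client.
storage_client = storage.Client()

# names= treats every row as data and labels the single column 'val'.
loop_sub = pd.read_csv("loopfile.csv", names=['val'])
bucket = storage_client.get_bucket('some-bucket-name')
file = 'temp'  # dummy content for each placeholder object

for label_id in loop_sub['val']:
    # bucket.blob() only builds a local reference; the upload creates the object.
    blob = bucket.blob(str(label_id) + '/')
    blob.upload_from_string(file)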
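
The same loop can be wrapped in a small helper, sketched below under a few assumptions. The helper name create_folder_placeholders is hypothetical, and it uploads an empty string instead of 'temp' so that each placeholder is a zero-byte object. It accepts any object that offers blob(name) returning something with upload_from_string(data), which is what a google-cloud-storage bucket provides. The FakeBucket and FakeBlob classes are stand-ins so the sketch runs without credentials.

```python
def create_folder_placeholders(bucket, label_ids):
    """Upload one empty object named '<label_id>/' per unique label.

    Returns the list of object names that were uploaded, in input order.
    """
    created = []
    # dict.fromkeys removes duplicates while preserving order.
    for label_id in dict.fromkeys(str(raw).strip() for raw in label_ids):
        if not label_id:
            continue  # skip blank entries
        name = label_id + "/"
        bucket.blob(name).upload_from_string("")
        created.append(name)
    return created


class FakeBlob:
    """Stand-in for a storage blob; records uploads in a shared dict."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def upload_from_string(self, data):
        self._store[self.name] = data


class FakeBucket:
    """Stand-in for a storage bucket, for running the sketch offline."""

    def __init__(self):
        self.uploaded = {}

    def blob(self, name):
        return FakeBlob(name, self.uploaded)


fake = FakeBucket()
result = create_folder_placeholders(fake, [4354634, "354545", 4354634, " "])
print(result)
```

With a real client, the bucket returned by storage_client.get_bucket('some-bucket-name') would be passed in place of the fake, and loop_sub['val'] in place of the sample list.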
marian.vladoi