I'm in an introductory neural networks class, so please excuse my ignorance.

I have a folder containing roughly 12,000 texture images, divided into ten different subsets. An example file name would be bubbly_0012.png. Some file names begin with the same first letter, for example, bubbly_0012.png and blotchy_0012.png.

I'm trying to create a .csv file containing arrays of each of the images. I want to label each image according to its subset (and therefore its name), so that bubbly is given the label 0, blotchy is given the label 1, and so on.

I found that I'm able to do this with the first letter of each file name using this line:

if (file[0]) == "b":
     name_array = [[0]]

However, this becomes an issue when I try to label subsets that begin with the same letter. For the blotchy subset, I tried the following:

if (file[0:1]) == "bl":
     name_array = [[0]]

But this didn't work.

Any advice would be greatly appreciated.

qiki
  • if you know all labels in advance, using a dictionary of string label to class would be the trivial solution – Marat Oct 11 '19 at 01:39
  • You want to group image file names? Do you want to group by the first letter of the file name or by the first word of the file? (i.e. bubbly_0001, bubbly_0002 and bubbly_0003 grouped together and blotchy_0001, blotchy_0002, blotchy_0003 grouped together) – Iain Shelvington Oct 11 '19 at 01:42
  • For more information on slicing strings/lists/etc, see [this answer](https://stackoverflow.com/a/509295/7675174). – import random Oct 11 '19 at 05:54

1 Answer

In slices and ranges, the start index is inclusive and the end index is exclusive. So file[0:1] returns only the first character ("b"), which is why the comparison with "bl" never matches. To include the second character, use file[0:2], which returns the characters at index 0 and index 1. However, if you already know the subset names, it is better to look the label up by name than to compare prefixes.
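To make the half-open slice behaviour concrete, here is a small check. The file name is a made-up example following the pattern in the question, and `str.startswith` is shown only as one possible alternative to counting indices by hand.

```python
# Hypothetical file name following the pattern described in the question
file = "blotchy_0012.png"

first_one = file[0:1]   # characters from index 0 up to, but not including, 1
first_two = file[0:2]   # characters at index 0 and index 1

print(first_one)  # b
print(first_two)  # bl

# str.startswith avoids counting indices by hand
is_blotchy = file.startswith("bl")
print(is_blotchy)  # True
```

The lookup-based approach for known subset names follows.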

# Assumes `file` is a file name such as "bubbly_0012.png";
# the subset name is the part before the underscore
subset = file.split("_")[0]

# Option 1: a dict mapping each subset name to its label
my_dict_of_files_and_indexes = {"bubbly": 0, "blotchy": 1}

# Option 2: if we fill in the names manually, in label order,
# we can use a list and take each name's position as its label
my_list_of_file_names = ["bubbly", "blotchy"]

# for the list, which is already in the correct order
if subset in my_list_of_file_names:
    index_of_file = my_list_of_file_names.index(subset)
    # do stuff with the file and index

# for the dict, where the order of the entries doesn't matter
if subset in my_dict_of_files_and_indexes:
    index = my_dict_of_files_and_indexes[subset]
    # do stuff with the file and index
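
Putting the pieces together, a minimal sketch of the labelling step might look like the following. The names `SUBSET_NAMES`, `LABELS`, and `label_for` are hypothetical, only two of the ten subsets are listed, and it assumes every image file name has the form subset_number.png as in the question.

```python
# Hypothetical subset names; extend this list to all ten subsets.
SUBSET_NAMES = ["bubbly", "blotchy"]

# Map each name to its position in the list: {"bubbly": 0, "blotchy": 1}
LABELS = {name: i for i, name in enumerate(SUBSET_NAMES)}


def label_for(file_name):
    """Return the integer label for a file name, or None if the subset is unknown."""
    subset = file_name.split("_")[0]  # text before the first underscore
    return LABELS.get(subset)


for file_name in ["bubbly_0012.png", "blotchy_0012.png", "notes.txt"]:
    print(file_name, label_for(file_name))
```

Returning None for unknown names lets the caller skip stray files in the folder instead of raising an error; whether that is the right behaviour depends on how the .csv is built.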
Marat
TheLazyScripter