I am using a Jupyter notebook in Google Colab. My training dataset looks like this:
/data/label1/img1.jpeg
.
.
.
/data/label2/img90.jpeg
I want to import this dataset. Here is what I have tried so far:
Step 1:
!pip install -U -q PyDrive
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
from os import walk
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
Step 2:
# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
Step 3:
file_to_download = os.path.expanduser('./data/')
file_list = drive.ListFile(
    {'q': 'id_of_the_data_directory'})
I am not sure how to proceed from here. The folder data is my Colab notebook folder in Drive. I want to read the images along with their labels. To do that, I am using the following code:
import tensorflow as tf  # TensorFlow 1.x queue-based input API

filename_queue = tf.train.string_input_producer(
    tf.train.match_filenames_once('data/*/*.jpeg'))
image_reader = tf.WholeFileReader()
key, image_file = image_reader.read(filename_queue)
# key is the entire path to the JPEG file; we need only the subfolder name as the label
S = tf.string_split([key], '/')
length = tf.cast(S.dense_shape[1], tf.int32)
label = S.values[length - tf.constant(2, dtype=tf.int32)]
# note: this conversion only works if the subfolder names are numeric
label = tf.string_to_number(label, out_type=tf.int32)
# decode the image
image = tf.image.decode_jpeg(image_file)
# then code to place the images and labels in corresponding arrays
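To make the last step concrete, here is a rough plain-Python sketch of the result I am after: two parallel lists, one of image paths and one of labels taken from the parent folder name. It assumes the data folder is already available on the local filesystem of the notebook, which is exactly the part I have not solved. The function names collect_labeled_images and encode_labels are made up for illustration. Mapping folder names to integer indices is also my own assumption, added because names such as label1 are not numeric.

```python
import glob
import os


def collect_labeled_images(root, pattern="*/*.jpeg"):
    """Return parallel lists (paths, labels) for the images under root.

    The label of each image is the name of its immediate parent folder,
    e.g. root/label1/img1.jpeg -> "label1".
    """
    # Sort so that the order is reproducible between runs.
    paths = sorted(glob.glob(os.path.join(root, pattern)))
    labels = [os.path.basename(os.path.dirname(p)) for p in paths]
    return paths, labels


def encode_labels(labels):
    """Map each distinct label name to an integer index (sorted by name)."""
    names = sorted(set(labels))
    index = {name: i for i, name in enumerate(names)}
    return [index[name] for name in labels], names
```

With the layout above, I would expect to call it as paths, labels = collect_labeled_images('data') and then label_ids, class_names = encode_labels(labels).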