14

I have a dataset of images on my Google Drive. I have this dataset both in a compressed .zip version and an uncompressed folder.

I want to train a CNN using Google Colab. How can I tell Colab where the images in my Google Drive are?

  1. The official tutorial does not help me, as it only shows how to upload single files, not a folder with 10,000 images as in my case.

  2. Then I found this answer, but the solution is not finished, or at least I did not understand how to continue after unzipping. Unfortunately, I am unable to comment on that answer because I don't have enough Stack Overflow reputation.

  3. I also found this thread, but all the answers there use other tools, such as GitHub or Dropbox.

I hope someone can explain what I need to do, or tell me where to find help.

Edit1:

I have found yet another thread asking the same question as mine. Sadly, of its three answers, two refer to Kaggle, which I don't know and don't use. The third answer provides two links: the first points to the third thread I linked above, and the second only explains how to upload single files manually.

charelf

4 Answers

11

To update the answer: you can now do this directly from Google Colab.

# Load the Drive helper and mount
from google.colab import drive

# This will prompt for authorization.
drive.mount('/content/drive')

!ls "/content/drive/My Drive"

Google Documentation
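
Once the Drive is mounted, the zipped dataset can be extracted onto the Colab machine's local disk, which may be faster to read from than the mounted Drive when there are thousands of small files. The following is only a minimal sketch: the helper name `extract_dataset` and the paths in the usage comment are assumptions for illustration, not part of the original answer.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract the archive at zip_path into dest_dir.

    Returns the number of files (not directories) in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Directory entries are skipped so that only real files are counted.
        return sum(1 for info in archive.infolist() if not info.is_dir())


# Hypothetical usage in Colab, after drive.mount('/content/drive'):
# n = extract_dataset('/content/drive/My Drive/dataset.zip', '/content/dataset')
# print(n, 'files extracted')
```

The file name `dataset.zip` and the target folder `/content/dataset` are placeholders; substitute the actual location of your archive.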

7

As mentioned by @yl_low here

Step 1:

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

Step 2:

from google.colab import auth
auth.authenticate_user()

Step 3:

from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Both Step 2 and Step 3 require you to enter the verification code provided by the URLs.

Step 4:

!mkdir -p drive
!google-drive-ocamlfuse drive

Step 5:

print('Files in Drive:')
!ls drive/
VeilEclipse
  • Thank you for your answer. It was useful for me in the beginning, but I unaccepted it in order to accept a more up-to-date version, which is more useful for people having this problem. – charelf Feb 03 '19 at 08:35
5

The other answers are excellent, but they require you to authenticate with Google Drive every time, which is inconvenient if you want to run your notebook from top to bottom.

I had the same need: I wanted to download a single zip file containing a dataset from Drive to Colab. I preferred to get a shareable link to that file and run the following cell (substitute drive_url with your shared link):

import urllib.request  # a bare "import urllib" does not reliably expose urllib.request

drive_url = 'https://drive.google.com/uc?export=download&id=1fBVMX66SlvrYa0oIau1lxt1_Vy-XYZWG'
file_name = 'downloaded.zip'

urllib.request.urlretrieve(drive_url, file_name)
print('Download completed!')
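
The link that Drive's "Share" dialog gives you usually does not have the `uc?export=download&id=...` form used above. As a rough sketch, the file ID can be pulled out of the share link and placed into that form. The helper name `direct_download_url` is hypothetical, and the two link shapes it recognizes (`/file/d/<id>/...` and `...?id=<id>`) are assumptions about how share links commonly look. It also assumes the file is shared so that anyone with the link can access it; for large files, Drive may return a confirmation page instead of the file itself.

```python
import re


def direct_download_url(share_link):
    """Convert a Drive share link into the uc?export=download form.

    Raises ValueError if no file ID can be found in the link.
    """
    # Try the /file/d/<id>/ shape first, then fall back to an id= query parameter.
    match = (re.search(r"/file/d/([\w-]+)", share_link)
             or re.search(r"[?&]id=([\w-]+)", share_link))
    if match is None:
        raise ValueError("no Drive file ID found in: " + share_link)
    return "https://drive.google.com/uc?export=download&id=" + match.group(1)


# Example with the file ID from the cell above:
# drive_url = direct_download_url(
#     'https://drive.google.com/file/d/1fBVMX66SlvrYa0oIau1lxt1_Vy-XYZWG/view?usp=sharing')
```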
RomRoc
2

I tried all of the above, but none of it worked for me. So here is a simple solution, with a simple explanation, that can help you load a .zip image folder and extract the images from it.

  • Connect to Google Drive
    from google.colab import drive
    drive.mount('/content/drive')
    

(You will get a link: sign in to your Google account, copy the authorization code, and paste it into the prompt in Colab.)

  • Install and import the Keras library
    !pip install -q keras
    import keras
    
    

(The zip file on your Drive is now accessible from Colab.)

  • Unzip the folder
    ! unzip 'zip-file-path'
    

To get the path:

  • open the Files pane on the left side of Google Colab
  • browse to the file and click the three dots next to it
  • choose "Copy path"

The unzipped image folder is now available in your Colab session; use it as you wish.
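
As one possible next step, the extracted images can be gathered into (path, label) pairs before being handed to a training pipeline. This sketch assumes a folder-per-class layout (for example `dataset/cats/...` and `dataset/dogs/...`), which the original post does not specify; the helper name `list_labeled_images` is hypothetical.

```python
from pathlib import Path

# File extensions treated as images in this sketch (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root_dir):
    """Return (path, label) pairs for every image under root_dir.

    The label is the name of the folder that directly contains the image.
    """
    pairs = []
    for path in sorted(Path(root_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            pairs.append((path, path.parent.name))
    return pairs


# Hypothetical usage after unzipping into /content/dataset:
# samples = list_labeled_images('/content/dataset')
# print(len(samples), 'labeled images')
```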

Neha