3

I need to upload a dataset of images to Google Colaboratory. The dataset has subfolders that contain the images. Everything I found online covers only a single file.

from google.colab import files

uploaded = files.upload()

Is there any way to do it?

korakot
Karan Purohit
  • I believe this question and its answer should help: https://stackoverflow.com/questions/47320052/load-local-data-files-to-colaboratory – Ben P May 25 '18 at 09:15
  • All of these answers are about CSV files. I have a folder of images.... – Karan Purohit May 25 '18 at 09:23

2 Answers

2

For uploading data to Colab, there are four methods.

Method 1

You can upload a file or directory directly in the Colab UI.

The data is saved on the Colab local machine. In my experiments I noticed three things: 1) The upload speed is good. 2) A zip archive keeps its directory structure, but it is not unzipped automatically. You need to run this code in a Colab cell:

!mkdir -p {dir_name}
!unzip {zip_file} -d {dir_name}

3) Most importantly, when Colab crashes, the data is deleted.
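
If you would rather stay in Python than use shell commands, the same unzip step can be sketched with the standard library. This is only a minimal sketch: `extract_dataset` and `IMAGE_SUFFIXES` are names I made up for illustration, and the list of image extensions is an assumption you should adjust to your data.

```python
import zipfile
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the image paths found, sorted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall recreates the subfolders stored inside the archive.
        archive.extractall(dest)
    return sorted(
        p for p in dest.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Calling something like `extract_dataset('images.zip', 'data')` (both names are placeholders) should leave the subfolder layout intact under `data` and give you a list of image paths to feed into your loader.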

Method 2

Run this code in a Colab cell:

from google.colab import files
uploaded = files.upload()

In my experiments, running the cell displays an upload button; choose a file while the cell's execution indicator is still running. 1) After execution, the file name appears in the output panel. 2) Refresh the Colab file browser and you will see the file. 3) Or run !ls, and you should see your file. If not, the upload did not succeed.
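
Since the upload dialog only takes files, a folder of images has to be zipped on your own machine first. `files.upload()` returns a dictionary that maps each file name to its bytes, so the archive can be unpacked straight from that dictionary. A rough sketch, where `unpack_uploaded` is a hypothetical helper name and not part of the Colab API:

```python
import io
import zipfile
from pathlib import Path


def unpack_uploaded(uploaded, dest_dir):
    """Write uploaded files under dest_dir, extracting any zip archives.

    `uploaded` is a dict of {file name: bytes}, the shape files.upload() returns.
    Returns the sorted relative paths of every file now under dest_dir.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name, data in uploaded.items():
        buffer = io.BytesIO(data)
        if zipfile.is_zipfile(buffer):
            # Zip archives are extracted, which recreates their subfolders.
            with zipfile.ZipFile(buffer) as archive:
                archive.extractall(dest)
        else:
            # Anything else is written as a plain file.
            (dest / Path(name).name).write_bytes(data)
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )
```

In a Colab cell this would be used as `unpack_uploaded(files.upload(), 'data')`, with `data` being whatever folder name you prefer.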

Method 3

If your data is from Kaggle, you can use the Kaggle API to download it to the Colab local directory.

Method 4

Upload the data to Google Drive, using either 1) the Google Drive web interface or 2) the Drive API (https://developers.google.com/drive/api/v3/quickstart/python). To access the Drive data, use the following code in Colab:

from google.colab import drive
drive.mount('/content/drive')

I would recommend uploading data to Google Drive because it is permanent.
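
Once Drive is mounted, the dataset folder behaves like an ordinary directory, so it can be copied, subfolders included, onto the Colab local disk. Reading many small images directly from the mounted Drive can be slow, which is why a local copy may help. A minimal sketch, assuming Python 3.8+ (for `dirs_exist_ok`); `copy_dataset` is a name I chose for illustration:

```python
import shutil
from pathlib import Path


def copy_dataset(src_dir, dst_dir):
    """Copy src_dir and all its subfolders to dst_dir.

    Returns the number of files under dst_dir after the copy.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {src}")
    # dirs_exist_ok lets the cell be re-run without failing on existing folders.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sum(1 for p in dst.rglob("*") if p.is_file())
```

For example, `copy_dataset('/content/drive/MyDrive/my_dataset', '/content/my_dataset')`, where both paths are placeholders for your own folder names.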

Shark Deng
0

You need to copy your dataset into Google Drive, then obtain the DATA_FOLDER_ID. The easiest way to do so is to open the folder in Google Drive and copy the last part of the URL. For example, the folder ID for the link:

https://drive.google.com/drive/folders/xxxxxxxxxxxxxxxxxxxxxxxx is xxxxxxxxxxxxxxxxxxxxxxxx

Then you can create local folders and download each file into them recursively.

DATA_FOLDER_ID = 'xxxxxxxxxxxxxxxxxxxxxxxx'
ROOT_PATH = '~/you_path'
!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Choose a local (Colab) directory to store the data.
local_root_path = os.path.expanduser(ROOT_PATH)
os.makedirs(local_root_path, exist_ok=True)

def ListFolder(google_drive_id, destination):
  # List everything directly inside this Drive folder, excluding trashed items.
  file_list = drive.ListFile({'q': "'%s' in parents and trashed=false" % google_drive_id}).GetList()
  counter = 0
  for f in file_list:
    # If it is a folder, create the local directory and recurse into it.
    if f['mimeType'] == 'application/vnd.google-apps.folder':
      folder_path = os.path.join(destination, f['title'])
      os.makedirs(folder_path, exist_ok=True)
      print('creating directory {}'.format(folder_path))
      ListFolder(f['id'], folder_path)
    else:
      # Otherwise download the file from Drive into the local directory.
      fname = os.path.join(destination, f['title'])
      f_ = drive.CreateFile({'id': f['id']})
      f_.GetContentFile(fname)
      counter += 1
  print('{} files were downloaded to {}'.format(counter, destination))

ListFolder(DATA_FOLDER_ID, local_root_path)
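
To check what actually landed on disk, a small walk over the target directory can help. Note that `~` is expanded to the runtime's home directory (typically `/root` on Colab, as far as I know), which is not the `/content` folder the file browser opens by default. A sketch, with `count_files` being a hypothetical helper name:

```python
import os


def count_files(root):
    """Return {relative directory: number of files} for each directory under root."""
    root = os.path.expanduser(root)
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # "." stands for the root directory itself.
        counts[os.path.relpath(dirpath, root)] = len(filenames)
    return counts
```

Running `count_files(ROOT_PATH)` after the script finishes should show one entry per subfolder; an empty dictionary means the directory does not exist at the expanded path.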
Ali
  • I ran the above script and it shows that the files were uploaded, but when I go to that folder it's empty. – Karan Purohit May 26 '18 at 05:22
  • It works for me. How do you go to that folder? What is the output of `!ls you_path` – Ali May 29 '18 at 17:54
  • When I run the line "from google.colab import auth" I get this error: RuntimeError: 'path' must be None or a list, not . I tried pip install google and I keep getting the same error. – anitasp Jun 04 '18 at 20:46