I am fairly new to using Google's Colab as my go-to tool for ML.

In my experiments, I have to use the 'notMNIST' dataset, and I have saved it as notMNIST.pickle in my Google Drive, in a folder called Data.

Having said this, I want to access this '.pickle' file in my Google Colab so that I can use this data.

Is there a way I can access it?

I have read the documentation and some questions on StackOverflow, but they only cover uploading and downloading files and/or working with 'Sheets'.

However, what I want is to load the notMNIST.pickle file in the environment and use it for further processing.

Any help will be appreciated.

Thanks!

Adhish Thite

4 Answers

You can try the following:

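import pickle
from google.colab import drive

# Mount Google Drive into the Colab VM's filesystem (asks for authorization).
drive.mount('/content/drive')
# Files in your own Drive appear under "MyDrive" ("My Drive" in older runtimes).
DATA_PATH = "/content/drive/MyDrive/Data"
with open(DATA_PATH + '/notMNIST.pickle', 'rb') as infile:
    dataset = pickle.load(infile)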
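If the load fails, the usual cause is a wrong path rather than the pickle itself. Here is a minimal sketch of a loader that checks the path first. The helper name `load_pickle` and the sample dictionary are my own placeholders, and a temporary directory stands in for the mounted Drive folder so the snippet also runs outside Colab.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load a pickle file, failing with a clear message if the path is wrong."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"No file at {path}; is Drive mounted and the folder name correct?"
        )
    with path.open("rb") as infile:
        return pickle.load(infile)


# Stand-in for the mounted Drive folder so the sketch runs anywhere.
# In Colab the folder would be something like /content/drive/MyDrive/Data.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "Data"
    data_dir.mkdir()
    sample = {"train_labels": [0, 1, 2]}  # placeholder, not real notMNIST data
    with (data_dir / "notMNIST.pickle").open("wb") as out:
        pickle.dump(sample, out)

    loaded = load_pickle(data_dir / "notMNIST.pickle")
    print(sorted(loaded))  # list the top-level keys of the loaded object
```

In Colab you would call `load_pickle` with the real path under the mount point instead of the temporary one.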
Bert Carremans

The data in Google Drive resides in the cloud, whereas Colaboratory gives you a personal Linux virtual machine on which your notebooks run. So you need to download the file from Google Drive to your Colaboratory virtual machine and use it there. You can follow this download tutorial.
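One way to get the file onto the VM's own disk, assuming Drive is already mounted, is a plain file copy. The sketch below is only an illustration: `copy_to_local` is a hypothetical helper, and temporary directories stand in for the Drive folder and the VM's working directory so it can run anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_local(src, dest_dir):
    """Copy src into dest_dir (created if needed) and return the new path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(src, dest_dir))


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins: in Colab these might be /content/drive/MyDrive/Data and /content/data.
    drive_dir = Path(tmp) / "drive" / "Data"
    local_dir = Path(tmp) / "local"
    drive_dir.mkdir(parents=True)
    (drive_dir / "notMNIST.pickle").write_bytes(b"placeholder bytes")

    local_copy = copy_to_local(drive_dir / "notMNIST.pickle", local_dir)
    copied_name = local_copy.name
    copied_bytes = local_copy.read_bytes()
    print(copied_name, len(copied_bytes))
```

Reading from the local copy avoids going back to the mounted Drive each time the file is opened.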

Poorna Prudhvi

Thanks, guys, for your answers. Google Colab has quickly grown into a more mature development environment, and my favorite feature is the 'Files' tab.

We can easily upload the file to the folder we want and access it as if it were on a local machine.

This solves the issue.

Thanks.

Adhish Thite

You can use PyDrive for that. First, you need to find the ID of your file.

# Install the PyDrive wrapper & import libraries.
# This only needs to be done once per notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# List the matching files to find the file ID.
#
# A file ID looks like: laggVyWshwcyP6kEI-y_W3P8D26sz
listed = drive.ListFile({'q': "title = 'notMNIST.pickle' and trashed = false"}).GetList()
for file in listed:
    print('title {}, id {}'.format(file['title'], file['id']))

You can then load the file using the following code:

from googleapiclient.discovery import build
drive_service = build('drive', 'v3')

import io
import pickle
from googleapiclient.http import MediaIoBaseDownload

file_id = 'laggVyWshwcyP6kEI-y_W3P8D26sz'

request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while not done:
    # _ is a placeholder for a progress object that we ignore.
    # (Our file is small, so we skip reporting progress.)
    _, done = downloader.next_chunk()

downloaded.seek(0)
f = pickle.load(downloaded)
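The `seek(0)` call matters: after the download, the buffer's cursor sits at the end, so unpickling without rewinding finds nothing to read. Here is a small self-contained sketch of that behaviour. It uses placeholder data and writes to the buffer directly instead of downloading, so it runs without any Drive access.

```python
import io
import pickle

payload = {"labels": [0, 1, 2]}  # placeholder data, not the real dataset

buffer = io.BytesIO()
buffer.write(pickle.dumps(payload))  # stands in for the chunks written by the downloader

# The cursor is now at the end of the buffer, so there is nothing left to read.
try:
    pickle.load(buffer)
    failed_without_seek = False
except EOFError:
    failed_without_seek = True

buffer.seek(0)  # rewind to the first byte
restored = pickle.load(buffer)
print(failed_without_seek, restored == payload)
```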
aliakbars