
I am using Google Colab and I would like to use my custom libraries / scripts, that I have stored on my local machine. My current approach is the following:

# (Question 1)
from google.colab import drive
drive.mount("/content/gdrive")

# Annoying chain of granting access to Google Colab
# and entering the OAuth token.

And then I use:

# (Question 2)
!cp /content/gdrive/My\ Drive/awesome-project/*.py .

Question 1: Is there a way to avoid mounting the drive entirely? Whenever the execution context changes (e.g. when I select "Hardware Acceleration = GPU", or when I wait an hour), I have to re-generate and re-enter the OAuth token.

Question 2: Is there a way to sync files between my local machine and my Google Colab scripts more elegantly?

Partial (not very satisfying) answer regarding Question 1: I saw that one can install and use Dropbox. Then you can hardcode the API key into the application and the mounting is done, regardless of whether or not it is a new execution context. I wonder whether a similar approach exists for Google Drive as well.
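A rough sketch of that Dropbox approach, assuming the dropbox Python package; the access token and file paths below are placeholders, not values from my setup:

!pip install dropbox

import dropbox

# Hardcoded access token survives new execution contexts (placeholder value).
ACCESS_TOKEN = "<your-dropbox-access-token>"
dbx = dropbox.Dropbox(ACCESS_TOKEN)

# Pull a single script from Dropbox into the Colab working directory
# (both paths are placeholders).
dbx.files_download_to_file("utils.py", "/awesome-project/utils.py")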

r0f1

2 Answers


If your code isn't secret, you can use git to sync your local code to GitHub. Then, git clone it into Colab with no need for any authentication.
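A minimal sketch of that workflow in a Colab cell (the repository URL and module name are placeholders for your own project):

# Clone the public repo into the Colab runtime -- no OAuth token needed.
!git clone https://github.com/your-username/awesome-project.git

# Make the cloned scripts importable.
import sys
sys.path.append('/content/awesome-project')

import my_module  # any module from the cloned repo (placeholder name)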

korakot

Question 1. Great question, and yes, there is a way. I have been using this workaround, which is particularly useful if you are a researcher and want others to be able to re-run your code, or just to 'colab'orate when working with larger datasets. The method below has worked well for us as a team, since there are challenges when each person keeps their own version of the datasets.

I have used this regularly on 30+ GB of image files downloaded and unzipped into the Colab runtime.

The file id is in the link provided when you share a file from Google Drive.

You can also select multiple files, share them all at once, and then generate, for example, a .txt or .json file which you can parse to extract the file ids.

from google_drive_downloader import GoogleDriveDownloader as gdd

# Some file id / list of file ids parsed from the file urls.
google_fid_id = '1-4PbytN2awBviPS4Brrb4puhzFb555g2'
destination = 'dir/dir/fid'
# If it is a zip file, add the kwarg unzip=True.

gdd.download_file_from_google_drive(file_id=google_fid_id,
                                    dest_path=destination,
                                    unzip=True)

A URL-parsing function to get the file ids from a list of urls might look like this:

def parse_urls():
    # Assumes files_urls.txt holds one comma-separated line of share links;
    # the file id is the second-to-last segment of each url.
    with open('/dir/dir/files_urls.txt', 'r') as fb:
        txt = fb.readlines()
    return [url.split('/')[-2] for url in txt[0].split(',')]
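Putting the two together, a hedged sketch (the destination path here is a placeholder) would loop over the parsed ids and download each one:

for fid in parse_urls():
    gdd.download_file_from_google_drive(file_id=fid,
                                        dest_path=f'./data/{fid}.zip',
                                        unzip=True)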

One health warning: you can only repeat this a small number of times in a 24-hour window for the same files.

Here's the gdd git repo:

https://github.com/ndrplz/google-drive-downloader

Here is a working example (my own) of how it works inside a bigger script:

https://github.com/fdsig/image_utils

Question 2.

You can connect to a local runtime, but this also means using your local resources (GPU/CPU, etc.).
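If you go that route, the setup roughly follows the Colab local-runtime instructions; the commands below are run in a local terminal, and the port is just the default suggested there:

# Run these on your local machine, not in the notebook.
pip install jupyter_http_over_ws
jupyter serverextension enable --py jupyter_http_over_ws

# Start a local Jupyter server that Colab may connect to, then choose
# "Connect to local runtime" in Colab and paste the URL this prints.
jupyter notebook \
  --NotebookApp.allow_origin='https://colab.research.google.com' \
  --port=8888 \
  --NotebookApp.port_retries=0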

Really hope this helps :-).

F~

fdsig