The "sharing URL" you get from Google Colab points to a page with lots of extra HTML, so when you try to download it using wget or curl, you don't get a valid Jupyter .ipynb file.

How does one go about obtaining the file itself using command-line tools such as wget or curl? (Note: I'm talking about "public" sharing URLs, where "anyone with the URL can view". And I'd rather not have to use specialized google-drive command-line API tools that may require authentication, etc.)

GitHub has a "raw" button you can press that'll give you a valid URL for downloading Jupyter notebook files, but I don't see any such thing in Colab. Maybe there's some kind of "?form=raw" thing one can add to the URL?

Alternatively, is there a recommended script for stripping out all the extra HTML and just leaving the JSON for the .ipynb file?

Clarification: I'm not talking about manually moving the mouse inside Colab and clicking "File > Download > Download ipynb"; that's easy! I'm talking about programmatically getting the file using the "sharing URL".

sh37211
Workaround: This does not directly answer what I asked, but may be good enough: Colab has a "Save to GitHub" feature. Saving the notebook to GitHub and then using the "raw" URL on GitHub may suffice. – sh37211 Mar 23 '21 at 20:13

1 Answer

Solved, via this post: You just need to extract the "file id" from the url!

If the sharing url is https://colab.research.google.com/drive/1SxJJc6LsKrjWAM-HhwPrLJBpUzImO5oX?usp=sharing

then the fileid is everything between the "drive/" and the question mark.

A little bash example...

$ fileid=1SxJJc6LsKrjWAM-HhwPrLJBpUzImO5oX
$ wget -O downloaded_file.ipynb "https://docs.google.com/uc?export=download&id=$fileid"

...and the result is a valid Jupyter file at downloaded_file.ipynb. :-)
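The "everything between drive/ and the question mark" rule can also be written with plain bash parameter expansion, no sed needed. This is only a sketch: colab_file_id and colab_download_url are names I made up for illustration, and it assumes the sharing URL has the /drive/<id> shape shown above (optionally followed by a ?query or a #fragment).

```shell
# Hypothetical helper: print the file id embedded in a Colab sharing URL.
colab_file_id() {
    local id="${1##*/drive/}"   # drop everything up to and including "/drive/"
    id="${id%%[?#]*}"           # drop a trailing "?query" or "#fragment", if any
    printf '%s\n' "$id"
}

# Hypothetical helper: build the direct-download URL used in the example above.
colab_download_url() {
    printf 'https://docs.google.com/uc?export=download&id=%s\n' "$(colab_file_id "$1")"
}
```

If you prefer curl, something like curl -L -o downloaded_file.ipynb "$(colab_download_url "$url")" should behave the same way; the -L is there on the assumption that the download URL may answer with a redirect, which curl does not follow by default.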

Putting all this in a handy bash function can then look like this:

grabcolab() {
    # File id: everything after "drive/" up to an optional "?" or "#".
    local fileid=$(echo "$1" | sed -E 's/.*drive\/([^?#]+).*/\1/')
    wget -O colab.ipynb "https://docs.google.com/uc?export=download&id=$fileid"
}

Then we just run grabcolab <sharing url> as in:

grabcolab 'https://colab.research.google.com/drive/1SxJJc6LsKrjWAM-HhwPrLJBpUzImO5oX?usp=sharing'
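Since the original problem was getting HTML back instead of notebook JSON, it can be worth checking what actually landed on disk. Below is a rough sketch of a variant that takes an optional output filename and runs a cheap sanity check afterwards. The names looks_like_notebook and grabcolab_checked are made up for this example, and the check is only a heuristic (the file starts with "{" and contains an "nbformat" key), not real JSON validation.

```shell
# Hypothetical helper: heuristic check that a file is notebook JSON, not HTML.
looks_like_notebook() {
    # A notebook is a JSON object, so its first non-blank character should be "{" ...
    case "$(head -c 64 "$1" | tr -d '[:space:]')" in
        "{"*) ;;
        *) return 1 ;;
    esac
    # ... and it should mention the "nbformat" key somewhere.
    grep -q '"nbformat"' "$1"
}

# Hypothetical variant of grabcolab: optional output name plus the check above.
grabcolab_checked() {
    local url="$1"
    local out="${2:-colab.ipynb}"
    local fileid
    fileid=$(printf '%s\n' "$url" | sed -E 's/.*drive\/([^?#]+).*/\1/')
    wget -q -O "$out" "https://docs.google.com/uc?export=download&id=$fileid" || return 1
    if ! looks_like_notebook "$out"; then
        echo "warning: $out does not look like a Jupyter notebook" >&2
        return 1
    fi
}
```

Usage would be grabcolab_checked '<sharing url>' my_notebook.ipynb. A non-zero exit status then means either the download failed or what came back does not look like a notebook, which may happen if the notebook is not actually shared publicly.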

PS: Off-topic, but if you want to run the notebook from the command line: jupytext (installable via pip install jupytext) has been working a bit better for me than jupyter nbconvert --to script. For reference, the nbconvert-based function I've been using to run the notebook is

nbrun() { jupyter nbconvert --to script "$1" && mv "${1%.*}.py" run_this.ipy && ipython run_this.ipy; }

As in

$ grabcolab 'https://colab.research.google.com/drive/1SxJJc6LsKrjWAM-HhwPrLJBpUzImO5oX?usp=sharing'
$ nbrun colab.ipynb
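For completeness, here is roughly what a jupytext-based equivalent of nbrun could look like. Treat it as a sketch under assumptions: nbrun_jupytext and run_this.py are names invented for this example, and it relies on the jupytext command-line tool's --output option to pick the target script file.

```shell
# Hypothetical jupytext-based counterpart to nbrun.
nbrun_jupytext() {
    # Convert the notebook to a script; only run it if the conversion succeeded.
    jupytext --output run_this.py "$1" && ipython run_this.py
}
```

It is called the same way, e.g. nbrun_jupytext colab.ipynb.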
sh37211