7

I am using a simple (not necessarily efficient) method for saving a PyTorch model.

import torch
from google.colab import files

torch.save(model, filename) # save a trained model on the VM
files.download(filename) # download the model to local

best_model = files.upload() # select the model just downloaded
best_model[filename] # access the model

Colab disconnects while executing the last line. Clicking the RECONNECT button always cycles through ALLOCATING -> CONNECTING (which fails with an "unable to connect to the runtime" message in the bottom-left corner) -> RECONNECT. Meanwhile, executing any cell gives the error message "Failed to execute cell, Could not send execute message to runtime: [object CloseEvent]".

I know it is related to the last line, because I can successfully connect with my other Google accounts, which have not executed it.

Why does this happen? It seems that any Google account that has executed the last line can no longer connect to the runtime.

Edit:

One night later, after the session expired, I could reconnect with the Google account. I then tried the approach from the comment and found that merely calling files.upload() on the PyTorch model triggers the problem: once the upload completes, Colab disconnects.

Ioannis Nasios
Francis
  • You can try using `files.upload` to write the file to disk and then load it using the appropriate method (pickle, torch.load, ...) ? Just: `files.upload(); pickle.load(....)` – phi Jun 04 '18 at 13:32
  • Thanks, I think it is a good alternative if the problem is indeed the last line. However, I just found `files.upload()` the Pytorch model can disconnect the notebook. Please see my edit. – Francis Jun 05 '18 at 01:26

3 Answers

10

Try disabling your ad blocker. That worked for me.

Alex
  • and make sure chrome isn't asking you to re-authenticate. or just make sure you're authenticated in whatever browser you're using – gary69 Oct 07 '19 at 16:31
1

(I wrote this answer before reading your update. Think it may help.)

files.upload() is just for uploading files. There is no reason to expect it to return a PyTorch type or model.

When you call a = files.upload(), a is a dictionary that maps each filename to the file's contents as one big bytes object.

{'my_image.png': b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR....' }
type(a['my_image.png'])  # bytes

This is just like what you get from open('my_image.png', 'rb').read().

So I think the next line, best_model[filename], tries to print the whole huge bytes array, and that is what breaks Colab.
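As a rough sketch of that idea (not a tested Colab session): the dictionary below stands in for what files.upload() returns, with a hypothetical filename model.pkl and a small pickled dict in place of a real model. It looks at the type and size of the upload instead of echoing the bytes, and deserializes through an in-memory buffer.

```python
import io
import pickle

# Stand-in for `uploaded = files.upload()` in Colab: a dict that maps
# each filename to the file's raw bytes. The filename and the payload
# are hypothetical, used only for illustration.
fake_model = {"weights": [0.1, 0.2, 0.3]}
uploaded = {"model.pkl": pickle.dumps(fake_model)}

filename = "model.pkl"
# Assign the value instead of leaving it as the last expression in a
# cell, so the notebook does not try to display the whole bytes object.
data = uploaded[filename]

# Inspect the type and size rather than the content.
print(type(data).__name__, len(data), "bytes")

# Deserialize from the in-memory bytes via a file-like object.
restored = pickle.load(io.BytesIO(data))
print(restored == fake_model)
```

For a PyTorch model, torch.load also accepts a file-like object, so torch.load(io.BytesIO(data)) should work along the same lines, assuming the model's class definition is available in the notebook.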

phi
  • Thanks, I guess you are right with printing huge bytes array being the cause to the problem. I have to do `best_model = files.upload()` instead of just `files.upload()` to avoid printing, and replace `best_model[filename]` with `best_model = torch.load(filename)`. Now I can import the saved model :) – Francis Jun 06 '18 at 01:33
0

I also encountered the "Unable to connect to the runtime" issue in Google Colab on my Ubuntu machine. This was preventing me from connecting to the Colab runtime and accessing my notebook.

After investigating the problem, I found that the root cause was a lack of free space on my Ubuntu system. When I checked the available disk space, it showed that I had 0 bytes of free space left.
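If you want to check this from Python rather than a file manager, something like the following should do. It is only a sketch: the path "/" is an assumption, so substitute whichever mount point holds your home directory or browser profile.

```python
import shutil

# Query disk usage for the root filesystem. "/" is an assumed path;
# replace it with the mount point you actually care about.
usage = shutil.disk_usage("/")

# Convert bytes to GiB for readability.
free_gib = usage.free / 2**30
total_gib = usage.total / 2**30
print(f"free: {free_gib:.2f} GiB of {total_gib:.2f} GiB")
```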

imok1948