
I wrote an NN model that analyzes an image and extracts 8 floating-point numbers at the end. The model works fine (but slowly) on my computer, so I tried it on a Cloud TPU, and there, BAM! I get this error:

I1008 12:58:47.077905 140221679261440 tf_logging.py:115] Error recorded from training_loop: File system scheme '[local]' not implemented (file: '/home/gcloud_iba/Data/CGTR/model/GA_subset/model.ckpt-0_temp_e840841d93124a67b54074b1c0fd7ae4') [[{{node save/SaveV2}} = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], _device="/job:worker/replica:0/task:0/device:CPU:0"](save/ShardedFilename, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, batch_normalization/beta/Read/ReadVariableOp, batch_normalization/beta/Momentum/Read_1/ReadVariableOp, batch_normalization/gamma/Read/ReadVariableOp, batch_normalization/gamma/Momentum/Read_1/ReadVariableOp, batch_normalization/moving_mean/Read/ReadVariableOp, batch_normalization/moving_variance/Read/ReadVariableOp, batch_normalization_1/beta/Read/ReadVariableOp, batch_normalization_1/beta/Momentum/Read_1/ReadVariableOp, batch_normalization_1/gamma/Read/ReadVariableOp, batch_normalization_1/gamma/Momentum/Read_1/ReadVariableOp, batch_normalization_1/moving_mean/Read/ReadVariableOp, batch_normalization_1/moving_variance/Read/ReadVariableOp, conv2d/kernel/Read/ReadVariableOp, conv2d/kernel/Momentum/Read_1/ReadVariableOp, conv2d_1/kernel/Read/ReadVariableOp, conv2d_1/kernel/Momentum/Read_1/ReadVariableOp, conv2d_2/kernel/Read/ReadVariableOp, conv2d_2/kernel/Momentum/Read_1/ReadVariableOp, conv2d_3/kernel/Read/ReadVariableOp, conv2d_3/kernel/Momentum/Read_1/ReadVariableOp, conv2d_4/kernel/Read/ReadVariableOp, conv2d_4/kernel/Momentum/Read_1/ReadVariableOp, conv2d_5/kernel/Read/ReadVariableOp, conv2d_5/kernel/Momentum/Read_1/ReadVariableOp, conv2d_6/kernel/Read/ReadVariableOp, conv2d_6/kernel/Momentum/Read_1/ReadVariableOp, conv2d_7/kernel/Read/ReadVariableOp, conv2d_7/kernel/Momentum/Read_1/ReadVariableOp, conv2d_8/kernel/Read/ReadVariableOp, conv2d_8/kernel/Momentum/Read_1/ReadVariableOp, conv2d_9/kernel/Read/ReadVariableOp, conv2d_9/kernel/Momentum/Read_1/ReadVariableOp, dense/bias/Read/ReadVariableOp, dense/bias/Momentum/Read_1/ReadVariableOp, dense/kernel/Read/ReadVariableOp, dense/kernel/Momentum/Read_1/ReadVariableOp, dense_1/bias/Read/ReadVariableOp, dense_1/bias/Momentum/Read_1/ReadVariableOp, dense_1/kernel/Read/ReadVariableOp, dense_1/kernel/Momentum/Read_1/ReadVariableOp, dense_2/bias/Read/ReadVariableOp, dense_2/bias/Momentum/Read_1/ReadVariableOp, dense_2/kernel/Read/ReadVariableOp, dense_2/kernel/Momentum/Read_1/ReadVariableOp, dense_3/bias/Read/ReadVariableOp, dense_3/bias/Momentum/Read_1/ReadVariableOp, dense_3/kernel/Read/ReadVariableOp, dense_3/kernel/Momentum/Read_1/ReadVariableOp, global_step/Read/ReadVariableOp)]]

I checked that the TPU has access to the hard drive, and that works (I have another piece of code that reads the same dataset with another model). I do not cache my data (yet), but I do some prefetching. Aside from that, I don't see what isn't working.
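For context, my input pipeline looks roughly like this (a simplified sketch; the record path, feature names, image size, and decoding are placeholders rather than my exact code):

    import tensorflow as tf

    def _parse_example(serialized):
        # Placeholder parser: decode one image and its 8 target floats.
        features = tf.parse_single_example(
            serialized,
            {"image": tf.FixedLenFeature([], tf.string),
             "targets": tf.FixedLenFeature([8], tf.float32)})
        image = tf.image.decode_jpeg(features["image"], channels=3)
        image = tf.image.convert_image_dtype(image, tf.float32)
        image = tf.image.resize_images(image, [64, 64])  # fixed size is a placeholder
        return image, features["targets"]

    def input_fn(params):
        files = tf.data.Dataset.list_files("/home/gcloud_iba/Data/CGTR/tfrecords/*.tfrecord")
        dataset = tf.data.TFRecordDataset(files)
        dataset = dataset.map(_parse_example, num_parallel_calls=4)
        dataset = dataset.batch(params["batch_size"], drop_remainder=True)
        # Prefetching only, no caching yet.
        return dataset.prefetch(2)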

Thank you for any help you could provide!

Pi-r


2 Answers


The local filesystem is not available on Cloud TPUs. Model directories (checkpoints, etc.) and input data should be stored in Google Cloud Storage (and prefixed with "gs://").

More details here: https://cloud.google.com/tpu/docs/storage-buckets
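As a rough illustration (not your actual code; the bucket name, TPU name, and toy model below are placeholders), a TPUEstimator setup would point both the model directory and the input data at a GCS bucket:

    import tensorflow as tf

    MODEL_DIR = "gs://my-bucket/CGTR/model/GA_subset"       # checkpoints written to GCS
    DATA_GLOB = "gs://my-bucket/CGTR/tfrecords/*.tfrecord"  # training data read from GCS

    def input_fn(params):
        # Stand-in pipeline: real code would parse the TFRecords here.
        files = tf.data.Dataset.list_files(DATA_GLOB)
        dataset = tf.data.TFRecordDataset(files)
        dataset = dataset.map(lambda s: (tf.zeros([64, 64, 3]), tf.zeros([8])))
        dataset = dataset.batch(params["batch_size"], drop_remainder=True)
        return dataset.prefetch(2)

    def model_fn(features, labels, mode, params):
        # Toy regressor standing in for the real network.
        predictions = tf.layers.dense(tf.layers.flatten(features), 8)
        loss = tf.losses.mean_squared_error(labels, predictions)
        optimizer = tf.contrib.tpu.CrossShardOptimizer(
            tf.train.MomentumOptimizer(0.01, 0.9))
        train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
        return tf.contrib.tpu.TPUEstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu="my-tpu-name")
    run_config = tf.contrib.tpu.RunConfig(
        cluster=resolver,
        model_dir=MODEL_DIR,  # a gs:// path, not /home/... on the VM's local disk
        tpu_config=tf.contrib.tpu.TPUConfig(iterations_per_loop=100))

    estimator = tf.contrib.tpu.TPUEstimator(
        model_fn=model_fn, config=run_config, train_batch_size=128, use_tpu=True)
    estimator.train(input_fn=input_fn, max_steps=1000)

The SaveV2 node in your traceback is the checkpoint saver running on the TPU worker, which is why a /home/... model directory fails while a gs:// one works.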

michaelb

In the absence of Google Cloud Storage, write your model using the Keras API (https://keras.io/).
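A minimal sketch, assuming the TF 1.x contrib Keras-to-TPU flow (the TPU name, architecture, and paths are placeholders): the model is built with Keras, converted to run on the TPU, and the weights are saved from the host Python process, so a local path can work for saving.

    import tensorflow as tf

    def build_model():
        # Placeholder architecture ending in the 8 regression outputs.
        inputs = tf.keras.Input(shape=(64, 64, 3))
        x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
        x = tf.keras.layers.GlobalAveragePooling2D()(x)
        outputs = tf.keras.layers.Dense(8)(x)
        model = tf.keras.Model(inputs, outputs)
        # keras_to_tpu_model expects a tf.train optimizer here.
        model.compile(optimizer=tf.train.MomentumOptimizer(0.01, 0.9), loss="mse")
        return model

    resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu="my-tpu-name")
    tpu_model = tf.contrib.tpu.keras_to_tpu_model(
        build_model(),
        strategy=tf.contrib.tpu.TPUDistributionStrategy(resolver))

    # tpu_model.fit(...) on the dataset, then save from the host process:
    tpu_model.save_weights("/home/gcloud_iba/Data/CGTR/model/keras_weights.h5")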

aman2930