
Note: These same steps work without any errors on Colab GPU.

Please help me with this. I created a dataset and saved it to a file:

data = tf.data.Dataset.from_tensor_slices((features, labels))
tf.data.experimental.save(data, myfile)

When I try to load it

data = tf.data.experimental.load(myfile)

and run any function on it, such as len(data), data.batch(16), or data.take(1), I get this error:

NotFoundError: Could not find metadata file. [Op:DatasetCardinality]

TPU config

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
tf.config.experimental_connect_to_cluster(resolver)
# This is the TPU initialization code that has to be at the beginning.
tf.tpu.experimental.initialize_tpu_system(resolver)

Is it similar to this: [TF1.14][TPU] Can not use custom TFrecord dataset on Colab using TPU?

Nikhil
  • I'm also facing the same problem, and I believe I set up the access from Colab to GCS correctly, so I'm starting to think it may be related to the way I saved my dataset. I did it exactly like you: `tf.data.experimental.save(data, myfile)` – Alexandre Henrique Jun 20 '23 at 00:45
  • I found this [issue](https://github.com/tensorflow/tensorflow/issues/53292) on the TensorFlow repo from months ago. In the issue thread, there is some discussion about lazy loading. I tried the proposed implementation, but it still raised the same error message. – Alexandre Henrique Jun 20 '23 at 01:35

1 Answer


After some more debugging, I got this error:

UnimplementedError: File system scheme '[local]' not implemented (file: './data/temp/2692738424590406024')
    Encountered when executing an operation using EagerExecutor. This error cancels all future operations and poisons their output tensors. [Op:DatasetCardinality]

I found this explanation:

Cloud TPUs can only access data in GCS as only the GCS file system is registered.

More info here: File system scheme '[local]' not implemented in Google Colab TPU
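
If the local file system really is the culprit, the workaround is to keep the saved dataset in a GCS bucket rather than on the local Colab disk. A minimal sketch, assuming a bucket you can write to (the gs:// path below is hypothetical):

import tensorflow as tf
from google.colab import auth

auth.authenticate_user()  # give the Colab runtime access to GCS

gcs_path = 'gs://your-bucket/datasets/my_dataset'  # hypothetical bucket/path

# Save once (this can be done in a non-TPU session)
tf.data.experimental.save(data, gcs_path)

# Load in the TPU session; TPU workers can read gs:// paths
data = tf.data.experimental.load(gcs_path)
print(len(data))

That said, the comment below reports that loading from GCS in a TPU session can still raise the same error, so this may not be the whole story.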

Nikhil
  • I tried to save my dataset in a session *without* TPU onto GCS using experimental.save(...), and then read it back in a TPU session: train_set = tf.data.experimental.load('gs://ai-tests/NLP/data/train_set'), but I still got the same error. I suspect the problem goes deeper than just the file system scheme. – kawingkelvin Jun 16 '22 at 17:48