
I'm running the code below, and it works perfectly if TensorFlow is installed without GPU support. But if it's installed with GPU support, I get a FileNotFoundError when I try to load the object.

I also tried with joblib and pickle directly, and I always get the same error.

Any help will be greatly appreciated.

import tensorflow as tf
import dill

def Generator():

    z_dim = 60

    FEATURES_LIST = ["aaa", "bbb", "ccc" ]
    ME_FEATURES_LIST = ["ddd", "eee", "fff" ]

    NUM_FEATURES = len(FEATURES_LIST)
    NUM_ME_FEATURES = len(ME_FEATURES_LIST)

    z = tf.keras.layers.Input(shape=(z_dim,), dtype='float32')
    y = tf.keras.layers.Input(shape=(NUM_ME_FEATURES,), dtype='float32')
    tr = tf.keras.layers.Input(shape=(1,), dtype='bool')
  
    x = tf.keras.layers.concatenate([z, y])
    x = tf.keras.layers.Dense(z_dim * NUM_ME_FEATURES, activation="relu")(x)
    out = tf.keras.layers.Dense(NUM_FEATURES, activation='sigmoid')(x)
    model = tf.keras.Model(inputs=[z, y, tr], outputs=(out, y))

    return model

G = Generator()

with open("dill_functional", 'wb') as file:
  dill.dump(G, file)

with open("dill_functional", 'rb') as file:
  G = dill.load(file)  # <--- error here

print(str(G))

C:\Users\igor-\.cloned\gan> python .\dill_test.py
2023-02-09 22:42:28.379108: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

2023-02-09 22:42:29.759547: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9426 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:01:00.0, compute capability: 8.6

WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. model.compile_metrics will be empty until you train or evaluate the model.

Traceback (most recent call last):
  File "C:\Users\igor-\.cloned\gan\dill_test.py", line 32, in <module>
    G = dill.load(file)
  File "C:\Users\igor-\anaconda3\envs\ai\lib\site-packages\dill\_dill.py", line 272, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "C:\Users\igor-\anaconda3\envs\ai\lib\site-packages\dill\_dill.py", line 419, in load
    obj = StockUnpickler.load(self)
  File "C:\Users\igor-\anaconda3\envs\ai\lib\site-packages\keras\saving\pickle_utils.py", line 47, in deserialize_model_from_bytecode
    model = save_module.load_model(temp_dir)
  File "C:\Users\igor-\anaconda3\envs\ai\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\igor-\anaconda3\envs\ai\lib\site-packages\tensorflow\python\saved_model\load.py", line 933, in load_partial
    raise FileNotFoundError(
FileNotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for  ram://fc47ea82-4f6b-4736-9394-980cc1f14358/variables/variables

You may be trying to load on a different device from the computational device. Consider setting the experimental_io_device option in tf.saved_model.LoadOptions to the io_device such as '/job:localhost'.

  • The traceback is messed up, could you post it with code formatting? – tripleee Feb 10 '23 at 13:06
  • What's `ram://fc47ea82-4f6b-4736-9394-980cc1f14358/variables/variables`? – tripleee Feb 10 '23 at 13:07
  • @tripleee I'm not sure what `ram...` is; I suspect it is a path to a memory address in my GPU's RAM where dill saved the file? I don't know why it's using RAM if I save the file to disk – ps0604 Feb 10 '23 at 16:57
  • Other people on stackoverflow had the same issue, looks like they resolved it using the Tensorflow keras.model to save into a hd5 format instead of using pickle: https://stackoverflow.com/questions/71676507/error-unsuccessful-tensorslicereader-constructor-failed-to-find-any-matching – Nolan Walker Feb 15 '23 at 02:28

4 Answers


I cannot reproduce the error; I have a Colab notebook running here:

https://colab.research.google.com/drive/1lZRJTVjQCzTThU_9iYNSZVwgWzVWKzuM?usp=sharing

The error seems to be in how and where your operating system is saving the model file, so that dill is unable to locate it afterwards. I suggest either not using dill or fixing the path for the module. Alternatively, D.Holmes's answer works and is also present in the Colab notebook.

If you are curious, here is the output you should get (ideally) if your code is executed in a well-configured environment (path issues fixed!):

The model is located here: /content/saved_model

Keras weights file (<HDF5 file "variables.h5" (mode r+)>) saving:
...layers
......concatenate
.........vars
......dense
.........vars
............0
............1
......dense_1
.........vars
............0
............1
......input_layer
.........vars
......input_layer_1
.........vars
......input_layer_2
.........vars
...vars
Keras model archive saving:
File Name                                             Modified             Size
variables.h5                                   2023-02-18 02:50:04        64712
config.json                                    2023-02-18 02:50:04         2083
metadata.json                                  2023-02-18 02:50:04           64
Keras model archive loading:
File Name                                             Modified             Size
variables.h5                                   2023-02-18 02:50:04        64712
config.json                                    2023-02-18 02:50:04         2083
metadata.json                                  2023-02-18 02:50:04           64
Keras weights file (<HDF5 file "variables.h5" (mode r)>) loading:
...layers
......concatenate
.........vars
......dense
.........vars
............0
............1
......dense_1
.........vars
............0
............1
......input_layer
.........vars
......input_layer_1
.........vars
......input_layer_2
.........vars
...vars
<keras.engine.functional.Functional object at 0x7f752cd6b820>

Good luck!


You can store the model's weights instead:

import pickle

model_weight = model.get_weights()

# Store with pickle
pickle.dump(model_weight, open('model_weight.pkl', 'wb'))

# Load with pickle
weight = pickle.load(open('model_weight.pkl', 'rb'))
model.set_weights(weight)
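The reason this works is that get_weights() returns plain NumPy arrays, which pickle handles natively, whereas pickling the model object itself goes through Keras' SavedModel machinery (visible in the traceback via pickle_utils.py). A minimal sketch of the round-trip, using nested lists as a stand-in for the weight arrays so it runs without TensorFlow (model_weight.pkl is a throwaway file name):

```python
import pickle

# Stand-in for model.get_weights(): pickle only needs plain Python/NumPy
# containers, so the round-trip works the same way for real weight arrays.
model_weight = [[0.1, 0.2, 0.3], [0.4]]

with open("model_weight.pkl", "wb") as f:
    pickle.dump(model_weight, f)

with open("model_weight.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored == model_weight)  # True: the weights survive unchanged
```

Note that at load time you must rebuild the architecture first (e.g. by calling Generator() again) before model.set_weights(weight) can restore the values.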

To fix this issue, you can try setting the experimental_io_device option of tf.saved_model.LoadOptions to specify the device where the checkpoint files are located. For example, if the files are located on the CPU host, you can try the below:

with tf.device('/cpu:0'):
    G = tf.saved_model.load(
        "path/to/saved/model",
        options=tf.saved_model.LoadOptions(experimental_io_device='/job:localhost'))

I don't know why, but entering the exact location of this file fixes it. If you are using Windows, don't use spaces or semicolons in the path.

Works:

with open("C:/model/dill_functional", 'rb') as file:
  G = dill.load(file) 

Does not work:

with open("C:/users/some; path/model/dill_functional", 'rb') as file:
  G = dill.load(file)
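If you want to catch this before it bites, a quick pre-check can be sketched with the standard library alone (check_model_path is a hypothetical helper; it flags exactly the characters found problematic above, spaces and semicolons):

```python
def check_model_path(path):
    """Return True if the path avoids characters that broke loading above.

    Hypothetical helper: flags spaces and semicolons, which this answer
    found problematic in Windows paths.
    """
    bad = [ch for ch in (" ", ";") if ch in path]
    if bad:
        print(f"Path {path!r} contains {bad}; consider moving the file")
    return not bad

print(check_model_path("C:/model/dill_functional"))     # True
print(check_model_path("C:/users/some; path/model/f"))  # False (flagged)
```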