136

After reading the docs, I saved a model in TensorFlow. Here is my demo code:

import tensorflow as tf

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ..
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)

After that, I found that three files had been created:

model.ckpt.data-00000-of-00001
model.ckpt.index
model.ckpt.meta

I can't restore the model from model.ckpt, since there is no such file. Here is my restore code:

with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")

So why are there three files?

GoingMyWay

4 Answers

126

Try this:

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('/tmp/model.ckpt.meta')
    saver.restore(sess, "/tmp/model.ckpt")

The TensorFlow save method saves three kinds of files because it stores the graph structure separately from the variable values. The .meta file describes the saved graph structure, so you need to import it before restoring the checkpoint (otherwise it doesn't know what variables the saved checkpoint values correspond to).

Alternatively, you could do this:

# Recreate the EXACT SAME variables
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")

...

# Now load the checkpoint variable values
with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, "/tmp/model.ckpt")

Even though there is no file named model.ckpt, you still refer to the saved checkpoint by that name when restoring it. From the saver.py source code:

Users only need to interact with the user-specified prefix... instead of any physical pathname.
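
To make the "prefix, not file" idea concrete, here is a minimal sketch that uses only the Python standard library and no TensorFlow calls. The helper name checkpoint_files and the empty placeholder files are assumptions made for illustration. It only shows that the bare prefix does not exist on disk while three files share it.

```python
import glob
import os
import tempfile


def checkpoint_files(prefix):
    """Return the sorted base names of the files stored under a checkpoint prefix."""
    # glob.escape guards against special characters in the directory name.
    pattern = glob.escape(prefix) + ".*"
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


# Demo with empty placeholder files named like the ones in the question.
demo_dir = tempfile.mkdtemp()
for name in ("model.ckpt.data-00000-of-00001", "model.ckpt.index", "model.ckpt.meta"):
    open(os.path.join(demo_dir, name), "w").close()

prefix = os.path.join(demo_dir, "model.ckpt")
print(os.path.exists(prefix))    # the bare prefix is not a file on disk
print(checkpoint_files(prefix))  # the files that share the prefix
```

The string you pass to saver.save and saver.restore plays the role of prefix here.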

Uvuvwevwevwe
T.K. Bartel
  • So the .index and the .data are not used? When are those 2 files used, then? – ajfbiw.s May 03 '17 at 21:27
  • @ajfbiw.s .meta stores the graph structure, .data stores the values of each variable in the graph, and .index identifies the checkpoint. So in the example above, import_meta_graph uses the .meta file, and saver.restore uses the .data and .index files. – T.K. Bartel May 04 '17 at 20:06
  • Oh, I see. Thanks. – ajfbiw.s May 04 '17 at 22:19
  • Any idea why saver = tf.train.import_meta_graph('./name.meta') is giving me an error -- "KeyError: u'VariableV2' "? – ajfbiw.s May 04 '17 at 22:19
  • Any chance you saved the model with a different version of TensorFlow than you're using to load it? (https://github.com/tensorflow/tensorflow/issues/5639) – T.K. Bartel May 05 '17 at 03:10
  • That was exactly why! Thank you! – ajfbiw.s May 05 '17 at 04:07
  • Thanks for this answer - really helpful (I ran into the same KeyError) but I then got this error: 'ValueError: GraphDef cannot be larger than 2GB.' Do you have any idea how to fix that? – CoolPenguin Jul 05 '17 at 18:23
  • Does anyone know what the `00000` and `00001` numbers mean in the `variables.data-?????-of-?????` file name? – Ivan Talalaev Sep 07 '18 at 14:44
63
  • meta file: describes the saved graph structure, including the GraphDef, SaverDef, and so on. Calling tf.train.import_meta_graph('/tmp/model.ckpt.meta') restores the Saver and the Graph.

  • index file: a string-to-string immutable table (tensorflow::table::Table). Each key is the name of a tensor and its value is a serialized BundleEntryProto. Each BundleEntryProto describes the metadata of a tensor: which of the "data" files contains the content of the tensor, the offset into that file, a checksum, some auxiliary data, etc.

  • data file: a TensorBundle collection that stores the values of all variables.
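
The "data" files referred to above carry a suffix such as data-00000-of-00001. Below is a small standard-library sketch that splits such a name into its parts. It assumes the two five-digit fields are a shard index and a shard count, and the five-digit width is inferred from the file names in this thread, not from the TensorFlow source. The helper name parse_data_file is hypothetical.

```python
import re

# Matches names such as "model.ckpt.data-00000-of-00001".
# Assumption: both numeric fields are exactly five digits wide.
_SHARD_RE = re.compile(r"^(?P<prefix>.+)\.data-(?P<shard>\d{5})-of-(?P<total>\d{5})$")


def parse_data_file(name):
    """Split a data file name into (prefix, shard_index, shard_count), or return None."""
    match = _SHARD_RE.match(name)
    if match is None:
        return None
    return match.group("prefix"), int(match.group("shard")), int(match.group("total"))


print(parse_data_file("model.ckpt.data-00000-of-00001"))
print(parse_data_file("model.ckpt.index"))  # not a data file
```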

Guangcong Liu
  • I have a pb file for image classification. Can I use it for real-time video classification? –  Jul 29 '17 at 15:37
  • Could you please tell me how to load the model in Keras 2 if it is saved as 3 files? – rajkiran Oct 11 '17 at 23:23
6

I am restoring trained word embeddings from the Word2Vec TensorFlow tutorial.

If you have created multiple checkpoints, the files look like this:

model.ckpt-55695.data-00000-of-00001
model.ckpt-55695.index
model.ckpt-55695.meta

Try this:

def restore_session(self, session):
    saver = tf.train.import_meta_graph('./tmp/model.ckpt-55695.meta')
    saver.restore(session, './tmp/model.ckpt-55695')

Then call restore_session() like this:

def test_word2vec():
    opts = Options()
    with tf.Graph().as_default(), tf.Session() as session:
        with tf.device("/cpu:0"):
            model = Word2Vec(opts, session)
            model.restore_session(session)
            model.get_embedding("assistance")
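
When a directory holds several numbered checkpoints, hard-coding the step (55695 above) is brittle. The following is a rough standard-library sketch that picks the prefix with the highest step number. The helper name latest_prefix, the step 9000, and the placeholder files are made up for illustration. TensorFlow also provides tf.train.latest_checkpoint(checkpoint_dir), which reads the "checkpoint" state file written by the Saver and is usually the better choice when that file is present.

```python
import os
import re
import tempfile

# Matches index files such as "model.ckpt-55695.index" and captures the step.
_INDEX_RE = re.compile(r"^(?P<prefix>.+-(?P<step>\d+))\.index$")


def latest_prefix(directory):
    """Return the checkpoint prefix with the highest step number, or None."""
    best_step, best_prefix = -1, None
    for name in os.listdir(directory):
        match = _INDEX_RE.match(name)
        # Compare steps as integers, since string order would rank "9000" above "55695".
        if match and int(match.group("step")) > best_step:
            best_step = int(match.group("step"))
            best_prefix = os.path.join(directory, match.group("prefix"))
    return best_prefix


# Demo with empty placeholder files for two hypothetical training steps.
demo_dir = tempfile.mkdtemp()
for step in (9000, 55695):
    for suffix in (".data-00000-of-00001", ".index", ".meta"):
        open(os.path.join(demo_dir, "model.ckpt-%d%s" % (step, suffix)), "w").close()

latest = latest_prefix(demo_dir)
print(os.path.basename(latest))
```

The returned prefix can be passed to saver.restore, and the same prefix with ".meta" appended to tf.train.import_meta_graph.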
Jose Kj
Steven Wong
  • What does "00000-of-00001" mean in "model.ckpt-55695.data-00000-of-00001"? – hafiz031 Jul 18 '20 at 14:18
  • @hafiz031 the suffix `.data-00000-of-00001` refers to the shard used by TensorFlow in a scenario where you are training on multiple machines. When training on a single machine, you'll only see this one suffix. – j2abro Dec 09 '21 at 18:20
  • How can I read the weights using only the .index and .data files? I am not able to load the weights using the .load_weights method. – n0obcoder Feb 15 '22 at 11:02
  • I am trying to freeze a model and I have many files like these: model.ckp-6517.data-00000-of-00001, model.ckp-8145.data-00000-of-00001, and the respective .index and .meta files. How can I freeze a model when there are multiple files like these? What do they mean? – Marcel Sep 20 '22 at 09:25
0

If you trained a CNN with dropout, for example, you could do this:

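import numpy as np
import tensorflow as tf

def predict(image, model_name):
    """
    image -> single image, (width, height, channels)
    model_name -> model file that was saved without any extensions
    """
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('./' + model_name + '.meta')
        saver.restore(sess, './' + model_name)
        # Look the tensors up by name in the imported graph. The names below
        # ('logits:0', 'x:0', 'keep_prob_dnn:0') are assumptions; substitute
        # the names you gave these tensors when you built your model.
        graph = tf.get_default_graph()
        logits = graph.get_tensor_by_name('logits:0')
        x = graph.get_tensor_by_name('x:0')
        keep_prob_dnn = graph.get_tensor_by_name('keep_prob_dnn:0')
        prediction = tf.argmax(logits, 1)
        # 'x' takes a batch of RGB images, hence the extra leading dimension.
        # A keep probability of 1.0 disables dropout at inference time.
        return prediction.eval(feed_dict={x: image[np.newaxis, :, :, :], keep_prob_dnn: 1.0})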
happy_sisyphus