2

I have a 2 GB TensorFlow model that I'd like to add to a Flask project I have on App Engine, but I can't find any documentation stating whether what I'm trying to do is possible.

Since App Engine doesn't allow writing to the file system, I'm storing my model's files in a Google Cloud Storage bucket and attempting to restore the model from there. These are the files in the bucket:

  • model.ckpt.data-00000-of-00001
  • model.ckpt.index
  • model.ckpt.meta
  • checkpoint

Working locally, I can just use

with tf.Session() as sess:
    logger.info("Importing model into TF")
    saver = tf.train.import_meta_graph('model.ckpt.meta')
    # restore() takes the checkpoint path prefix as a string
    saver.restore(sess, 'model.ckpt')

The model is loaded into memory once, in a function registered with Flask's @before_first_request.
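
The load-once behaviour described above does not depend on Flask itself. As a minimal sketch (the names `LazyModel` and `load_fn` are hypothetical, not part of Flask or TensorFlow), a small thread-safe holder can defer the expensive load until the first call that needs the model and reuse the result afterwards:

```python
import threading


class LazyModel:
    """Hypothetical helper: runs an expensive loader once and caches the result."""

    def __init__(self, load_fn):
        self._load_fn = load_fn  # e.g. a function that restores the TF session
        self._lock = threading.Lock()
        self._loaded = False
        self._value = None

    def get(self):
        # Double-checked locking: the lock is only contended until the first load completes.
        if not self._loaded:
            with self._lock:
                if not self._loaded:
                    self._value = self._load_fn()
                    self._loaded = True
        return self._value
```

A request handler would then call `get()` on a module-level `LazyModel` instance; only the first call pays the loading cost.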

Once it's on App Engine, I assumed I could do this:

blob = bucket.get_blob('blob_name')
filename = os.path.join(model_dir, blob.name)
blob.download_to_filename(filename)

Then do the same restore. But App Engine won't allow it.

Is there a way to stream these files into TensorFlow's restore functions so the files don't have to be written to the file system?

Dan Cornilescu
Brendan Martin

2 Answers

3

After some tips from Dan Cornilescu and some digging, I found that TensorFlow builds the MetaGraphDef with a method called ParseFromString, so here's what I ended up doing:

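from google.cloud import storage
import tensorflow as tf
from tensorflow import MetaGraphDef

# Fetch the serialized MetaGraphDef from the bucket as raw bytes
client = storage.Client()
bucket = client.get_bucket(Config.MODEL_BUCKET)
blob = bucket.get_blob('model.ckpt.meta')
model_graph = blob.download_as_string()

# Parse the bytes into a MetaGraphDef protocol buffer
mgd = MetaGraphDef()
mgd.ParseFromString(model_graph)

with tf.Session() as sess:
    # import_meta_graph() accepts the proto in place of a filename
    saver = tf.train.import_meta_graph(mgd)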
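
Importing the MetaGraphDef this way recreates only the graph structure; the variable values live in model.ckpt.index and model.ckpt.data-00000-of-00001, which saver.restore() locates by path prefix. One possible way to cover that step, assuming the runtime exposes some writable temporary directory (an assumption about the environment, not something the question confirms), is to copy the checkpoint blobs there first. The helper name `download_checkpoint` and its parameters are hypothetical; `bucket.list_blobs(prefix=...)` and `blob.download_to_filename()` are the google-cloud-storage calls it relies on.

```python
import os
import tempfile


def download_checkpoint(bucket, prefix, dest_dir=None):
    """Copy every blob whose name starts with `prefix` into a local directory.

    `bucket` is expected to behave like a google.cloud.storage Bucket.
    Returns the local checkpoint prefix to hand to saver.restore().
    """
    # Assumption: the runtime allows writes to the system temp directory.
    dest_dir = dest_dir or tempfile.mkdtemp()
    for blob in bucket.list_blobs(prefix=prefix):
        local_path = os.path.join(dest_dir, os.path.basename(blob.name))
        blob.download_to_filename(local_path)
    return os.path.join(dest_dir, os.path.basename(prefix))
```

With the files in place, `saver.restore(sess, download_checkpoint(bucket, 'model.ckpt'))` inside the session block would load the weights. Where the temporary directory is memory-backed, a 2 GB checkpoint would count against instance memory.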
Brendan Martin
  • I don't see how `saver.restore(sess, model.ckpt)` is catered for in this answer. The metaGraph is restored but how are the weights transferred? – Bryce Ramgovind Dec 01 '20 at 12:39
1

I haven't actually used TensorFlow; this answer is based on the docs and GAE-related knowledge.

In general, working around GAE's lack of a writable filesystem by using GCS objects as files relies on one of two alternative approaches, because your app code (and any third-party utility or library it uses) can't simply be handed a filename to read or write directly when the data lives in a GCS object:

In your particular case it seems the tf.train.import_meta_graph() call supports passing a MetaGraphDef protocol buffer (i.e. raw data) instead of the filename from which it should be loaded:

Args:

  • meta_graph_or_file: MetaGraphDef protocol buffer or filename (including the path) containing a MetaGraphDef.

So restoring models from GCS should be possible, something along these lines:

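import cloudstorage
import tensorflow as tf

# Read the serialized MetaGraphDef bytes from GCS
with cloudstorage.open('gcs_path_to_meta_graph_file', 'r') as fd:
    meta_graph_bytes = fd.read()

# and later: parse the bytes into a MetaGraphDef protocol buffer, since
# import_meta_graph() expects a proto object or a filename, not raw bytes
meta_graph = tf.MetaGraphDef()
meta_graph.ParseFromString(meta_graph_bytes)

saver = tf.train.import_meta_graph(meta_graph)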

However, from a quick scan of the docs, saving/checkpointing the models back to GCS may be tricky: save() seems to want to write the data to disk itself. But I didn't dig too deep.
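
One way around save() writing to disk itself, again only if some writable scratch directory is available (an assumption), would be to let it write there and then upload whatever files it produced. In this sketch `upload_checkpoint` and its parameters are hypothetical names; `bucket.blob()` and `blob.upload_from_filename()` come from the google-cloud-storage client library rather than the `cloudstorage` library used above.

```python
import os


def upload_checkpoint(bucket, local_dir, remote_prefix=""):
    """Upload every regular file in `local_dir` to `bucket`.

    `bucket` is expected to behave like a google.cloud.storage Bucket.
    Returns the list of object names that were written.
    """
    uploaded = []
    for name in sorted(os.listdir(local_dir)):
        path = os.path.join(local_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        blob = bucket.blob(remote_prefix + name)
        blob.upload_from_filename(path)
        uploaded.append(blob.name)
    return uploaded
```

After `saver.save(sess, os.path.join(local_dir, 'model.ckpt'))`, calling `upload_checkpoint(bucket, local_dir)` would push the resulting checkpoint files back to the bucket.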

Dan Cornilescu
  • I'm running Python 3.6 and it seems cloudstorage might only be for Python 2.7. Up to this point I've been using [google-cloud-storage](https://googleapis.github.io/google-cloud-python/latest/storage/client.html) to get bucket contents. I don't see a way to open the file like you can with your example. – Brendan Martin Nov 05 '18 at 22:04
  • If you export the meta graphs **as text** you could try loading them like `meta_graph = blob.download_as_string()`. And assuming you'd find a way to get the raw `meta_graph` (as text) to be saved, you could save it with `blob.upload_from_string(meta_graph)`. – Dan Cornilescu Nov 06 '18 at 02:59