
I pushed some test data to gcloud for prediction as a binary tfrecord file. When I run my script, I get the error ('No JSON object could be decoded', 162). What do you think I am doing wrong?

To submit the prediction job to gcloud, I use this script:

REGION=us-east1
MODEL_NAME=mymodel
VERSION=v_hopt_22
INPUT_PATH=gs://mydb/test-data.tfr
OUTPUT_PATH=gs://mydb/prediction.tfr
JOB_NAME=pred_${MODEL_NAME}_${VERSION}_b

args=" --model "$MODEL_NAME
args+=" --version "$VERSION

args+=" --data-format=TF_RECORD"
args+=" --input-paths "$INPUT_PATH
args+=" --output-path "$OUTPUT_PATH

args+=" --region "$REGION

gcloud ml-engine jobs submit prediction $JOB_NAME $args
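
(As far as I understand, this command just creates a prediction Job via the Cloud ML Engine REST API. A rough, untested sketch of what I believe is the equivalent call with the Google API Python client, where 'my-project' stands in for my real project id:)

import os
from googleapiclient import discovery

# Sketch (untested): the same prediction job via the v1 REST API.
# 'my-project' below is a placeholder, not my real project id.
ml = discovery.build('ml', 'v1')
body = {
    'jobId': 'pred_mymodel_v_hopt_22_b',
    'predictionInput': {
        'dataFormat': 'TF_RECORD',
        'inputPaths': ['gs://mydb/test-data.tfr'],
        'outputPath': 'gs://mydb/prediction.tfr',
        'region': 'us-east1',
        'versionName': 'projects/my-project/models/mymodel/versions/v_hopt_22',
    },
}
request = ml.projects().jobs().create(parent='projects/my-project', body=body)
print(request.execute())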

test-data.tfr was generated from a numpy array, like so:

import numpy as np

filename = './Datasets/test-data.npz'
data = np.load(filename)
features = data['X'] # features[channel, example, feature]
np_features = np.swapaxes(features, 0, 1) # features[example, channel, feature]

import tensorflow as tf
import nnscoring.data as D

def floats_feature(arr):
    return tf.train.Feature(float_list=tf.train.FloatList(value=arr.flatten().tolist()))

writer = tf.python_io.TFRecordWriter("./Datasets/test-data.tfr")

for i, np_example in enumerate(np_features):
    if i % 1000 == 0: print(i)  # progress
    # one float-list feature per channel of this example
    tf_feature = {
        ch: floats_feature(x)
        for ch, x in zip(D.channels, np_example)
    }
    tf_features = tf.train.Features(feature=tf_feature)
    tf_example = tf.train.Example(features=tf_features)
    writer.write(tf_example.SerializeToString())

writer.close()
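
(For what it's worth, I can read the file back offline with a sketch like this, so the records themselves look fine:)

import tensorflow as tf

# Sanity check: iterate the serialized records and re-parse each one.
count = 0
for record in tf.python_io.tf_record_iterator("./Datasets/test-data.tfr"):
    example = tf.train.Example.FromString(record)
    if count == 0:
        print(sorted(example.features.feature.keys()))  # channel names
    count += 1
print("records:", count)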

Update (following yxshi):

I define the following serving function:

def tfrecord_serving_input_fn():
    import tensorflow as tf
    seq_length = int(dt * sr)
    examples = tf.placeholder(tf.string, shape=())
    feat_map = {
        channel: tf.FixedLenSequenceFeature(shape=(seq_length,),
            dtype=tf.float32, allow_missing=True)
        for channel in channels
    }
    parsed = tf.parse_single_example(examples, features=feat_map)
    features = {
        channel: tf.expand_dims(tensor, -1)
        for channel, tensor in parsed.items()
    }
    from collections import namedtuple
    InputFnOps = namedtuple("InputFnOps", "features labels receiver_tensors")
    tf.contrib.learn.utils.input_fn_utils.InputFnOps = InputFnOps
    return InputFnOps(features=features, labels=None, receiver_tensors=examples)
    # InputFnOps = tf.contrib.learn.utils.input_fn_utils.InputFnOps
    # return InputFnOps(features, None, parsed)
    # Error: InputFnOps has no attribute receiver_tensors

..., which I pass to generate_experiment_fn like so:

export_strategies = [
    saved_model_export_utils.make_export_strategy(
        tfrecord_serving_input_fn,
        exports_to_keep=1,
        default_output_alternative_key=None,
    )]

gen_exp_fn = generate_experiment_fn(
    train_steps_per_iteration=args.train_steps_per_iteration,
    train_steps=args.train_steps,
    export_strategies=export_strategies,
)

(aside: note the dirty patch of InputFnOps)
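
To double-check what the exported graph actually expects, I inspect the SavedModel's serving signature with this sketch (the export path is a placeholder for wherever make_export_strategy wrote the model):

import tensorflow as tf

export_dir = "./export/Servo/1504000000"  # placeholder: path of one export

with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    # The serving signature should expose a single string input; batch
    # prediction feeds each serialized tf.train.Example through it.
    for name, sig in meta_graph.signature_def.items():
        print(name)
        print("  inputs: ", {k: v.name for k, v in sig.inputs.items()})
        print("  outputs:", {k: v.name for k, v in sig.outputs.items()})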


1 Answer


It looks like the input is not correctly specified in the inference graph. To use TF_RECORD as the input data format, your inference graph must accept strings as the input placeholder. In your case, you should have something like the following in your inference code:

 examples = tf.placeholder(tf.string, name='input', shape=(None,))
 with tf.name_scope('inputs'):
   # parsing spec (FixedLenSequenceFeature), not the writer-side tf.train.Feature
   feature_map = {
     ch: tf.FixedLenSequenceFeature(shape=(seq_length,),
                                    dtype=tf.float32, allow_missing=True)
     for ch in D.channels
   }
   parsed = tf.parse_example(examples, features=feature_map)
   f1 = parsed['feature_name_1']
   f2 = parsed['feature_name_2']

 ...

A close example is here: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/flowers/trainer/model.py#L253
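
If you are exporting through tf.contrib.learn, a minimal sketch along these lines (untested; assuming the per-channel FixedLenSequenceFeature spec from your update) builds the batched string input for you, without patching InputFnOps:

import tensorflow as tf
from tensorflow.contrib.learn.python.learn.utils import input_fn_utils

def make_serving_input_fn(channels, seq_length):
    feature_spec = {
        ch: tf.FixedLenSequenceFeature(shape=(seq_length,),
                                       dtype=tf.float32, allow_missing=True)
        for ch in channels
    }
    # Returns an input_fn whose receiver is a batched string placeholder,
    # parsed with tf.parse_example -- the shape TF_RECORD prediction feeds.
    return input_fn_utils.build_parsing_serving_input_fn(feature_spec)

The returned function can then be passed to make_export_strategy in place of tfrecord_serving_input_fn.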

Hope it helps.

yxshi
  • yxshi, I dug into your advice today and ended up with the update shown above; still no luck, though I did manage to read the written tfrecords file offline. One thing that's holding me back is that I don't understand what happens with the input file when issuing `gcloud ml-engine jobs submit prediction $job --data-format TF_RECORD ..`. Could you elaborate a bit? – Jus Sep 01 '17 at 21:04
  • To better understand what happens when you issue the "gcloud ml-engine jobs submit prediction" command, it's worthwhile reading all related background information, as well as the info pertinent to the command, from the "Cloud Machine Learning Engine (Cloud ML Engine) Getting Started" documentation page: https://cloud.google.com/ml-engine/docs/how-tos/getting-started-training-prediction The strictly relevant sub-chapter is titled: "Submit a batch prediction job". – George Sep 15 '17 at 20:22
  • Thanks for the heads-up, George. I'm well aware of the chapter you point me to. Submission of json-formatted feature lists works fine; however, I could not get submission of a .tfrecord file to work. – Jus Sep 24 '17 at 03:10