
I'm trying to log metrics at each evaluation step with MLflow, but I have only been able to log the last step. With mlflow.tensorflow.autolog() I can log some metrics (such as loss) whenever a checkpoint is saved, which happens every 100 steps as defined in RunConfig. However, I also need to log the accuracy and top3error every 100 steps, each time the model is evaluated. Here is my code:

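import mlflow
import mlflow.tensorflow
import tensorflow as tf

def top3error(features, labels, predictions):
    # in_top_k is True when the label is among the 3 highest logits
    return {'top3error': tf.metrics.mean(tf.nn.in_top_k(predictions=predictions['logits'],
                                                        targets=labels,
                                                        k=3))}

# Log metrics
mlflow.tensorflow.autolog()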

with mlflow.start_run():
    steps = 1000

    mlflow.log_param("Steps", steps)

    # Training & validation
    train_spec = tf.estimator.TrainSpec(input_fn=generate_input_fn(train), 
                                        max_steps=steps)
    eval_spec = tf.estimator.EvalSpec(name='validation',
                                      input_fn=generate_input_fn(test, num_epochs=1))

    tf.logging.info("Starting Run...")
    results = tf.estimator.train_and_evaluate(m, train_spec, eval_spec)

    # Log run: results[0] holds the metrics of the final evaluation only
    mlflow.log_metric("accuracy", results[0]['accuracy'])
    mlflow.log_metric("top3error", results[0]['top3error'])

Here is the RunConfig used in the model:

config=tf.estimator.RunConfig(
  model_dir=model_dir, 
  save_checkpoints_steps=100,
)

Thanks in advance

Daniel Zapata

1 Answer


You can achieve this by specifying the metrics you want to log in your Estimator. Unless you run your own training loop and iterate over the steps, you won't be able to log them directly.

See https://stackoverflow.com/a/45716062
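
If you do go the training-loop route, a rough sketch could look like the block below. It is not taken from the linked answer: train_and_log and its parameters are hypothetical names, and the three callables are placeholders so the loop itself runs without TensorFlow or MLflow installed. The idea is to train in chunks of 100 steps, evaluate after each chunk, and log every returned metric with the current step.

```python
def train_and_log(train_fn, eval_fn, log_fn, total_steps, eval_every=100):
    """Alternate training and evaluation, logging metrics after each round.

    All three callables are supplied by the caller (hypothetical interface):
      train_fn(steps)          trains for `steps` additional steps
      eval_fn()                returns a dict of metric name -> value
      log_fn(name, value, step) records one metric value at a given step
    """
    history = []
    done = 0
    while done < total_steps:
        # Train for one chunk; the last chunk may be shorter.
        chunk = min(eval_every, total_steps - done)
        train_fn(chunk)
        done += chunk

        # Evaluate and log every metric against the number of steps trained.
        metrics = eval_fn()
        for name, value in metrics.items():
            if name == "global_step":
                continue  # this is a step counter, not a quality metric
            log_fn(name, float(value), done)
        history.append((done, metrics))
    return history


# Hypothetical wiring with the names from the question (m, train, test,
# generate_input_fn). Left as a comment because it needs TensorFlow 1.x and
# MLflow, and it assumes top3error was attached to the estimator beforehand:
#
# with mlflow.start_run():
#     train_and_log(
#         train_fn=lambda n: m.train(input_fn=generate_input_fn(train), steps=n),
#         eval_fn=lambda: m.evaluate(
#             input_fn=generate_input_fn(test, num_epochs=1), name="validation"),
#         log_fn=lambda k, v, s: mlflow.log_metric(k, v, step=s),
#         total_steps=1000,
#     )
```

Passing the step to mlflow.log_metric is what gives you one point per evaluation instead of a single final value. The trade-off is that this replaces train_and_evaluate, so each evaluation restores the model from the latest checkpoint written by the preceding train call.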