
Following the TensorFlow team's suggestion, I'm getting used to TensorFlow's eager execution with tf.keras. However, whenever I train a model, I receive a warning (EDIT: actually, the warning is repeated many times, more than once per training step, flooding my standard output):

E tensorflow/core/common_runtime/bfc_allocator.cc:373] tried to deallocate nullptr

The warning doesn't seem to affect the quality of the training, but I wonder what it means and whether it is possible to get rid of it.

I use a conda virtual environment with Python 3.7 and TensorFlow 1.12 running on a CPU. (EDIT: a test with Python 3.6 gives the same results.) A minimal script that reproduces the warnings follows. Interestingly, commenting out the line tf.enable_eager_execution() makes the warnings disappear.

import numpy as np
import tensorflow as tf

tf.enable_eager_execution()
N_EPOCHS = 50
N_TRN = 10000
N_VLD = 1000

# the label is positive if the input is a number larger than 0.5
# a little noise is added, just for fun
x_trn = np.random.random(N_TRN)
x_vld = np.random.random(N_VLD)
y_trn = ((x_trn + np.random.random(N_TRN) * 0.02) > 0.5).astype(float)
y_vld = ((x_vld + np.random.random(N_VLD) * 0.02) > 0.5).astype(float)

# a simple logistic regression
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(1, input_dim=1))
model.add(tf.keras.layers.Activation('sigmoid'))

model.compile(
    optimizer=tf.train.AdamOptimizer(),
    # optimizer=tf.keras.optimizers.Adam(),  # doesn't work at all with tf eager execution
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# Train model on dataset
model.fit(
    x_trn, y_trn,
    epochs=N_EPOCHS,
    validation_data=(x_vld, y_vld),
)
model.summary()
Gianluca Micchi
  • Try checking this: TensorFlow requires Python 3.4, 3.5, or 3.6 https://www.tensorflow.org/install/pip#1.-install-the-python-development-environment-on-your-system – Manualmsdos Feb 14 '19 at 16:20
  • Hi, I checked the python version and, funnily enough, it says 3.6.8. It feels like tensorflow silently changed the main interpreter version upon installation. Anyway, I tried with a brand new 3.6 conda environment and the problem still exists. – Gianluca Micchi Feb 15 '19 at 11:22

1 Answer


Quick solutions:

  • Downgrade to TF 1.11. The warning did not appear when I ran the same script in TF 1.11, and the optimization reached the same final validation accuracy on a synthetic dataset.

    OR

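  • Suppress the errors/warnings using the native os module (adapted from https://stackoverflow.com/a/38645250/2374160), i.e. set the TensorFlow logging environment variable so that no error messages are shown. The variable must be set before TensorFlow is imported, and at level 3 it hides all INFO, WARNING and ERROR log messages, not only this one.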

      import os
      os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
      import tensorflow as tf
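
The ordering in the snippet above matters: the variable only takes effect if it is set before TensorFlow is imported. As a minimal sketch (the helper name set_tf_log_level is hypothetical, not part of TensorFlow), the check could be wrapped like this:

```python
import os
import sys


def set_tf_log_level(level="3"):
    """Hypothetical helper: set TF_CPP_MIN_LOG_LEVEL for TensorFlow's C++ logging.

    Levels: "0" shows everything, "1" hides INFO, "2" also hides WARNING,
    "3" also hides ERROR.
    Returns True if TensorFlow had not been imported yet, i.e. the setting
    was made early enough to take effect.
    """
    already_imported = "tensorflow" in sys.modules
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return not already_imported


if not set_tf_log_level("3"):
    print("TensorFlow was imported first; the log level may be ignored.",
          file=sys.stderr)

# import tensorflow as tf  # import TensorFlow only after the call above
```

The helper only sets an environment variable, so it hides the message rather than fixing whatever causes it.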
    

More info:

  • Solving this error properly may require familiarity with the MKL library calls and how they interface with TensorFlow's core, which is written in C++ (this is beyond my current TF expertise).

  • In my case, this memory deallocation error occurred whenever the apply_gradients() method of an optimizer was called. In your script, it is called when the model is being fitted to the training data.

  • This error is raised from here: tensorflow/core/common_runtime/mkl_cpu_allocator.h
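
Since log level 3 hides every error message, a narrower workaround is to run the training script as a child process and drop only the lines containing this particular message. This is just a sketch using the standard library; the function name run_filtered and the script name train.py are assumptions for illustration:

```python
import subprocess
import sys

# Text of the message to hide, taken from the log line in the question.
NOISE = "tried to deallocate nullptr"


def run_filtered(cmd, noise=NOISE):
    """Run cmd, forwarding its stderr except for lines that contain noise.

    Returns a tuple (exit_code, number_of_dropped_lines).
    The child's stdout is inherited and passes through untouched.
    """
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE,
                            universal_newlines=True)
    dropped = 0
    for line in proc.stderr:
        if noise in line:
            dropped += 1
        else:
            sys.stderr.write(line)
    return proc.wait(), dropped


# Example call (train.py is a hypothetical script name):
# code, dropped = run_filtered([sys.executable, "train.py"])
```

Other errors still reach the terminal this way, at the cost of running the script through a wrapper.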

I hope this helps as a temporary solution for convenience.