The following code snippet crashes on the second-to-last line, where tf.train.latest_checkpoint() is called:

import tensorflow as tf
from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm
import quaternion

latest_checkpoint = tf.train.latest_checkpoint('checkpoints/default_model/run_000')
print(latest_checkpoint)

The output is:

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
*** Error in `.../anaconda3/envs/tensorflow_env/bin/python': double free or corruption (!prev): 0x0000000001c7a850 ***

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

I am using Python 3.5.2 and TensorFlow 0.12 (GPU version) in a conda virtual environment on Ubuntu 14.04. The import quaternion statement refers to the external library numpy-quaternion.

The error does not occur if either the batch_norm import or the quaternion import (the second or third line of the snippet above) is omitted. Does anybody know why this happens and how to fix it?

kafman

1 Answer

There are two possible ways to work around/fix the error:

Don't import batch_norm

Always refer to tf.contrib.layers.python.layers.batch_norm by its full name in the code and omit the import statement (admittedly, this creates a lot of clutter).

Set environment variable LD_PRELOAD

The following fix, posted by dennybritz on February 10th in this GitHub issue, helped:

sudo apt-get install libtcmalloc-minimal4
export LD_PRELOAD="/usr/lib/libtcmalloc_minimal.so.4"

Note that if you're using PyCharm, you either have to specify this environment variable in the run configuration (see this post) or, if you put the above export statement into your .bashrc, start PyCharm from the command line so that it inherits the environment variables (as explained in this post).

However, while this fixes the issue, I don't know why the error occurs in the first place, or whether it should be considered a bug and reported to the tensorflow or numpy-quaternion developers.
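
If configuring the IDE or the shell is inconvenient, the LD_PRELOAD workaround could also be applied from inside the script itself. The following is only a minimal sketch, not something tested against this exact setup: the helper names (preload_is_active, env_with_preload, ensure_preload) are made up for illustration, and the library path is simply the one from the export line above. LD_PRELOAD is read by the dynamic loader at process start, so the sketch re-executes the interpreter once with the variable set instead of changing os.environ in place.

```python
import os
import sys

# Path copied from the export line above; it is an assumption that the
# library lives there on your system.
TCMALLOC_PATH = "/usr/lib/libtcmalloc_minimal.so.4"


def preload_is_active(env, lib_path=TCMALLOC_PATH):
    """Return True if lib_path is already listed in LD_PRELOAD."""
    # The dynamic loader accepts both spaces and colons as separators.
    entries = env.get("LD_PRELOAD", "").replace(":", " ").split()
    return lib_path in entries


def env_with_preload(env, lib_path=TCMALLOC_PATH):
    """Return a copy of env with lib_path prepended to LD_PRELOAD."""
    new_env = dict(env)
    existing = new_env.get("LD_PRELOAD", "")
    new_env["LD_PRELOAD"] = lib_path + (":" + existing if existing else "")
    return new_env


def ensure_preload(lib_path=TCMALLOC_PATH):
    """Re-exec the interpreter once with LD_PRELOAD set, if needed."""
    # Do nothing if the variable is already set (this also prevents an
    # endless re-exec loop) or if the library is not installed.
    if preload_is_active(os.environ, lib_path) or not os.path.exists(lib_path):
        return
    os.execve(sys.executable, [sys.executable] + sys.argv,
              env_with_preload(os.environ, lib_path))
```

To use it, call ensure_preload() at the very top of the script, before import tensorflow and import quaternion. The re-exec rebuilds the command line from sys.argv, so it is meant for scripts started as python script.py; invocations via python -m or python -c would need different handling.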

kafman