I have successfully implemented and run an autoencoder on image data (MNIST digits). I use Spyder through Anaconda Navigator. I'm running Python 3.7.1.

I constructed a simple CNN following vetted examples. My code runs through model construction and loading of the training data (in this case, CIFAR10). When I call model.fit(), the code crashes with no error message and leaves no variables in the kernel.

  1. How might I monitor execution of this code to better understand why it is crashing?
  2. Have I coded something incorrectly that is causing the crash? Or is this perhaps an environment or memory error?

I have also copied similar code from presumably working CNN examples, and the published code crashes in the same way, although my autoencoder code works in the same environment.

Here is the relevant section of my code:

from keras.layers import Input, Dense, Flatten, Conv2D, MaxPooling2D
from keras.models import Model
from keras.utils import to_categorical
from keras.datasets import cifar10

proceedtofit = True

#define input shape

input=Input(shape=(32,32,3))

#define layers
predictions=Conv2D(16,(3,3),activation='relu',padding='same')(input)
predictions=MaxPooling2D(pool_size=(2,2),strides=None,padding='same')(predictions)
predictions=Conv2D(4,(3,3),activation='relu',padding='same')(predictions)
predictions=MaxPooling2D(pool_size=(2,2),strides=None,padding='same')(predictions)
predictions=Flatten()(predictions)  
predictions=Dense(32,activation='relu')(predictions)
predictions=Dense(10,activation='sigmoid')(predictions)

#integrate into model

model=Model(inputs=input,outputs=predictions)
#print("Successfully integrated model.")
model.summary()

#compile (choose optimizer and loss function)
model.compile(loss='categorical_crossentropy',metrics=['accuracy'],optimizer='adam')

#input training and test data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Scale pixel values to the range [0, 1].
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# Convert class vectors to binary class matrices.
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

#train model

if proceedtofit:
    model.fit(x_train, y_train, batch_size=10, epochs=50, shuffle=True,
              validation_data=(x_test, y_test))

print("Finished fit.")

The code executes in the kernel and produces the expected model summary. If proceedtofit is False, the code exits gracefully. If proceedtofit is True, the code calls the model.fit() method and crashes. The full output, start to finish, is:

Python 3.7.0 (default, Jun 28 2018, 07:39:16)
Type "copyright", "credits" or "license" for more information.

IPython 7.2.0 -- An enhanced Interactive Python.

runfile('/Users/Fox/Documents/Python Machine Learning/convclass.py', wdir='/Users/Fox/Documents/Python Machine Learning')
WARNING:tensorflow:From /Applications/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Using TensorFlow backend.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 16)        448       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 4)         580       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 4)           0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 32)                8224      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                330       
=================================================================
Total params: 9,582
Trainable params: 9,582
Non-trainable params: 0
_________________________________________________________________
(50000, 1)
(50000, 10)
WARNING:tensorflow:From /Applications/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 50000 samples, validate on 10000 samples
Epoch 1/50
2019-08-04 16:32:52.400023: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-08-04 16:32:52.400364: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 8. Tune using inter_op_parallelism_threads for best performance.

At this point, the code exits and returns me to the kernel prompt. Training (fitting) did not execute, and returned no error. The model is no longer present in memory. That is, calling model.summary() at the prompt yields the following error:

[1]:model.summary()
Traceback (most recent call last):

  File "<ipython-input-1-5f15418b3570>", line 1, in <module>
    model.summary()

NameError: name 'model' is not defined

Following a comment, I ran the code in a terminal. That produced more verbose output and an error report (see below). I don't understand it yet, but at least it is a place to start. Thoughts?

OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been 
linked into the program. That is dangerous, since it can degrade performance or 
cause incorrect results. The best thing to do is to ensure that only a single 
OpenMP runtime is linked into the process, e.g. by avoiding static linking of the 
OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround 
you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program 
to continue to execute, but that may cause crashes or silently produce incorrect 
results. For more information, please see 
http://www.intel.com/software/products/support/.
Abort trap: 6

I found a related question, "Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized", which looks promising. I will explore the suggestions offered there; perhaps this question should then be combined with that discussion?
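
For reference, the hint in the error output names an environment variable, KMP_DUPLICATE_LIB_OK, as a workaround. The sketch below shows one way to set it from Python. It assumes the OpenMP runtime reads the variable when it is first loaded, so the assignment has to come before any import that pulls the runtime in. The error message itself calls this unsafe and unsupported, so it is better treated as a diagnostic step than as a fix.

```python
import os

# Workaround named in the OMP hint; the message itself warns that it may
# cause crashes or silently incorrect results.
# Assumption: this must run before any library that loads OpenMP is imported.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Imports that load the OpenMP runtime would follow here, for example:
# from keras.models import Model
```

If training runs with the variable set, that suggests the duplicate runtime is indeed what aborts the process, and the environment still needs to be repaired.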

Kokomodo
  • Run the code the normal way: in a console/terminal/cmd.exe, run `python script.py`. It may show an error message there. – furas Aug 04 '19 at 23:27
  • The messages mention updates and deprecated functions. Maybe you have to update `Keras`. – furas Aug 04 '19 at 23:30
  • On the [Keras page](https://keras.io/) you can see `"Keras is compatible with: Python 2.7-3.6."`, so it may not be ready to work with Python 3.7. – furas Aug 04 '19 at 23:34
  • You might be running into memory problems. Could you try allocating more? – rvinas Aug 05 '19 at 09:48
  • Thank you, furas. Running in the terminal opened up more information. – Kokomodo Aug 05 '19 at 17:08

1 Answer

After running the code in a command shell rather than Spyder, I captured the error and identified a related question that had already been answered.
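
If a terminal is not handy, the same information can be captured from Python. The following is a minimal sketch, assuming a POSIX system such as macOS; `run_and_report` is a hypothetical helper name, not part of any library. It runs a script in a child interpreter with the standard `-X faulthandler` option, so that a fatal signal also dumps the Python stack, and then reports how the process ended.

```python
import signal
import subprocess
import sys


def run_and_report(script_path):
    """Run a script in a child interpreter and describe how it exited.

    Hypothetical helper for illustration; returns (description, stderr_text).
    """
    result = subprocess.run(
        # -X faulthandler makes CPython dump the Python traceback on fatal
        # signals such as SIGSEGV or SIGABRT.
        [sys.executable, "-X", "faulthandler", script_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    code = result.returncode
    if code < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number.
        description = "killed by signal " + signal.Signals(-code).name
    else:
        description = "exited with status {}".format(code)
    return description, result.stderr


# Example (path is a placeholder for your own script):
# outcome, errors = run_and_report("convclass.py")
# print(outcome)
# print(errors)
```

An "Abort trap: 6" in the terminal corresponds to SIGABRT, so a crash like this one would be reported here as killed by signal SIGABRT, with the OMP message in the captured stderr text.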

Based on the discussion in "Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized", I removed tensorflow using conda remove tensorflow and then reinstalled tensorflow and keras using

conda install -c tensorflow

and

conda install -c keras

I then reran the code and everything worked in both the command shell and in Spyder.
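
To check whether an environment might be affected, one rough approach is to look for more than one copy of the OpenMP runtime on disk. The sketch below is only a heuristic: `find_openmp_runtimes` is a hypothetical helper, and it assumes the runtime files are named with the `libiomp5` prefix that appears in the error message. Finding several files does not prove that more than one is loaded into the process (conda's package cache, for instance, may hold unused copies), but it shows where to look.

```python
import os
import sys


def find_openmp_runtimes(root=sys.prefix, stem="libiomp5"):
    """Return sorted paths under root whose file name starts with stem.

    Hypothetical helper for illustration; the default root is the prefix
    of the running Python environment.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(stem):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Example: list candidate copies in the current environment.
# for path in find_openmp_runtimes():
#     print(path)
```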

Kokomodo
  • I've tried using `conda install -c tensorflow` and `conda install -c keras` but I get the error `CondaValueError: too few arguments, must supply command line package specs or --file`. Am I missing something? – rzaratx Nov 14 '19 at 19:44