How to remedy "Segmentation fault (core dumped)" error when trying to fit a keras model in Python (Anaconda) on Ubuntu 18.04

Question

I have a new PC (running Ubuntu 18.04) with a 2080Ti GPU. I'm trying to set it up for training neural networks in Python using Keras (in an Anaconda environment), but I get a "Segmentation fault (core dumped)" error when trying to fit the model.

The code I'm using works fine on my Windows PC at work (which has a 1080Ti GPU). The error seems to be related to GPU memory, and I can see something odd in 'nvidia-smi': before fitting the model, around 800 MB of the available 11 GB of GPU memory is in use, but once I compile the model, the remaining memory is all taken up. In the processes section I can see that this is due to the Anaconda environment (i.e. ...ics-link/anaconda3/envs/py35/bin/python = 9677MiB).

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.25       Driver Version: 415.25       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  On   | 00000000:04:00.0  On |                  N/A |
| 28%   44C    P2    51W / 250W |  10491MiB / 10986MiB |      7%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1507      G   /usr/lib/xorg/Xorg                            30MiB |
|    0      1538      G   /usr/bin/gnome-shell                          57MiB |
|    0      1844      G   /usr/lib/xorg/Xorg                           309MiB |
|    0      1979      G   /usr/bin/gnome-shell                         177MiB |
|    0      3816      G   /usr/lib/firefox/firefox                       6MiB |
|    0      5451      G   ...-token=169F1B80118E535BC5002C22A81DD0FA    90MiB |
|    0      5896      G   ...-token=631C5DCD90ADCF80959770937CE797E7   128MiB |
|    0      6485      C   ...ics-link/anaconda3/envs/py35/bin/python  9677MiB |
+-----------------------------------------------------------------------------+

Here is the code, just for reference:

from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, Activation, BatchNormalization
from keras.callbacks import ModelCheckpoint, CSVLogger
from keras import backend as K
import numpy as np

batch_size = 64
num_classes = 10
epochs = 10

# input image dimensions
img_rows, img_cols = 32, 32

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 3, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 3, img_rows, img_cols)
    input_shape = (3, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 3)
    input_shape = (img_rows, img_cols, 3)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

# normalise pixel values
mean = np.mean(x_train,axis=(0,1,2,3))
std = np.std(x_train,axis=(0,1,2,3))
x_train = (x_train-mean)/(std+1e-7)
x_test = (x_test-mean)/(std+1e-7)

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))

model.add(Conv2D(64, (3, 3)))
#model.add(BatchNormalization())
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3)))
#model.add(BatchNormalization())
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(256, (3, 3)))
#model.add(BatchNormalization())
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())

model.add(Dense(1024))
model.add(Activation("relu"))
model.add(Dropout(0.25))

model.add(Dense(1024))
model.add(Activation("relu"))
model.add(Dropout(0.25))

model.add(Dense(1024))
model.add(Activation("relu"))
model.add(Dropout(0.25))

model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

#load weights from previous run
#model.load_weights('model07_weights_best.hdf5')

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0.1,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

# Compute quantities required for feature-wise normalization
# (std, mean, and principal components if ZCA whitening is applied).
datagen.fit(x_train)


#save weights and log
checkpoint = ModelCheckpoint("model14_weights_best.hdf5", monitor='val_acc', verbose=1, save_best_only=True, mode='max')
csv_logger = CSVLogger('model14_loss_log.csv', append=True, separator=';')
callbacks_list = [checkpoint,csv_logger]

# Fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    epochs=epochs,
                    validation_data=(x_test, y_test),
                    callbacks=callbacks_list)

I'm not expecting anything to take up a great deal of space on the GPU, but it seems to be saturated. As I mentioned, the same code works on my Windows PC.

Any ideas as to what might cause this?

I'll just add that I'm installing tensorflow-gpu and keras via Anaconda, and because of this I'm installing cuda and cudnn automatically — andrewjones54, Jan 24 '19 at 13:20

smerllo · Answer 1 · 2019-01-27T06:39:14.397

I don't believe this has anything to do with memory size. I have been dealing with this recently. In my experience, a segmentation fault here points to a failure in the parallelization of your training process on the GPU. You wouldn't get this error if the process were running sequentially, no matter how big your dataset is. There is also no need to worry about your deep learning settings.
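To see which Python call is executing when the native crash happens, the standard-library faulthandler module can dump the Python traceback on a segmentation fault. This is a minimal sketch, assuming Python 3.3 or later; the log file name is arbitrary and chosen only for illustration.

```python
import faulthandler

# Keep the file object alive for the lifetime of the process:
# faulthandler writes to its file descriptor when a crash occurs.
crash_log = open("segfault_traceback.log", "w")  # hypothetical file name

# Dump the Python traceback of all threads on SIGSEGV and similar fatal signals.
faulthandler.enable(file=crash_log, all_threads=True)

# ... build and fit the Keras model below this point ...
```

The same handler can be switched on without editing the script by running it as `python -X faulthandler your_script.py`, in which case the traceback goes to stderr.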

Since you are just setting up a new machine, I believe there are two possible reasons for the segmentation fault in your context.

First, I would check that the GPU is installed correctly. Based on the details you provided, though, I believe the issue is more likely the second reason, the module itself (Keras in your case):

In this case, you may have something odd in your installation of the module or one of its dependencies. I would recommend removing it, cleaning everything up, and reinstalling it.
Are you sure tensorflow-gpu is installed properly? What about cuda and cudnn?
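One way to confirm what is actually installed in the active environment is to import each package and print its version. This is only a sketch: the helper name `installed_version` and the package list are made up for illustration, and it does not check the cuda or cudnn versions.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__, 'unknown' if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # A broken install can fail in several ways, not only with ImportError.
        return None
    return getattr(module, "__version__", "unknown")


# Example package list; adjust it to the packages in your environment.
for name in ("tensorflow", "keras", "numpy"):
    print(name, installed_version(name))
```

A result of None for tensorflow or keras means the package cannot be imported from the environment you are running in, which is worth fixing before looking any further.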

If you believe Keras is correctly installed, try this test code:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

This prints the devices that TensorFlow can see, which tells you whether it is using a CPU or a GPU backend.
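The device list can be long, so it may help to filter it down to the GPU entries. The sketch below assumes the objects returned by `device_lib.list_local_devices()` expose `name` and `device_type` attributes; the helper name `gpu_device_names` is hypothetical.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not importable in this environment")
    else:
        names = gpu_device_names(device_lib.list_local_devices())
        print(names if names else "No GPU visible to TensorFlow")
```

If the list comes back empty, TensorFlow is running on the CPU only, and the problem lies in the GPU side of the installation rather than in the model code.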

I doubt you will see the segmentation fault again if all of the above steps went well.

Check this reference for testing TensorFlow on the GPU.

I agree with your statement that this error is not caused by out-of-memory, but disagree with two of your potential reasons. I ran the same project on the same machine successfully but just failed when I changed the model to perform a different task. — Fang WU, Dec 02 '22 at 05:54

chandrakant_k · Answer 2 · 2019-01-26T16:02:28.857

If it's a memory issue, you should be able to train with a lower batch size. Try reducing the batch size to 32, and if that doesn't work, keep reducing it down to a batch size of 1 while observing the GPU usage.
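The halving procedure and the memory check can be sketched as follows. The helper names are made up for illustration, and `gpu_memory_mib` assumes that `nvidia-smi` is on the PATH and supports the `--query-gpu` option.

```python
import subprocess


def halving_batch_sizes(start=64):
    """Yield start, start // 2, ... down to 1."""
    size = start
    while size >= 1:
        yield size
        size //= 2


def parse_memory_csv(text):
    """Parse 'used, total' lines (in MiB) as printed by the nvidia-smi query below."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory_mib():
    """Return [(used, total), ...] with one entry per GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"])
    return parse_memory_csv(out.decode())


# Batch sizes to try in order, starting from the value in the question.
print(list(halving_batch_sizes(64)))
```

Call `gpu_memory_mib()` after compiling the model at each batch size and compare the readings. If the used figure stays near the total regardless of batch size, that is TensorFlow reserving the memory up front rather than the model needing it.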

Also, add the following code at the top of your script so that GPU memory is allocated dynamically. That way you can see how much GPU memory is actually used or required with smaller batch sizes.

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # dynamically grow the memory used on the GPU
config.log_device_placement = True  # to log device placement (on which device the operation ran)
                                    # (nothing gets printed in Jupyter, only if you run it standalone)
sess = tf.Session(config=config)
set_session(sess)  # set this TensorFlow session as the default session for Keras

Source: https://github.com/keras-team/keras/issues/4161
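The snippet above uses the TensorFlow 1.x session API. On newer TensorFlow releases, the same on-demand allocation can be requested roughly as below. This is a sketch under assumptions: the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable and `tf.config.list_physical_devices` are not available in older versions, so the code guards for both.

```python
import os

# Must be set before TensorFlow is imported to take effect.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

# Newer releases expose the per-device setting under tf.config.
list_devices = getattr(getattr(tf, "config", None), "list_physical_devices", None)
if list_devices is not None:
    for gpu in list_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            # Raised if the GPU was already initialized before this call.
            print("Could not enable memory growth:", err)
```

Either mechanism alone should be enough; setting both is harmless and covers the case where one of them is unsupported.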

I hope it will help.
