
I have a Keras (with TensorFlow backend) model which is defined like so:

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

num_classes = 4  # dense_2 output shape is (None, 4) in the summary below

INPUT_SHAPE = [4740, 3540, 1]

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=INPUT_SHAPE))
model.add(Conv2D(2, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(8, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(16, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(32, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

This model has only 37,506 trainable params, yet somehow it manages to exhaust the K80's 12 GB of VRAM during model.fit() if the batch size is more than 1. Why does this model need so much memory? And how do I calculate its memory requirements properly? The function from How to determine needed memory of Keras model? gives me 2.15 GB per element in a batch, so I should be able to fit a batch of at least 5.

EDIT: model.summary()

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 4738, 3538, 32)    320       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4735, 3535, 2)     1026      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1183, 883, 2)      0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 1180, 880, 4)      132       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 295, 220, 4)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 292, 217, 8)       520       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 73, 54, 8)         0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 70, 51, 16)        2064      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 17, 12, 16)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 9, 32)         8224      
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 3, 2, 32)          0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 3, 2, 32)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 192)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               24704     
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 37,506
Trainable params: 37,506
Non-trainable params: 0
_________________________________________________________________
UpmostScarab
1 Answer

The output shape of the first layer is B × 4738 × 3538 × 32 (where B is the batch size). Stored as float32, that single activation tensor already takes around 2 GB × B of memory, and the gradients, the other layers' activations, and cuDNN workspace buffers add more on top of that. Increasing the stride of the first layer (or reducing its number of filters) should help.
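To see where the memory goes, you can sum the sizes of the activation tensors straight from the output shapes in model.summary(). This is a minimal sketch assuming float32 activations (4 bytes per element); it counts only the forward-pass activations, so gradients, optimizer state, and framework workspace are extra:

```python
# Per-layer output shapes copied from model.summary() above (batch dim omitted)
shapes = [
    (4738, 3538, 32),  # conv2d_1  <- the dominant tensor
    (4735, 3535, 2),   # conv2d_2
    (1183, 883, 2),    # max_pooling2d_1
    (1180, 880, 4),    # conv2d_3
    (295, 220, 4),     # max_pooling2d_2
    (292, 217, 8),     # conv2d_4
    (73, 54, 8),       # max_pooling2d_3
    (70, 51, 16),      # conv2d_5
    (17, 12, 16),      # max_pooling2d_4
    (14, 9, 32),       # conv2d_6
    (3, 2, 32),        # max_pooling2d_5 (dropout_1 has the same shape)
    (192,),            # flatten_1
    (128,),            # dense_1 (dropout_2 has the same shape)
    (4,),              # dense_2
]

BYTES_PER_FLOAT32 = 4

def activation_bytes(shape):
    """Bytes needed to store one sample's activation of this shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n * BYTES_PER_FLOAT32

first_layer = activation_bytes(shapes[0])
per_sample = sum(activation_bytes(s) for s in shapes)

print(f"conv2d_1 alone:  {first_layer / 1e9:.2f} GB per sample")  # ~2.15 GB
print(f"all activations: {per_sample / 1e9:.2f} GB per sample")
```

The first convolution alone accounts for roughly 2.15 GB per sample, which matches the estimate quoted in the question; with backprop needing to keep these activations around (plus gradients of similar size), a batch of 2 or more can plausibly overflow 12 GB.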