When training with Keras + tensorflow-gpu, I have to set batch_size to 128; that is the largest batch the GPU will accept, and anything bigger gives an OOM error. My question: with batch_size = 128, the batch tensor is 128 × 224 × 224 × 3 × 4 bytes (224 × 224 RGB images in float32), about 77 MB in total, which seems tiny compared to the memory of the GPU. Is there any explanation for this?
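That figure can be checked directly with NumPy (a minimal sketch; note the product comes to roughly 77 MB):

```python
import numpy as np

# One batch of 128 RGB images at 224x224, stored as float32 (4 bytes/element)
batch = np.zeros((128, 224, 224, 3), dtype=np.float32)
print(batch.nbytes)  # 77070336 bytes, i.e. about 73.5 MiB
```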
2 Answers
You are forgetting three more things that also require GPU memory:

- your model weights;
- temporary variables (such as saved forward-pass activations) used during the calculation of the gradients;
- many other minor allocations made by the framework.

The first two take up a huge chunk of memory, which is why the GPU can run out even though the batch itself consumes only tens of megabytes.
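A rough back-of-the-envelope sketch of where the memory goes during one training step (plain Python, assuming float32 and plain SGD; the parameter and activation counts below are illustrative assumptions, not measurements):

```python
BYTES = 4  # float32

def training_memory_bytes(n_params, n_activations, batch_size):
    """Rough lower bound on GPU memory needed for one training step.

    n_params      -- trainable parameters in the model
    n_activations -- activation elements per sample kept for backprop
    batch_size    -- samples per batch
    """
    weights   = n_params * BYTES                     # model weights
    gradients = n_params * BYTES                     # one gradient per weight
    acts      = n_activations * batch_size * BYTES   # saved forward activations
    return weights + gradients + acts

# Illustrative numbers in the ballpark of a VGG16-sized model
# (~138M parameters, ~15M activation elements per 224x224x3 image):
print(training_memory_bytes(138_000_000, 15_000_000, 128) / 2**30, "GiB")
```

Even with these rough numbers the total lands in the multi-gigabyte range, dwarfing the ~77 MB batch; real frameworks add optimizer state and workspace buffers on top of this.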

BattleTested_закалённый в бою

Vikas NS
great explanation – Xingyu Gu Jul 18 '18 at 19:07
The image is uint8, while the tensor is float64, which increases the size by eight times. The forward pass, gradients, and other tensors also use a significant chunk of memory.
You can compute the memory required for your model as given here
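The dtype blow-up is easy to demonstrate with NumPy (a minimal sketch; note that Keras actually defaults to float32, a 4× increase, so the 8× float64 figure above is the worst case):

```python
import numpy as np

img_u8  = np.zeros((224, 224, 3), dtype=np.uint8)  # 1 byte per element
img_f64 = img_u8.astype(np.float64)                # 8 bytes per element

print(img_u8.nbytes)                    # 150528 bytes
print(img_f64.nbytes // img_u8.nbytes)  # 8
```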

Mohbat Tharani