I am working on a deep learning project where I am trying different CNN architectures on CIFAR-10. I've built some custom functions and use nested for-loops to iterate over the architectures. The problem is that the 12 GB of RAM fills to nearly 100% and I cannot free it to continue. I am looking for a solution other than "reset your runtime environment": I want to free that memory, since 12 GB should be enough for what I am doing if it is managed correctly.

What I've done so far:

  • Added gc.collect() at the end of each training epoch
  • Added keras.backend.clear_session() after each model is trained
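
A minimal sketch of that cleanup pattern, not the actual project code: `run_experiments`, `builders` and `train` are hypothetical names, and the session-clearing function is passed in as a parameter (in a real run it would be `keras.backend.clear_session`). The sketch assumes that only small summaries, such as a final accuracy, are kept between runs rather than the models or their training histories.

```python
import gc


def run_experiments(builders, train, clear_session=None):
    """Train each architecture in turn and drop all references between runs.

    builders      -- hypothetical list of zero-argument functions, each returning a model
    train         -- hypothetical function that trains a model and returns a small summary
    clear_session -- optional callable, e.g. keras.backend.clear_session (assumption)
    """
    results = []
    for build in builders:
        model = build()
        # Keep only the small summary; storing the model or its full
        # history in a list would keep that memory alive.
        results.append(train(model))
        del model                # drop the last Python reference to the model
        if clear_session is not None:
            clear_session()      # reset any global state held by the framework
        gc.collect()             # collect reference cycles left behind
    return results
```

If memory still grows with a loop shaped like this, the remaining suspects are references held outside the loop, for example in lists of results or in notebook output history.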

I've also tried to inspect the sizes of the objects in locals() using

import sys

def sizeof_fmt(num, suffix='B'):
    '''Format a byte count as a human-readable string.

    By Fred Cirera, https://stackoverflow.com/a/1094933/1870254, modified.
    '''
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(num) < 1024.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f %s%s" % (num, 'Yi', suffix)

# Print the ten largest names in the current scope, by shallow size
for name, size in sorted(((name, sys.getsizeof(value)) for name, value in locals().items()),
                         key=lambda x: -x[1])[:10]:
    print("{:>30}: {:>8}".format(name, sizeof_fmt(size)))

Which yields

  • xtrain: 1.1 GiB
  • xtest: 234.4 MiB
  • _i13: 3.4 KiB

So I cannot tell what is holding the other 10 GB in my current session.
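
One possible reason the numbers do not add up is that `sys.getsizeof` reports only an object's own size, not the size of the objects it refers to, so a list of per-epoch results or stored models looks tiny in the listing above. The sketch below illustrates the gap with a hypothetical `deep_sizeof` helper and a made-up `history` list. It only follows built-in containers, and memory allocated by native code (as a deep learning framework may do) would not appear in either number.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Sum sys.getsizeof over an object and the built-in containers it holds.

    Hypothetical helper for illustration: it follows dict, list, tuple,
    set and frozenset only, and counts each object once.
    """
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


# Made-up stand-in for accumulated training results: 100 lists of 1000 floats
history = [[float(i)] * 1000 for i in range(100)]

shallow = sys.getsizeof(history)  # size of the outer list object only
deep = deep_sizeof(history)       # outer list plus everything it contains

print("shallow:", shallow, "bytes")
print("deep:   ", deep, "bytes")
```

The shallow figure covers just the outer list's own storage, while the deep figure includes the inner lists, which is where nearly all of the memory sits.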

nico_so
  • What is your batch size? And are you using a data generator? – claymorehack Oct 01 '21 at 13:27
  • CIFAR-10 is a small dataset; it doesn't need 12 GB of RAM. There must be something wrong with your code. – Adarsh Wase Oct 01 '21 at 13:29
  • Batch size is 100. But the RAM remains allocated even after I have finished training some models. The memory is not released when training is done, and I cannot figure out what is holding it or why. – nico_so Oct 01 '21 at 13:33
  • @nico_so What is your model's total parameter count? After training, Google Colab will free the resources (sometimes it takes a couple of minutes). If you don't use a data generator, the whole dataset is held in RAM, so you can run into memory problems. – claymorehack Oct 01 '21 at 13:59
  • Hi! The biggest model has about 8700202 params. I am not using a data generator; I thought CIFAR-10 was small enough to be handled without one. – nico_so Oct 01 '21 at 14:30

0 Answers