I have a program that can handle at most 10 million variables because of memory limitations. I need to raise that to 20 million while using the same amount of memory.
So what is the best way to do that in C++?
Are there any libraries for it?
Also, do calculations with half-precision data types take less time?
Please also mention whether CUDA supports half-precision data types.