I am trying to teach myself how to write faster code (code that executes fewer instructions). I want to create an artificial neural network (ANN). If you know nothing about ANNs, you may still be able to help me, as my question pertains more to writing fast code than to ANNs. Basically, I am going to have a big array of doubles that I need to perform lots of math on. I could allocate my arrays like this:
class Network {
    double *weights;
    double *outputs;

public:
    // Initialize the pointers so the destructor is safe to run
    // even when this default constructor is used.
    Network() : weights(nullptr), outputs(nullptr)
    {
    }

    Network(int *sizes, int numofLayers)
    {
        int sum = 0;
        int neuron_count = 0;
        // This just ensures my weight array is the right size.
        for (int i = 0; i < numofLayers - 1; i++)
        {
            neuron_count += sizes[i];
            sum = sum + sizes[i] * sizes[i + 1];
        }
        neuron_count += sizes[numofLayers - 1];
        weights = new double[sum];
        outputs = new double[neuron_count];
    }

    ~Network()
    {
        delete[] weights;
        delete[] outputs;
    }
};
However, I dislike this because I use "new", and I know I will probably open myself up to a bunch of memory management problems later on. I know that stack allocation is faster and that I shouldn't use dynamic memory if I can help it, based on this excerpt:
"Dynamic memory allocation is done with the operators new and delete or with the functions malloc and free. These operators and functions consume a significant amount of time. A part of memory called the heap is reserved for dynamic allocation. The heap can easily become fragmented when objects of different sizes are allocated and deallocated in random order. The heap manager can spend a lot of time cleaning up spaces that are no longer used and searching for vacant spaces. This is called garbage collection. Objects that are allocated in sequence are not necessarily stored sequentially in memory. They may be scattered around at different places when the heap has become fragmented. This makes data caching inefficient"
— Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms, by Agner Fog.
However, the arrays weights and outputs will be used by many functions I'd like to create in the class Network; if they were local to one function, they would be deallocated when it returns, and that won't work. I feel stuck between using the new keyword and making one gigantic function containing pretty much the entire neural network.
In a normal circumstance, I would say readability is more important for maintaining code, but I am not worried about that here, as this project is for learning how to write fast code. If people who write code for time-critical algorithms write big functions in order to make things fastest, that's fine. I'd just like to know.
On the other hand, if my arrays are only allocated once and used throughout the whole program, is it smarter to accept heap allocation, since that problematic overhead happens only once, and then focus on intrinsics or similar in the math-heavy code? Are there any other downsides? For example, once my heap memory has been pulled into the cache by repeated access, is working on it any more expensive than working on stack memory that has likewise been cached (given that the heap allocations never change after startup)? Do people who write very fast code avoid new entirely? If so, what alternatives do I have for organizing my program this way while still keeping my arrays an arbitrary size specified by the user?