I am running a program, where I keep incrementing "long int" in a big loop (the value is expected up to 10^8). The "long int" in question is initialized to 0. My console print looks like this :
errorous messages : 400000/5000 = 800000 instances
Please notice the division is incorrect. Code printing the line above :
std::cout << "errorous messages : " << total_error << "/" << GRID_SIZE << " = " << (long)((long)total_error / (long)GRID_SIZE) << " instances" << std::endl;
Where the variables in question are :
#define BLOCKS 50
#define THREADS 100
#define GRID_SIZE BLOCKS*THREADS
and
long int total_error; <--- incremented in a loop (never decremented, no overflow)
What I have tried
I have played around with recasting the division of (long)((long)total_error / (long)GRID_SIZE)
to (long)(total_error / GRID_SIZE)
and some others, the result is the same.
Compilation Info
/opt/ohpc/pub/mpi/openmpi-gnu/1.10.6/bin/mpicxx
-I../../common/inc -I/usr/local/cuda-8.0/include
-I/export/home/ra2/test_cuda/test_cuda_v0_1/source_code
-I/export/home/ra2/test_cuda/test_cuda_v0_1/source_code/Utility
-I/export/home/ra2/test_cuda/test_cuda_v0_1/source_code/Data_objects
-I/export/home/ra2/test_cuda/test_cuda_v0_1/source_code/cereal
-std=c++11 -o main.o -c main.cpp
I am compiling with OpenMPI. There is CUDA as well, but this is main.cpp and there is no CUDA code.
QUESTION
What am I missing ? Why do I get wrong result for such a trivial operation ?
Justification of the question
The proposed duplicate is not related to my issue, as it defines macro functions, furthermore it doesnt explain why pre-processor behaves the way it does. It merely explains the way the macro function is executed.
My question is well explained and backed up by code producing the explained behavior. Please see how people answering this question had no issues understanding the cause of the problem.