I am implementing fixed-point arithmetic following the slides in [1]. Everything works as it should. My question is this: the slides, like every other resource I have read, say that there is a good chance of overflow when multiplying or dividing fixed-point numbers, so they suggest casting the operands to a bigger type, doing the operation, and then casting back. Like,
(INT32)(((INT64)a * (INT64)b) >> N)
instead of just,
((a * b) >> N)
This works for 8-, 16-, and 32-bit integers, but how do I handle overflow for 64-bit integers? There is no standard 128-bit integer type (AFAIK GCC has 128-bit integers, but they are not portable).
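One portable workaround, sketched here as an illustration and not taken from the slides, is to split each 64-bit operand into 32-bit halves, build the full 128-bit product from four partial products, and then shift that product right by N. The names `U128`, `mul64x64`, and `fixmul64` are hypothetical, and the sketch assumes 1 <= N <= 63, a two's-complement target, and that the final result fits in 64 bits (it does not detect overflow of the result itself).

```cpp
#include <cstdint>

// Hypothetical helper type: a 128-bit unsigned value as two 64-bit halves.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Full 64x64 -> 128-bit unsigned multiply using four 32x32 partial products.
inline U128 mul64x64(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = 0xFFFFFFFFu;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

    const std::uint64_t p0 = a_lo * b_lo;  // contributes to bits 0..63
    const std::uint64_t p1 = a_lo * b_hi;  // contributes to bits 32..95
    const std::uint64_t p2 = a_hi * b_lo;  // contributes to bits 32..95
    const std::uint64_t p3 = a_hi * b_hi;  // contributes to bits 64..127

    // Sum of everything landing in bits 32..63, including carries upward.
    const std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

    U128 r;
    r.lo = (mid << 32) | (p0 & mask);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

// Signed fixed-point multiply with n fraction bits (assumes 1 <= n <= 63).
// Works on magnitudes, so the result is truncated toward zero, unlike
// an arithmetic >> on a negative product, which rounds toward -infinity.
inline std::int64_t fixmul64(std::int64_t a, std::int64_t b, unsigned n) {
    const bool negative = (a < 0) != (b < 0);
    const std::uint64_t ua =
        a < 0 ? 0 - static_cast<std::uint64_t>(a) : static_cast<std::uint64_t>(a);
    const std::uint64_t ub =
        b < 0 ? 0 - static_cast<std::uint64_t>(b) : static_cast<std::uint64_t>(b);

    const U128 p = mul64x64(ua, ub);

    // Take bits n..n+63 of the 128-bit product; higher bits are dropped,
    // so a result that does not fit in 64 bits is NOT detected here.
    const std::uint64_t mag = (p.hi << (64 - n)) | (p.lo >> n);

    // Assumes two's-complement conversion (guaranteed only since C++20).
    return negative ? static_cast<std::int64_t>(0 - mag)
                    : static_cast<std::int64_t>(mag);
}
```

For example, with N = 32, multiplying 3.5 by 2.0 through `fixmul64` gives 7.0, whereas the intermediate `a * b` in the plain form overflows 64 bits.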
I would also like the constructor to automatically calculate the number of fraction bits required for a user-supplied epsilon (the minimum required fractional accuracy). For example,
if 0.01 accuracy is required, 6 bits is enough for N (since 1/64 ≈ 0.0156). I couldn't figure out the logic for converting an accuracy to the required number of bits.
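A possible way to derive N from epsilon, given here as a sketch under stated assumptions: with N fraction bits the step between representable values is 2^-N, so the worst-case error is 2^-N if values are truncated and 2^-(N+1) if they are rounded to nearest. The smallest suitable N is therefore ceil(log2(1/epsilon)) for truncation, and one bit less for rounding. The function name `fractional_bits_for` and its `round_to_nearest` parameter are hypothetical; the loop below avoids `log2` so that floating-point rounding near exact powers of two cannot change the answer.

```cpp
// Smallest number of fraction bits N whose worst-case representation
// error does not exceed epsilon.
//   round_to_nearest == false: error bound is 2^-N     (truncation)
//   round_to_nearest == true : error bound is 2^-(N+1) (rounding)
// Returns -1 for a non-positive (or NaN) epsilon; caps the result at 63.
inline int fractional_bits_for(double epsilon, bool round_to_nearest = false) {
    if (!(epsilon > 0.0)) {
        return -1;  // invalid request
    }
    double worst_error = round_to_nearest ? 0.5 : 1.0;  // error bound at N = 0
    int n = 0;
    while (worst_error > epsilon && n < 63) {
        worst_error *= 0.5;  // each extra fraction bit halves the bound
        ++n;
    }
    return n;
}
```

Under these assumptions, an epsilon of 0.01 needs 7 bits with truncation (1/128 ≈ 0.0078) and 6 bits with round-to-nearest, so the 6-bit figure in the example corresponds to the rounding interpretation.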
[1] http://jet.ro/files/The_neglected_art_of_Fixed_Point_arithmetic_20060913.pdf