Is there a well-known and efficient method for vectorizing multiplications of (two arrays of) unsigned 64-bit integers yielding 128-bit integers?
I found this thread, but it only discusses doing the multiplication with a single instruction.
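For reference, here is a minimal scalar sketch of the operation in question, assuming the 128-bit results are stored as separate high and low 64-bit arrays. The function names `mul_u64_wide` and `mul_u64_wide_array` and the split output layout are hypothetical, chosen only for illustration. It builds each product from four 32x32 -> 64-bit partial products, which is the form that usually maps onto SIMD widening multiplies (for example `_mm256_mul_epu32` on AVX2); it is a baseline to compare against, not a vectorized implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: full 64x64 -> 128-bit unsigned product,
 * built from four 32x32 -> 64-bit partial products. */
static void mul_u64_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p00 = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t p01 = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t p10 = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t p11 = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum of the middle 32-bit column; at most 3 * (2^32 - 1),
     * so it cannot overflow 64 bits. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;

    *lo = (mid << 32) | (uint32_t)p00;
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Hypothetical array wrapper: hi[i]:lo[i] = a[i] * b[i] for i in [0, n). */
static void mul_u64_wide_array(const uint64_t *a, const uint64_t *b,
                               uint64_t *hi, uint64_t *lo, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        mul_u64_wide(a[i], b[i], &hi[i], &lo[i]);
}
```

On compilers that provide `unsigned __int128` (GCC and Clang on 64-bit targets), the body of `mul_u64_wide` can be replaced by a single widening multiply, which is presumably the single-instruction approach the linked thread discusses.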