I have the following code to expand each bit of a byte into a byte of its own, packed into a 64-bit word.
__device__ UINT64 bitToByte(const UINT8 input) {
    UINT64 b = ((0x8040201008040201ULL * input) >> 7) & 0x0101010101010101ULL;
    // TODO: reverse the byte order (this step is missing)
    return b;
}
However, the bytes come out in the wrong order: the most significant bit of the input ends up in the least significant byte of the result.
On the CPU I can simply do a bswap reg,reg
to fix this, but what do I do on the GPU?
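For reference, here is a minimal sketch of one possible GPU-side byte swap, assuming the CUDA `__byte_perm` intrinsic is acceptable. With the selector `0x0123` it reverses the four bytes of its first argument, so two calls plus a swap of the halves cover 64 bits. The helper name `bswap64`, the `HD` macro and the use of `uint64_t` instead of `UINT64` are choices made for this example only. The shift-and-mask branch exists just so the same function also compiles for the host.

```cpp
#include <cstdint>

// HD is a hypothetical convenience macro: it lets this sketch build with nvcc
// as well as with a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: reverse the byte order of a 64-bit value.
HD inline uint64_t bswap64(uint64_t x) {
#ifdef __CUDA_ARCH__
    // Device path: selector 0x0123 makes __byte_perm return the bytes of its
    // first argument in reverse order.
    uint32_t lo = __byte_perm(static_cast<uint32_t>(x), 0u, 0x0123u);
    uint32_t hi = __byte_perm(static_cast<uint32_t>(x >> 32), 0u, 0x0123u);
    // The reversed low word becomes the new high word, and vice versa.
    return (static_cast<uint64_t>(lo) << 32) | hi;
#else
    // Host path: swap adjacent bytes, then 16-bit pairs, then the 32-bit halves.
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
#endif
}
```

In `bitToByte` this would presumably be applied as `return bswap64(b);` in place of the missing step.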
Alternatively, what similar trick can I use so that the bytes come out the right way round, i.e. the most significant bit goes to the most significant byte, so that I don't need a bswap at all?
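A sketch of one candidate for such a trick, offered as an assumption rather than a known-best answer: replicate the input into all eight byte lanes, keep only bit k in lane k, then normalise every non-zero lane to 1. The function name `bitToByteMsbFirst`, the `HD` macro and the `<cstdint>` types are made up for this example.

```cpp
#include <cstdint>

// HD is a hypothetical convenience macro: it lets this sketch build with nvcc
// as well as with a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical variant of bitToByte: bit k of the input lands in byte k of the
// result, so the most significant bit goes to the most significant byte.
HD inline uint64_t bitToByteMsbFirst(uint8_t input) {
    // Copy the input into every byte lane, then keep bit k only in lane k
    // (mask byte 0x01 in lane 0 up to 0x80 in lane 7).
    uint64_t spread = (0x0101010101010101ULL * input) & 0x8040201008040201ULL;
    // Adding 0x7F sets a lane's top bit exactly when the lane is non-zero.
    // A lane holds at most 0x80, so the sum is at most 0xFF and never carries
    // into the next lane. Shift that top bit down to bit 0 and mask.
    return ((spread + 0x7F7F7F7F7F7F7F7FULL) >> 7) & 0x0101010101010101ULL;
}
```

Under these assumptions, an input of `0x80` gives `0x0100000000000000` and an input of `0x01` gives `0x0000000000000001`, with no separate byte swap.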