Sample code that moves the 16 even-indexed bits of a uint32_t into the upper half and the 16 odd-indexed bits into the lower half:
uint32_t separateBits(uint32_t x)
{
    uint32_t even = 0, odd = 0;
    for (int i = 0; i < 32; i += 2)
    {
        /* 1u rather than 1: shifting a signed 1 into bit 31 is undefined behaviour */
        even |= (x & (1u << (i + 0))) >> (0 + i / 2);
        odd  |= (x & (1u << (i + 1))) >> (1 + i / 2);
    }
    return (even << 16) | odd;
}
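To make the intended bit mapping concrete, here is a small self-checking sketch. The names separateBitsRef and checkSeparateBits are hypothetical and only used for illustration; the loop is the same as above, and the expected values follow directly from "even bits go to the top half, odd bits go to the bottom half".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference copy of the loop version, used only for the checks below. */
static uint32_t separateBitsRef(uint32_t x)
{
    uint32_t even = 0, odd = 0;
    for (int i = 0; i < 32; i += 2)
    {
        even |= (x & (1u << (i + 0))) >> (0 + i / 2);
        odd  |= (x & (1u << (i + 1))) >> (1 + i / 2);
    }
    return (even << 16) | odd;
}

/* Hypothetical helper: asserts a few input/output pairs worked out by hand. */
static void checkSeparateBits(void)
{
    assert(separateBitsRef(0x55555555u) == 0xFFFF0000u); /* all even bits set */
    assert(separateBitsRef(0xAAAAAAAAu) == 0x0000FFFFu); /* all odd bits set  */
    assert(separateBitsRef(0x00000001u) == 0x00010000u); /* bit 0  -> bit 16  */
    assert(separateBitsRef(0x00000002u) == 0x00000001u); /* bit 1  -> bit 0   */
    assert(separateBitsRef(0x40000000u) == 0x80000000u); /* bit 30 -> bit 31  */
    assert(separateBitsRef(0x80000000u) == 0x00008000u); /* bit 31 -> bit 15  */
}
```

Any faster replacement should pass the same checks, so calling checkSeparateBits() from a test program is a cheap way to compare candidates against the loop.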
Are there any more efficient approaches, or any bit hacks that perform the same operation faster? This is mainly for mobile devices, so ARM-specific instructions are fine.
Clang already seems to unroll the loop when compiling for ARM, as the generated code looks nothing like the original C code.
For example, on Intel CPUs that support BMI2 it can be done this way:
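#include <immintrin.h>  /* _pext_u32 (BMI2) */

uint32_t separateBits(uint32_t x)
{
    uint32_t even = _pext_u32(x, 0x55555555);  /* gather bits 0, 2, 4, ... */
    uint32_t odd  = _pext_u32(x, 0xaaaaaaaa);  /* gather bits 1, 3, 5, ... */
    return (even << 16) | odd;
}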
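For readers without a BMI2 machine, the following is a rough portable sketch of what _pext_u32 computes: it collects the bits of x selected by the mask and packs them contiguously into the low bits of the result. The names pext_u32_soft and separateBitsSoft are hypothetical, and this emulation is only meant to illustrate the semantics; it is not expected to be fast.

```c
#include <stdint.h>

/* Hypothetical helper: bit-by-bit software emulation of parallel bit extract. */
static uint32_t pext_u32_soft(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t out_bit = 1u;                     /* next output position to fill */
    while (mask != 0)
    {
        uint32_t lowest = mask & (0u - mask);  /* isolate lowest set mask bit */
        if (x & lowest)
            result |= out_bit;                 /* copy the selected bit of x */
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}

/* Same structure as the intrinsic version, built on the emulation above. */
static uint32_t separateBitsSoft(uint32_t x)
{
    uint32_t even = pext_u32_soft(x, 0x55555555u);
    uint32_t odd  = pext_u32_soft(x, 0xaaaaaaaau);
    return (even << 16) | odd;
}
```

This makes it easier to see what an ARM equivalent would have to reproduce: two 16-bit gathers, one per mask, followed by a shift and an OR.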