3

I want to stretch a mask so that every bit of the input becomes 4 bits of the stretched mask. I am looking for an elegant bit manipulation that does this in C++ and SystemC.

for example:

input:

mask (32 bits) = 0x0000CF00

output:

stretched mask (128 bits) = 0x00000000 00000000 FF00FFFF 00000000

To clarify the example, look at the hex digit C:

0xC = 1100 after stretching: 1111111100000000 = 0xFF00
Noa Yehezkel
  • Is `_pdep_u32` allowed? – harold Feb 06 '17 at 16:02
  • Do you want to stretch any amount of bits like 17, 78, ... or do you only need multiples of 16 or 32? – izlin Feb 06 '17 at 16:13
  • Can you explain what the underlying problem that you're actually solving is? Stretching a mask like this sounds like a very strange operation. – abelenky Feb 06 '17 at 17:07
  • 1
    http://stackoverflow.com/questions/11815894/how-to-read-write-arbitrary-bits-in-c-c/27592777#27592777 – dtech Feb 07 '17 at 06:16

5 Answers

3

Doing this elegantly is not easy. The simplest approach is probably a loop that shifts the result and tests one bit at a time:

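sc_biguint<128> result = 0;
for (int i = 31; i >= 0; i--) {    // most significant mask bit first
    result <<= 4;                  // make room for the next nibble
    if (bit_test(var, i)) {        // true when bit i of the 32-bit mask is set
        result |= 0xF;
    }
}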
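
For readers without SystemC at hand, the same loop idea can be sketched in plain C++. This is only an illustration under stated assumptions: the name `stretch_loop` and the choice of four 32-bit words (least significant word first) to hold the 128-bit result are hypothetical and not part of the answer.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: stretches a 32-bit mask into 128 bits, stored as
// four 32-bit words with the least significant word at index 0.
std::array<std::uint32_t, 4> stretch_loop(std::uint32_t mask)
{
    std::array<std::uint32_t, 4> out{};  // all words start at zero
    for (int i = 0; i < 32; ++i) {
        if ((mask >> i) & 1u) {
            // Input bit i covers output bits 4*i .. 4*i+3, which is
            // nibble (i % 8) of word (i / 8).
            out[i / 8] |= std::uint32_t{0xF} << (4 * (i % 8));
        }
    }
    return out;
}
```

With the mask from the question, `stretch_loop(0x0000CF00)` leaves words 0, 2 and 3 at zero and sets word 1 to `0xFF00FFFF`.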
Donald Duck
rodrigo
3

Here's a way of stretching a 16-bit mask into 64 bits where every bit represents 4 bits of stretched mask:

uint64_t x = 0x000000000000CF00LL;

x = (x | (x << 24)) & 0x000000ff000000ffLL;
x = (x | (x << 12)) & 0x000f000f000f000fLL;
x = (x | (x << 6)) & 0x0303030303030303LL;
x = (x | (x << 3)) & 0x1111111111111111LL;
x |= x << 1;
x |= x << 2;

It starts off with the mask in the bottom 16 bits. Then it moves the top 8 bits of the mask into the top 32 bits, like this:

0000000000000000 0000000000000000 0000000000000000 ABCDEFGHIJKLMNOP

becomes

0000000000000000 00000000ABCDEFGH 0000000000000000 00000000IJKLMNOP

Then it solves the same kind of problem one level down: stretching an 8-bit mask from the bottom of a 32-bit word, done for the top and bottom 32-bit halves simultaneously:

000000000000ABCD 000000000000EFGH 000000000000IJKL 000000000000MNOP

Then it does it for 4 bits inside 16 and so on until the bits are spread out:

000A000B000C000D 000E000F000G000H 000I000J000K000L 000M000N000O000P

Then it "smears" them across 4 bits by ORing the result with itself twice:

AAAABBBBCCCCDDDD EEEEFFFFGGGGHHHH IIIIJJJJKKKKLLLL MMMMNNNNOOOOPPPP

You could extend this to 128 bits by adding an extra first step where you shift by 48 bits and mask with a 128-bit constant:

x = (x | (x << 48)) & 0x000000000000ffff000000000000ffffLLL;

You'd also have to stretch the other constants out to 128 bits just by repeating the bit patterns. As far as I know, standard C++ has no way to write a 128-bit integer constant, so the line above is pseudocode; perhaps you could build the constant with macros or something (see this question). You could also make a 128-bit version just by using the 64-bit version on the top and bottom 16 bits separately.
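
The last suggestion, running the 64-bit version on each 16-bit half, might look roughly like the sketch below. The names `stretch16`, `stretch32` and `Mask128` are hypothetical and chosen only for illustration; the shift-and-mask steps are the ones listed earlier in this answer.

```cpp
#include <cstdint>

// Stretches the low 16 bits of x into 64 bits, one nibble per input bit.
std::uint64_t stretch16(std::uint64_t x)
{
    x &= 0xFFFFULL;                                // keep only the 16 mask bits
    x = (x | (x << 24)) & 0x000000ff000000ffULL;
    x = (x | (x << 12)) & 0x000f000f000f000fULL;
    x = (x | (x << 6))  & 0x0303030303030303ULL;
    x = (x | (x << 3))  & 0x1111111111111111ULL;  // one bit per nibble
    x |= x << 1;                                   // smear each bit across
    x |= x << 2;                                   // its whole nibble
    return x;
}

// Hypothetical container for the 128-bit result (an assumption, since
// standard C++ has no built-in 128-bit integer type).
struct Mask128 {
    std::uint64_t hi;  // stretched upper 16 mask bits
    std::uint64_t lo;  // stretched lower 16 mask bits
};

Mask128 stretch32(std::uint32_t mask)
{
    return Mask128{stretch16(mask >> 16), stretch16(mask & 0xFFFFu)};
}
```

For the question's input, `stretch32(0x0000CF00)` gives `hi == 0` and `lo == 0xFF00FFFF00000000`.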

If loading the masking constants turns out to be a difficulty or bottleneck you can generate each one from the previous one using shifting and masking:

uint64_t m = 0x000000ff000000ffLL;

m &= m >> 4; m |= m << 16;  // gives 0x000f000f000f000fLL
m &= m >> 2; m |= m << 8;  // gives 0x0303030303030303LL
m &= m >> 1; m |= m << 4; // gives 0x1111111111111111LL
samgak
  • The last two instructions `(x |= x<<2; x |= x<<1)` can be replaced by `x*=0xf` – MSalters Feb 22 '17 at 16:31
  • It looks like the masks can also be pairwised combined. That is to say, you can start with `(x *= (1+1ULL<<12+1ULL<<24+1ULL<<36)`. You'll have a few positions where bits collide, but you mask those out anyway. – MSalters Feb 22 '17 at 16:34
2

Does this work for you?

#include <stdio.h>

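/* Stretches a 16-bit mask to 64 bits: each input bit becomes one nibble. */
unsigned long long Stretch4x(unsigned input)
{
    unsigned long long output = 0;

    while (input)
    {
        unsigned b = input & -input;       /* isolate the lowest set bit */
        unsigned long long s = b * 15ULL;  /* four adjacent copies of it */
        input &= ~b;                       /* clear that bit in the input */
        while (b >>= 1)
        {
            s <<= 3;                       /* bit k ends up in nibble k */
        }

        output |= s;
    }
    return output;
}

int main(void) {
    unsigned input = 0xCF00;

    printf("0x%x ==> 0x%llx\n", input, Stretch4x(input));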
    return 0;
}

Output:

0xcf00 ==> 0xff00ffff00000000
abelenky
2

The other solutions are good. However, most of them are more C than C++. This solution is pretty straightforward: it uses std::bitset and sets four bits for each input bit.

#include <bitset>
#include <iostream>

std::bitset<128> 
starch_32 (const std::bitset<32> &input)
{
    std::bitset<128> output;

    for (size_t i = 0; i < input.size(); ++i) {
        // If `input[N]` is `true`, set `output[N*4, N*4+4]` to true.
        if (input.test (i)) {
            const size_t output_index = i * 4;

            output.set (output_index);
            output.set (output_index + 1);
            output.set (output_index + 2);
            output.set (output_index + 3);
        }
    }

    return output;
}

// Example with 0xC. 
int main() {
    std::bitset<32> input{0b1100};

    auto result = starch_32 (input);

    std::cout << "0x" << std::hex << result.to_ullong() << "\n";
}
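
One caveat with printing through `to_ullong()`: it throws `std::overflow_error` when the value does not fit in an `unsigned long long`, which happens here as soon as any of the upper 64 output bits is set. A minimal sketch of a workaround follows, assuming a hypothetical helper named `to_hex` that formats all 128 bits one nibble at a time.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper: formats a 128-bit bitset as 32 lowercase hex
// digits, most significant digit first.
std::string to_hex(const std::bitset<128> &bits)
{
    static const char digits[] = "0123456789abcdef";
    std::string text;
    for (std::size_t nibble = 32; nibble-- > 0;) {   // nibble 31 down to 0
        unsigned value = 0;
        for (std::size_t bit = 0; bit < 4; ++bit) {
            if (bits.test(nibble * 4 + bit)) {
                value |= 1u << bit;                  // rebuild the nibble
            }
        }
        text += digits[value];
    }
    return text;
}
```

In the example `main`, `std::cout << "0x" << to_hex(result) << "\n";` could then replace the `to_ullong()` line.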

Shmuel H.
1

On x86 you could use the PDEP intrinsic to move the 16 mask bits into the correct nibble (into the low bit of each nibble, for example) of a 64-bit word, and then use a couple of shift + or to smear them into the rest of the word:

unsigned long long x = _pdep_u64(m, 0x1111111111111111ULL);  // needs BMI2 and <immintrin.h>
x |= x << 1;
x |= x << 2;

You could also replace the two ORs and two shifts with a single multiplication by 0xF, which accomplishes the same smearing.
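
To try the idea on a machine without BMI2, the deposit step can be emulated in software. The sketch below is an assumption-laden illustration, not the intrinsic itself: `pdep_soft`, `smear_mul` and `stretch16_soft` are hypothetical names, and `pdep_soft` is a plain loop that mimics what PDEP does (it will be far slower than the instruction).

```cpp
#include <cstdint>

// Hypothetical software stand-in for PDEP: places the low bits of src,
// in order, at the positions of the set bits of mask.
std::uint64_t pdep_soft(std::uint64_t src, std::uint64_t mask)
{
    std::uint64_t result = 0;
    for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
        std::uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
        if (src & bit) {
            result |= lowest;
        }
        mask &= mask - 1;                           // drop that mask bit
    }
    return result;
}

// Smears the low bit of every nibble across the nibble. Since
// 0xF = 1 + 2 + 4 + 8, the product equals x + (x<<1) + (x<<2) + (x<<3),
// and no carries occur when only the low bit of each nibble is set.
std::uint64_t smear_mul(std::uint64_t x)
{
    return x * 0xF;
}

// Stretches a 16-bit mask m to 64 bits.
std::uint64_t stretch16_soft(std::uint64_t m)
{
    return smear_mul(pdep_soft(m, 0x1111111111111111ULL));
}
```

For `m = 0xCF00` the deposit step yields `0x1100111100000000` and the multiplication turns it into `0xFF00FFFF00000000`.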

Finally, you could consider a SIMD approach: solutions such as samgak's above should map naturally to SIMD.

BeeOnRope