Fast bit concatenation reversing one of arguments

Question

I need to merge two variables into one like in example below:

x = 0b110101
y = 0b10011

merged(x,y) -> 0b11010111001
merged(y,x) -> 0b10011101011

So, y is reversed and concatenated to x without unnecessary zeros (last bit of result is always 1)

In other words: merged(abcdefg, hijklm) results in abcdefgmlkjih where a and h are 1

What would be the most efficient way to do this in C

EDIT: most efficient = fastest, sizeof(x) = sizeof(y) = 1, CHAR_BIT= 8

EDIT2: Since question has been put on hold I will post summary right here: Further limitations and more detail:

Target language: C with 64-bit operations support

For this exact problem fastest method would be a 256 element lookup table as was suggested in comments, which returns index of high bit N and reversed value of second argument, so we can shift first argument to the left by N and perform bitwise or with reversed value of second argument.

In case someone needs good performance for arguments larger than 8 bit (lookup table is not an option) they could:

Find index of high bit using de Bruijn algorithm. Example for 32 bits:

uint32_t msbDeBruijn32( uint32_t v )
{
    static const int MultiplyDeBruijnBitPosition[32] =
    {
        0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
    };

    v |= v >> 1; // first round down to one less than a power of 2
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;

    return MultiplyDeBruijnBitPosition[( uint32_t )( v * 0x07C4ACDDU ) >> 27];
}

taken from this question: Find most significant bit (left-most) that is set in a bit array

Reverse bits in bytes (one by one) of the second argument using this bit twiddling hack

unsigned char b; // reverse this (8-bit) byte    
b = (b * 0x0202020202ULL & 0x010884422010ULL) % 1023;

taken from here: http://graphics.stanford.edu/~seander/bithacks.html#BitReverseObvious

So, this question now can be closed/deleted

Do you have to do this with ints, or can you use bit vectors? — Barmar, Sep 25 '18 at 20:23
Depends on the trade-offs you're willing to make, i.e. speed vs. portability and speed vs. code complexity. — user3386109, Sep 25 '18 at 20:23
The trickiest part in portable code is finding the highest 1 bit in `y`. — Barmar, Sep 25 '18 at 20:24
Most efficient? As in uses less time, less memory, less cpu, less instructions, ... less *everything* I guess? I suggest lookup array :) — pmg, Sep 25 '18 at 20:24
See https://stackoverflow.com/questions/53161/find-the-highest-order-bit-in-c for finding the highest bit — Barmar, Sep 25 '18 at 20:25
What is the application for something like this? I have a hard time imagining it outside an academic exercise or code golf. — Barmar, Sep 25 '18 at 20:26
Standard C does not provide operators for locating the highest bit set without some sort of iteration. So there is no standard code for this, and the fastest way to do it will depend heavily on the C implementation used, including its extensions for locating bits set and the hardware it runs on. So there is no single answer. Additionally, speed and efficiency depend on whether this code is run occasionally or on millions of instances of data. Situations I know of where bit fields must be reversed involve fixed-width fields. Value-dependent widths are unusual. — Eric Postpischil, Sep 25 '18 at 20:30
@Barmar that's a board game where pieces of same color are located from the left of current move and from the right of current move. I need to determine a 'figure' — rmflow, Sep 25 '18 at 20:31
@EricPostpischil According to the edit: `sizeof(x) = sizeof(y) = 1`. And yes, I'm deliberately ignoring implementations where `CHAR_BIT` is not 8. — user3386109, Sep 25 '18 at 20:35
@user3386109 Raise your hand if you're old enough to remember when a 64K lookup table was *big*. ✋ — Barmar, Sep 25 '18 at 20:37
Please don't ask how to do this in C and C++ both. Cross-tagging between the two languages is discouraged, as the answers might be quite different (C++ got bitset etc). Instead, pick one language. — Lundin, Sep 25 '18 at 20:43
@Lundin He said **or**. Maybe he's willing to choose the language depending on which can do this best. — Barmar, Sep 25 '18 at 20:45
To be clear, the lookup table I have in mind has two pieces of information, the number of bits to shift the first argument, and the reversed second argument. So the table size is 512 bytes. — user3386109, Sep 25 '18 at 20:46
@Barmar Then there's no need to tag it C++. Anyway, this isn't a pizza store; we don't order a solution in our favourite flavour, we ask about a specific issue in the language we are currently programming in. — Lundin, Sep 25 '18 at 20:48
Intrinsic count leading zeros is often available in modern compilers. — Michael Dorgan, Sep 25 '18 at 20:48
`uint16_t combined = (uint16_t)x << bitsToShift[y] | reversedBits[y];` Computing the tables `uint8_t bitsToShift[256]` and `uint8_t reversedBits[256]` is left as an exercise for the OP. — user3386109, Sep 25 '18 at 22:16

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

Solution

This should do the trick in C++ and C:

unsigned int merged(unsigned char a, unsigned char b) {
    unsigned int r = a; 
    for (; b; b>>=1) 
        r = (r<<1)|(b&1);
    return r;
}

I don't think that there's a more efficient way to do it. Here an online demo.

How does it work ?

This works by building the result by starting with the first argument:

 r is 0b110101;

Then it left-shifts r, which is, using your terminology, like concatenating a 0 at the lowest end of r:

 0b110101 << 1 is  0b1101010;

and simultaneously we get the lowest bit of b with a bitwise and, then set the lowest bit of r to the same value using a bitwise or:

 0b10011 & 1 is 0b00001
 so (0b110101 << 1)|(0b10011 & 1) is 0b1101011;

We then right-shift b to process the next lowest bit in the same way:

 0b10011 >> 1 is 0b1001;

as long as there is some bit set left in b.

Important remarks

You have to be careful about the type used in order to avoid the risk of overflow. So for 2 unsigned char as input you could use unsigned int as output.

Note that using signed integers would be risky, if the left-shift might cause the sign bit to change. This is why unsigned is important here.

would be nice using x, y instead of a, b as asked; you could also remove declareing r, using the a itself; nevertheless compiler would optimize this away. — Tom Kuschel, Sep 25 '18 at 20:58
@TomKuschel I didn't use a itself, because my idea was to ultimately promote the use of a larger return type than the input parameter, to avoid by design risks of overflows (see section("Important remark"). Compiler will anyway do the optimisation. I didn't use x and y because in OP's example they are variables used as arguments and not the function parameters (OP used merged(x,y) and merged(y,x), so using a,b is neutral). — Christophe, Sep 25 '18 at 21:37

Fast bit concatenation reversing one of arguments

1 Answers1

Solution

How does it work ?

Important remarks