In C/C++ what's the simplest way to reverse the order of bits in a byte?

Question

While there are multiple ways to reverse bit order in a byte, I'm curious as to what is the "simplest" for a developer to implement. And by reversing I mean:

1110 -> 0111
0010 -> 0100

This is similar to, but not a duplicate of this PHP question.

This is similar to, but not a duplicate of this C question. This question is asking for the easiest method to implement by a developer. The "Best Algorithm" is concerned with memory and cpu performance.

Use inline assembly. Better, put the function into a separate translation unit. Have one assembly language module for each target platform. Let build process choose the modules. — Thomas Matthews, Apr 08 '10 at 19:47
Related: https://codegolf.stackexchange.com/questions/36213/reverse-bit-order-of-32-bit-integers — M.M, Jun 16 '20 at 10:10
Fun fact: ARM / AArch64 have an instruction for this, `rbit`. I don't know of a standard way to use it, though, not even a GCC or clang intrinsic. See [How can I elegantly take advantage of ARM instructions like REV and RBIT when writing C code?](https://stackoverflow.com/q/75056099) in case future answers have a better way. — Peter Cordes, Jan 10 '23 at 08:24

score 292 · Answer 1 · edited Oct 20 '20 at 04:10

292

This should work:

unsigned char reverse(unsigned char b) {
   b = (b & 0xF0) >> 4 | (b & 0x0F) << 4;
   b = (b & 0xCC) >> 2 | (b & 0x33) << 2;
   b = (b & 0xAA) >> 1 | (b & 0x55) << 1;
   return b;
}

First the left four bits are swapped with the right four bits. Then all adjacent pairs are swapped and then all adjacent single bits. This results in a reversed order.

edited Oct 20 '20 at 04:10

Ry-

218,210
55
464
476

answered Apr 08 '10 at 19:40

sth

222,467
53
283
367

43

Reasonably short and quick, but not simple. – Mark Ransom Apr 08 '10 at 19:58
4

This approach also cleanly generalizes to perform byte swapping for endianness. – Boojum Apr 08 '10 at 20:25
3

Not the simplest approach, but I like it +1. – nathan Apr 09 '10 at 14:10
18

Yes, it is simple. It's a kind of divide and conquer algorithm. Excellent! – kiewic May 11 '10 at 04:25
Is it faster than the method suggested by @Arkku below? – qed Nov 01 '13 at 16:30
This solution is excellent! I come up with swapping, but not so talent. – justin.yqyang Jun 16 '16 at 07:49
1

Clever solution, but you should point out that this approach works only on targets where `CHAR_BIT` is an exact power of 2. – Toby Speight May 31 '17 at 16:26
I wonder if a compiler would be able to recognize what that does and replace it with single assembler instruction if such is available. – Piotr Siupa Mar 12 '19 at 19:10
This beautifully scales to word sizes of any power of two bits (i.e. 16-bit words, 32-bit words, 64-bit words, etc.), with only one additional b= line for each higher power of 2. – Kevin H. Patterson Sep 14 '21 at 00:48
@KevinH.Patterson ...and also need to change the size of the hex literals (masks) according to the target word size. – Alex Che Nov 26 '21 at 10:06
Or you could simply have a table of answers and you result would be a simple table lookup. – Jesper Juhl May 30 '23 at 01:32

score 154 · Answer 2 · edited Mar 21 '15 at 18:35

154

I think a lookup table has to be one of the simplest methods. However, you don't need a full lookup table.

//Index 1==0b0001 => 0b1000
//Index 7==0b0111 => 0b1110
//etc
static unsigned char lookup[16] = {
0x0, 0x8, 0x4, 0xc, 0x2, 0xa, 0x6, 0xe,
0x1, 0x9, 0x5, 0xd, 0x3, 0xb, 0x7, 0xf, };

uint8_t reverse(uint8_t n) {
   // Reverse the top and bottom nibble then swap them.
   return (lookup[n&0b1111] << 4) | lookup[n>>4];
}

// Detailed breakdown of the math
//  + lookup reverse of bottom nibble
//  |       + grab bottom nibble
//  |       |        + move bottom result into top nibble
//  |       |        |     + combine the bottom and top results 
//  |       |        |     | + lookup reverse of top nibble
//  |       |        |     | |       + grab top nibble
//  V       V        V     V V       V
// (lookup[n&0b1111] << 4) | lookup[n>>4]

This fairly simple to code and verify visually.
Ultimately this might even be faster than a full table. The bit arith is cheap and the table easily fits on a cache line.

edited Mar 21 '15 at 18:35

Gordon Bailey

3,881
20
28

answered Apr 08 '10 at 20:33

deft_code

57,255
29
141
224

3

Nice, but will give you a cache miss. – Johan Kotlinski Nov 03 '10 at 17:42
8

@kotlinski: what will cause a cache miss? I think the small table version may be more cache efficient than the large one. On my Core2 a cache line is 64 bytes wide, the full table would span multiple lines, whereas the smaller table easily fits one a single line. – deft_code Oct 01 '11 at 14:20
@deft_code: Reading from the table would likely cause a cache miss, as code and global data are not likely to be that close to each other in memory. – Johan Kotlinski Oct 01 '11 at 15:42
3

@kotlinski, I suppose that could be a problem, the table is small, but if it's called infrequently it'll have to cache load the table every time. An alternative implementation would be to use a switch to encode the table instead an array. Either way I don't think it will measurably effect performance. – deft_code Oct 01 '11 at 18:31
4

@kotlinski: _Temporal_ locality is more important for cache hits or replacement strategies, than _address_ locality – cfi Sep 25 '13 at 16:39
What is the reasoning behind this strategy? Say I heard this question for the first time, how could I ever arrive at this curtailed version of lookup table? – Harshdeep Jan 02 '14 at 19:43
7

@Harshdeep: Consider the binary encoded indexes of the table entries. index b0000(0) -> b0000(0x0) boring; `b0001(1) -> b1000(0x8)`, `b0010(2) -> b0100(0x4)`, `b1010(10) -> b0101(0x5)`. See the pattern? It is simple enough that you can calculate it in your head (if you can read binary, otherwise you'll need paper to work it out). As for the leap that reversing an 8 bit integer is the same as reversing 4 bit parts then swapping them; I claim experience and intuition (or magic). – deft_code Jan 02 '14 at 20:48
@deft_code and your magic works :) thanks for the elaborate explanation. – Harshdeep Jan 06 '14 at 11:58
1

this is more suited to embedded applications since there's no cache and memory is limited – phuclv May 07 '14 at 12:47
2

Agreed @LưuVĩnhPhúc, this may be the solution with the **smallest footprint**. – Bob Stein Jun 18 '14 at 00:55
3

The lookup table could use a `const` ;) – psusi Jul 18 '17 at 17:05
How do you (1) generate this table (like [this](https://github.com/mikolalysenko/bit-twiddle/blob/0c4c32db753108c5ec7e3779d35eb103be8e7ee1/twiddle.js#L115-L135)), and (2) how do you apply this same table-based approach to a [16-bit unsigned integer?](https://stackoverflow.com/questions/71934050/how-to-optimally-reverse-bits-using-the-table-approach-for-8-bit-16-bit-and-32) – Lance Apr 20 '22 at 04:40

score 106 · Accepted Answer · answered Apr 08 '10 at 19:34

106

If you are talking about a single byte, a table-lookup is probably the best bet, unless for some reason you don't have 256 bytes available.

answered Apr 08 '10 at 19:34

e.James

116,942
41
177
214

15

If we're talking about something that's simple to implement without copying a ready-made solution, creating the lookup table does still require another solution. (Of course one might do it by hand, but that's error-prone and time-consuming…) – Arkku Apr 08 '10 at 19:52
7

You can squeeze the array into somewhat fewer than 256 bytes if you ignore palindromes. – wilhelmtell Apr 08 '10 at 19:56
1

@Arkku then write a script :p it's less time consuming than writing a solution in C++ which solves the problem at runtime. – wilhelmtell Apr 08 '10 at 20:00
8

@wilhelmtell - you'd need a table to know which ones are the palindromes. – Mark Ransom Apr 08 '10 at 20:01
6

@wilhelmtell: Well, to write the script one *still* needs another solution, which was my point – a lookup table is simple to use but not simple to create. (Except by copying a ready-made lookup table, but then one might just as well copy any solution.) For example, if the “simplest” solution is considered one that could be written on paper in an exam or interview, I would not start making a lookup table by hand and making the program to do it would already include a different solution (which would be simpler alone than the one including both it and the table). – Arkku Apr 08 '10 at 20:18
1

Fast *and* easy to implement. – Stephen Canon Apr 08 '10 at 21:18
4

@Arkku what I meant is write a script which outputs the table of the first 256 bytes and their reverse mapping. Yes, you're back to writing the reverse function, but now in your favourite scripting language, and it can be as nasty as you want -- you're going to throw it away as soon as it's done and you ran it once. Have the script's output as C code, even: `unsigned int rtable[] = {0x800, 0x4000, ...};`. Then throw away the script and forget you ever had it. It's much faster to write than the equivalent C++ code, and it will only ever run once, so you get O(1) runtime in your C++ code. – wilhelmtell Apr 08 '10 at 21:32
1

I would agree with wilhelmtell, here. You can even create the table in Excel. I work with embedded software, so I use Excel to create such tables all the time. Worst case, you leave a really ugly script running overnight, and then copy/paste your freshly minted table in the morning. – e.James Apr 08 '10 at 22:06
1

@e.James: Then I would argue that the solution is in Excel or the scripting language and not in C or C++ as per the question. And if the table was generated in C, then the solution used to generate it would obviously be simpler than the solution that includes both the generator and the look-up. That's all I'm saying… – Arkku Apr 08 '10 at 22:10
@Arkku: I think I see where you are coming from. In fact, without knowing exactly what the OP meant when he asked for the "simplest implementation", I fear we could debate this point for days. I stand by my own answer, but I have upvoted yours (and several of the others) since they all provide working solutions. I'll leave it up to the OP to decide which is simplest by way of the green checkmark :) – e.James Apr 08 '10 at 23:52
I like it. Simple to implement. Thanks. – nathan Apr 09 '10 at 14:09
You can halve the size of the lookup table with a single if and a couple of subtractions, e.g. if(n < 128) { return lookup[n]; } else { return 255 - lookup[255-n]; } // assumes unsigned bytes – Grant Peters Apr 09 '10 at 17:24
@Grant Peters: You can cut it even smaller (down to 16 bytes) with Caspin's answer. – e.James Apr 09 '10 at 17:58
@e.James, didn't see that and his would probably run faster as well – Grant Peters Apr 11 '10 at 07:51
@e.James combine it with `htonl` and works for bigger numbers too. – Ajay Brahmakshatriya Aug 30 '17 at 19:48

score 50 · Answer 4 · answered Apr 09 '10 at 09:27

Since nobody posted a complete table lookup solution, here is mine:

unsigned char reverse_byte(unsigned char x)
{
    static const unsigned char table[] = {
        0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0,
        0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0,
        0x08, 0x88, 0x48, 0xc8, 0x28, 0xa8, 0x68, 0xe8,
        0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8, 0x78, 0xf8,
        0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4,
        0x14, 0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4,
        0x0c, 0x8c, 0x4c, 0xcc, 0x2c, 0xac, 0x6c, 0xec,
        0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c, 0xfc,
        0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2,
        0x12, 0x92, 0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2,
        0x0a, 0x8a, 0x4a, 0xca, 0x2a, 0xaa, 0x6a, 0xea,
        0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa,
        0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6,
        0x16, 0x96, 0x56, 0xd6, 0x36, 0xb6, 0x76, 0xf6,
        0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae, 0x6e, 0xee,
        0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe,
        0x01, 0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1,
        0x11, 0x91, 0x51, 0xd1, 0x31, 0xb1, 0x71, 0xf1,
        0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69, 0xe9,
        0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9,
        0x05, 0x85, 0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5,
        0x15, 0x95, 0x55, 0xd5, 0x35, 0xb5, 0x75, 0xf5,
        0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed,
        0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd,
        0x03, 0x83, 0x43, 0xc3, 0x23, 0xa3, 0x63, 0xe3,
        0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3, 0x73, 0xf3,
        0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb,
        0x1b, 0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb,
        0x07, 0x87, 0x47, 0xc7, 0x27, 0xa7, 0x67, 0xe7,
        0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77, 0xf7,
        0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef,
        0x1f, 0x9f, 0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff,
    };
    return table[x];
}

Useful, thanks. Seems that my slower shifting method was limiting performance in an embedded app. Placed table in ROM on a PIC (with addition of rom keyword). — flend, Apr 23 '12 at 10:02
A simpler method: http://graphics.stanford.edu/~seander/bithacks.html#BitReverseTable — sleepsort, Jun 14 '14 at 01:30
How did you get this table? What is the algorithm to generate this table? — Lance, Apr 20 '22 at 04:42
@Lance Just count from 0 to 255 and reverse each byte by any known method. For example, isolate each bit, move it into its destination, and then combine the moved bits back together. That's a classic bitwise-and, bit-shift, bitwise-or application. — fredoverflow, Apr 24 '22 at 09:29

Arkku · Answer 5 · 2014-02-27T01:33:38.683

47

See the bit twiddling hacks for many solutions. Copypasting from there is obviously simple to implement. =)

For example (on a 32-bit CPU):

uint8_t b = byte_to_reverse;
b = ((b * 0x0802LU & 0x22110LU) | (b * 0x8020LU & 0x88440LU)) * 0x10101LU >> 16;

If by “simple to implement” one means something that can be done without a reference in an exam or job interview, then the safest bet is probably the inefficient copying of bits one by one into another variable in reverse order (already shown in other answers).

edited Feb 27 '14 at 01:33

answered Apr 08 '10 at 19:37

Arkku

41,011
10
62
84

1

From your URL: 32 bit CPU: b = ((b * 0x0802LU & 0x22110LU) | (b * 0x8020LU & 0x88440LU)) * 0x10101LU >> 16; – Joshua Apr 08 '10 at 20:13
1

@Joshua: That's my personal favourite as well. The caveat (as stated on the linked page) is that it needs to be assigned or cast into an uint8_t or there will be garbage in the upper bits. – Arkku Apr 08 '10 at 20:20
The x86 instructions `tzcnt` and `bzhi` can be used to first compute `floor(log_2(x + 2))` and zero the bits above the MSB computed from `tzcnt` respectively, so a cast isn't too expensive in the one-off case. On AMD Ryzen Family 17h, tzcnt and bzhi are both one cycle. They are supported on both AMD and Intel platforms that have the x86 BMI extensions released in 2013 CPUs. – AMDG Jul 17 '21 at 01:27

score 28 · Answer 6 · edited May 31 '17 at 16:27

28

template <typename T>
T reverse(T n, size_t b = sizeof(T) * CHAR_BIT)
{
    assert(b <= std::numeric_limits<T>::digits);

    T rv = 0;

    for (size_t i = 0; i < b; ++i, n >>= 1) {
        rv = (rv << 1) | (n & 0x01);
    }

    return rv;
}

EDIT:

Converted it to a template with the optional bitcount

edited May 31 '17 at 16:27

Toby Speight

27,591
48
66
103

answered Apr 08 '10 at 19:38

andand

17,134
11
53
79

@nvl - fixed. I started building it as a template but decided halfway through to not do so... too many &gt &lt – andand Apr 08 '10 at 19:42
For extra pedenatry, replace `sizeof(T)*8` with `sizeof(T)*CHAR_BITS`. – Pillsy Apr 08 '10 at 20:26
@Pillsy - Sure, why not... it's not like anybody could actually be too pedantic, now could they? – andand Apr 09 '10 at 01:01
6

@andand For extra extra pendantry, replace `sizeof(T)*CHAR_BIT` by `std::numeric_limits::digits` (almost 4 years of pedantry later). – Morwenn Feb 25 '14 at 22:09
1

It should be `CHAR_BIT`, not `CHAR_BITS`. – Xunie Nov 08 '16 at 20:36
1

it should be rv = (rv << 1) | (n & 0x01); – Vignesh Mar 13 '17 at 23:20

Antonin GAVREL · Answer 7 · 2021-02-23T23:54:34.680

There are many ways to reverse bits depending on what you mean the "simplest way".

Reverse by Rotation

Probably the most logical, consists in rotating the byte while applying a mask on the first bit (n & 1):

unsigned char reverse_bits(unsigned char b)
{
    unsigned char   r = 0;
    unsigned        byte_len = 8;

    while (byte_len--) {
        r = (r << 1) | (b & 1);
        b >>= 1;
    }
    return r;
}

As the length of an unsigner char is 1 byte, which is equal to 8 bits, it means we will scan each bit while (byte_len--)
We first check if b as a bit on the extreme right with (b & 1); if so we set bit 1 on r with | and move it just 1 bit to the left by multiplying r by 2 with (r << 1)
Then we divide our unsigned char b by 2 with b >>=1 to erase the bit located at the extreme right of variable b. As a reminder, b >>= 1; is equivalent to b /= 2;

Reverse in One Line

This solution is attributed to Rich Schroeppel in the Programming Hacks section

unsigned char reverse_bits3(unsigned char b)
{
    return (b * 0x0202020202ULL & 0x010884422010ULL) % 0x3ff;
}

The multiply operation (b * 0x0202020202ULL) creates five separate copies of the 8-bit byte pattern to fan-out into a 64-bit value.
The AND operation (& 0x010884422010ULL) selects the bits that are in the correct (reversed) positions, relative to each 10-bit groups of bits.
Together the multiply and the AND operations copy the bits from the original byte so they each appear in only one of the 10-bit sets. The reversed positions of the bits from the original byte coincide with their relative positions within any 10-bit set.
The last step (% 0x3ff), which involves modulus division by 2^10 - 1 has the effect of merging together each set of 10 bits (from positions 0-9, 10-19, 20-29, ...) in the 64-bit value. They do not overlap, so the addition steps underlying the modulus division behave like OR operations.

Divide and Conquer Solution

unsigned char reverse(unsigned char b) {
   b = (b & 0xF0) >> 4 | (b & 0x0F) << 4;
   b = (b & 0xCC) >> 2 | (b & 0x33) << 2;
   b = (b & 0xAA) >> 1 | (b & 0x55) << 1;
   return b;
}

This is the most upvoted answer and despite some explanations, I think that for most people it feels difficult to visualize whats 0xF0, 0xCC, 0xAA, 0x0F, 0x33 and 0x55 truly means.

It does not take advantage of '0b' which is a GCC extension and is included since the C++14 standard, release in December 2014, so a while after this answer dating from April 2010

Integer constants can be written as binary constants, consisting of a sequence of ‘0’ and ‘1’ digits, prefixed by ‘0b’ or ‘0B’. This is particularly useful in environments that operate a lot on the bit level (like microcontrollers).

Please check below code snippets to remember and understand even better this solution where we move half by half:

unsigned char reverse(unsigned char b) {
   b = (b & 0b11110000) >> 4 | (b & 0b00001111) << 4;
   b = (b & 0b11001100) >> 2 | (b & 0b00110011) << 2;
   b = (b & 0b10101010) >> 1 | (b & 0b01010101) << 1;
   return b;
}

NB: The >> 4 is because there are 8 bits in 1 byte, which is an unsigned char so we want to take the other half, and so on.

We could easily apply this solution to 4 bytes with only two additional lines and following the same logic. Since both mask complement each other we can even use ~ in order to switch bits and saving some ink:

uint32_t reverse_integer_bits(uint32_t b) {
   uint32_t mask = 0b11111111111111110000000000000000;
   b = (b & mask) >> 16 | (b & ~mask) << 16;
   mask = 0b11111111000000001111111100000000;
   b = (b & mask) >> 8 | (b & ~mask) << 8;
   mask = 0b11110000111100001111000011110000;
   b = (b & mask) >> 4 | (b & ~mask) << 4;
   mask = 0b11001100110011001100110011001100;
   b = (b & mask) >> 2 | (b & ~mask) << 2;
   mask = 0b10101010101010101010101010101010;
   b = (b & mask) >> 1 | (b & ~mask) << 1;
   return b;
}

[C++ Only] Reverse Any Unsigned (Template)

The above logic can be summarized with a loop that would work on any type of unsigned:

template <class T>
T reverse_bits(T n) {
    short bits = sizeof(n) * 8; 
    T mask = ~T(0); // equivalent to uint32_t mask = 0b11111111111111111111111111111111;
    
    while (bits >>= 1) {
        mask ^= mask << (bits); // will convert mask to 0b00000000000000001111111111111111;
        n = (n & ~mask) >> bits | (n & mask) << bits; // divide and conquer
    }

    return n;
}

C++ 17 only

You may use a table that store the reverse value of each byte with (i * 0x0202020202ULL & 0x010884422010ULL) % 0x3ff, initialized through a lambda (you will need to compile it with g++ -std=c++1z since it only works since C++17), and then return the value in the table will give you the accordingly reversed bit:

#include <cstdint>
#include <array>

uint8_t reverse_bits(uint8_t n) {
        static constexpr array<uint8_t, 256> table{[]() constexpr{
                constexpr size_t SIZE = 256;
                array<uint8_t, SIZE> result{};

                for (size_t i = 0; i < SIZE; ++i)
                    result[i] = (i * 0x0202020202ULL & 0x010884422010ULL) % 0x3ff;
                return result;
        }()};

    return table[n];
}

main.cpp

Try it yourself with inclusion of above function:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

template <class T>
void print_binary(T n)
{   T mask = 1ULL << ((sizeof(n) * 8) - 1);  // will set the most significant bit
    for(; mask != 0; mask >>= 1) putchar('0' | !!(n & mask));
    putchar('\n');
}

int main() {
    uint32_t n = 12;
    print_binary(n);
    n = reverse_bits(n); 
    print_binary(n);
    unsigned char c = 'a';
    print_binary(c);
    c = reverse_bits(c);
    print_binary(c);
    uint16_t s = 12;
    print_binary(s);
    s = reverse_bits(s);
    print_binary(s);
    uint64_t l = 12;
    print_binary(l);
    l = reverse_bits(l);
    print_binary(l);
    return 0;
}

Reverse with asm volatile

Last but not least, if simplest means fewer lines, why not give a try to inline assembly?

You can test below code snippet by adding -masm=intel when compiling:

unsigned char reverse_bits(unsigned char c) {
    __asm__ __volatile__ (R"(
        mov cx, 8       
    daloop:                   
        ror di          
        adc ax, ax      
        dec cx          
        jnz short daloop  
    ;)");
}

Explanations line by line:

        mov cx, 8       ; we will reverse the 8 bits contained in one byte
    daloop:             ; while loop
        shr di          ; Shift Register `di` (containing value of the first argument of callee function) to the Right
        rcl ax          ; Rotate Carry Left: rotate ax left and add the carry from shr di, the carry is equal to 1 if one bit was "lost" from previous operation 
        dec cl          ; Decrement cx
        jnz short daloop; Jump if cx register is Not equal to Zero, else end loop and return value contained in ax register

Section [C++ Only] Reverse Any Unsigned (Template): Note that if the type template parameter T is signed, mask will represent a negative number and applying << on a negative signed integer yields undefined-behavior. Even if T is unsigned the result of the first shift operation on mask is implementation-defined — clickMe, Aug 25 '20 at 10:02

Daniel · Answer 8 · 2013-08-07T07:13:18.377

17

Two lines:

for(i=0;i<8;i++)
     reversed |= ((original>>i) & 0b1)<<(7-i);

or in case you have issues with the "0b1" part:

for(i=0;i<8;i++)
     reversed |= ((original>>i) & 1)<<(7-i);

"original" is the byte you want to reverse. "reversed" is the result, initialized to 0.

edited Aug 07 '13 at 07:13

answered Jan 28 '13 at 15:40

Daniel

763
1
8
25

Thomas Matthews · Answer 9 · 2010-04-09T16:59:05.987

15

Although probably not portable, I would use assembly language.
Many assembly languages have instructions to rotate a bit into the carry flag and to rotate the carry flag into the word (or byte).

The algorithm is:

for each bit in the data type:
  rotate bit into carry flag
  rotate carry flag into destination.
end-for

The high level language code for this is much more complicated, because C and C++ do not support rotating to carry and rotating from carry. The carry flag has to modeled.

Edit: Assembly language for example

;  Enter with value to reverse in R0.
;  Assume 8 bits per byte and byte is the native processor type.
   LODI, R2  8       ; Set up the bit counter
Loop:
   RRC, R0           ; Rotate R0 right into the carry bit.
   RLC, R1           ; Rotate R1 left, then append carry bit.
   DJNZ, R2  Loop    ; Decrement R2 and jump if non-zero to "loop"
   LODR, R0  R1      ; Move result into R0.

edited Apr 09 '10 at 16:59

answered Apr 08 '10 at 19:46

Thomas Matthews

56,849
17
98
154

7

I think this answer is the opposite of simple. Non-portable, assembly, and complex enough to be written in pseudo-code instead of the actual assembly. – deft_code Apr 08 '10 at 20:59
3

It is quite simple. I put it into pseudo-code because assembly mnemonics are specific to a breed of processor and there are a lot of breeds out there. If you would like, I can edit this to show the simple assembly language. – Thomas Matthews Apr 09 '10 at 16:53
One could see if a compiler optimization simplifies into a suitable assembly instruction. – Sparky Aug 07 '19 at 23:30

score 13 · Answer 10 · edited Jan 26 '18 at 06:34

13

I find the following solution simpler than the other bit fiddling algorithms I've seen in here.

unsigned char reverse_byte(char a)
{

  return ((a & 0x1)  << 7) | ((a & 0x2)  << 5) |
         ((a & 0x4)  << 3) | ((a & 0x8)  << 1) |
         ((a & 0x10) >> 1) | ((a & 0x20) >> 3) |
         ((a & 0x40) >> 5) | ((a & 0x80) >> 7);
}

It gets every bit in the byte, and shifts it accordingly, starting from the first to the last.

Explanation:

   ((a & 0x1) << 7) //get first bit on the right and shift it into the first left position 
 | ((a & 0x2) << 5) //add it to the second bit and shift it into the second left position
  //and so on

edited Jan 26 '18 at 06:34

sallaben

19
5

answered May 02 '15 at 12:57

dau_sama

4,247
2
23
30

Beautiful! My favorite so far. – Nick Rameau Nov 22 '16 at 08:15
This is certainly simple, but it should be pointed out that the execution time is O(n) rather than O(log₂ n), where n is the number of bits (8, 16, 32, 64, etc.). – Todd Lehman Apr 10 '20 at 21:31

sth · Answer 11 · 2018-06-22T18:42:41.310

10

The simplest way is probably to iterate over the bit positions in a loop:

unsigned char reverse(unsigned char c) {
   int shift;
   unsigned char result = 0;
   for (shift = 0; shift < CHAR_BIT; shift++) {
      if (c & (0x01 << shift))
         result |= (0x80 >> shift);
   }
   return result;
}

edited Jun 22 '18 at 18:42

answered Apr 09 '10 at 01:13

sth

222,467
53
283
367

it's CHAR_BIT, without an 's' – ljrk Jun 22 '18 at 12:25
Why use `CHAR_BIT` when you assume `char` to have 8 bits? – chqrlie May 04 '19 at 19:48

Bob Stein · Answer 12 · 2019-08-14T14:22:48.923

For the very limited case of constant, 8-bit input, this method costs no memory or CPU at run-time:

#define MSB2LSB(b) (((b)&1?128:0)|((b)&2?64:0)|((b)&4?32:0)|((b)&8?16:0)|((b)&16?8:0)|((b)&32?4:0)|((b)&64?2:0)|((b)&128?1:0))

I used this for ARINC-429 where the bit order (endianness) of the label is opposite the rest of the word. The label is often a constant, and conventionally in octal.

Here's how I used it to define a constant, because the spec defines this label as big-endian 205 octal.

#define LABEL_HF_COMM MSB2LSB(0205)

More examples:

assert(0b00000000 == MSB2LSB(0b00000000));
assert(0b10000000 == MSB2LSB(0b00000001));
assert(0b11000000 == MSB2LSB(0b00000011));
assert(0b11100000 == MSB2LSB(0b00000111));
assert(0b11110000 == MSB2LSB(0b00001111));
assert(0b11111000 == MSB2LSB(0b00011111));
assert(0b11111100 == MSB2LSB(0b00111111));
assert(0b11111110 == MSB2LSB(0b01111111));
assert(0b11111111 == MSB2LSB(0b11111111));
assert(0b10101010 == MSB2LSB(0b01010101));

baol · Answer 13 · 2010-04-08T20:29:16.123

6

You may be interested in std::vector<bool> (that is bit-packed) and std::bitset

It should be the simplest as requested.

#include <iostream>
#include <bitset>
using namespace std;
int main() {
  bitset<8> bs = 5;
  bitset<8> rev;
  for(int ii=0; ii!= bs.size(); ++ii)
    rev[bs.size()-ii-1] = bs[ii];
  cerr << bs << " " << rev << endl;
}

Other options may be faster.

EDIT: I owe you a solution using std::vector<bool>

#include <algorithm>
#include <iterator>
#include <iostream>
#include <vector>
using namespace std;
int main() {
  vector<bool> b{0,0,0,0,0,1,0,1};
  reverse(b.begin(), b.end());
  copy(b.begin(), b.end(), ostream_iterator<int>(cerr));
  cerr << endl;
}

The second example requires c++0x extension (to initialize the array with {...}). The advantage of using a bitset or a std::vector<bool> (or a boost::dynamic_bitset) is that you are not limited to bytes or words but can reverse an arbitrary number of bits.

HTH

edited Apr 08 '10 at 20:29

answered Apr 08 '10 at 19:37

baol

4,362
34
44

How is bitset any simpler than a pod here? Show the code, or it isn't. – wilhelmtell Apr 08 '10 at 19:53
Actu ally, I think that code will reverse the bitset, and then reverse it back to its original. Change **ii != size();** to **ii < size()/2;** and it'll do a better job =) – Viktor Sehr Apr 08 '10 at 19:58
(@viktor-sehr no, it will not, rev is different from bs). Anyway I don't like the answer myself: I think this is a case where binary arithmetic and shift operators are better suited. It still remains the simplest to understand. – baol Apr 08 '10 at 20:07
How about `std::vector b = { ... }; std::vector rb ( b.rbegin(), b.rend()); ` - using reverse iterators directly? – MSalters Aug 30 '17 at 21:19
@MSalters I like the immutability of it. – baol Sep 01 '17 at 18:06

Jamboree · Answer 14 · 2014-06-17T13:17:28.070

3

Can this be fast solution?

int byte_to_be_reversed = 
    ((byte_to_be_reversed>>7)&0x01)|((byte_to_be_reversed>>5)&0x02)|      
    ((byte_to_be_reversed>>3)&0x04)|((byte_to_be_reversed>>1)&0x08)| 
    ((byte_to_be_reversed<<7)&0x80)|((byte_to_be_reversed<<5)&0x40)|
    ((byte_to_be_reversed<<3)&0x20)|((byte_to_be_reversed<<1)&0x10);

Gets rid of the hustle of using a for loop! but experts please tell me if this is efficient and faster?

edited Jun 17 '14 at 13:17

answered May 06 '14 at 17:18

Jamboree

31
2

This has execution time is O(n) rather than O(log₂ n), where n is the number of bits (8, 16, 32, 64, etc.). See elsewhere for answers that execute in O(log₂ n) time. – Todd Lehman Apr 10 '20 at 21:39

score 3 · Answer 15 · answered Apr 08 '10 at 19:39

Table lookup or

uint8_t rev_byte(uint8_t x) {
    uint8_t y;
    uint8_t m = 1;
    while (m) {
       y >>= 1;
       if (m&x) {
          y |= 0x80;
       }
       m <<=1;
    }
    return y;
}

edit

Look here for other solutions that might work better for you

score 3 · Answer 16 · answered Apr 09 '10 at 07:03

a slower but simpler implementation:

static int swap_bit(unsigned char unit)
{
    /*
     * swap bit[7] and bit[0]
     */
    unit = (((((unit & 0x80) >> 7) ^ (unit & 0x01)) << 7) | (unit & 0x7f));
    unit = (((((unit & 0x80) >> 7) ^ (unit & 0x01))) | (unit & 0xfe));
    unit = (((((unit & 0x80) >> 7) ^ (unit & 0x01)) << 7) | (unit & 0x7f));

    /*
     * swap bit[6] and bit[1]
     */
    unit = (((((unit & 0x40) >> 5) ^ (unit & 0x02)) << 5) | (unit & 0xbf));
    unit = (((((unit & 0x40) >> 5) ^ (unit & 0x02))) | (unit & 0xfd));
    unit = (((((unit & 0x40) >> 5) ^ (unit & 0x02)) << 5) | (unit & 0xbf));

    /*
     * swap bit[5] and bit[2]
     */
    unit = (((((unit & 0x20) >> 3) ^ (unit & 0x04)) << 3) | (unit & 0xdf));
    unit = (((((unit & 0x20) >> 3) ^ (unit & 0x04))) | (unit & 0xfb));
    unit = (((((unit & 0x20) >> 3) ^ (unit & 0x04)) << 3) | (unit & 0xdf));

    /*
     * swap bit[4] and bit[3]
     */
    unit = (((((unit & 0x10) >> 1) ^ (unit & 0x08)) << 1) | (unit & 0xef));
    unit = (((((unit & 0x10) >> 1) ^ (unit & 0x08))) | (unit & 0xf7));
    unit = (((((unit & 0x10) >> 1) ^ (unit & 0x08)) << 1) | (unit & 0xef));

    return unit;
}

score 2 · Answer 17 · answered Apr 08 '10 at 21:03

Before implementing any algorithmic solution, check the assembly language for whatever CPU architecture you are using. Your architecture may include instructions which handle bitwise manipulations like this (and what could be simpler than a single assembly instruction?).

If such an instruction is not available, then I would suggest going with the lookup table route. You can write a script/program to generate the table for you, and the lookup operations would be faster than any of the bit-reversing algorithms here (at the cost of having to store the lookup table somewhere).

Notably, modern ARM and AArch64 have `rbit`. (And yes, if supported, it's efficient, not microcoded). x86 doesn't, although you can use SSSE3 `pshufb` as a lookup table for nibbles, bit-reversing up to 16 bytes in parallel. I forget if RISC-V extension-b (bit-manipulation) has bit-reverse. — Peter Cordes, Nov 28 '21 at 12:00

luci88filter · Answer 18 · 2018-02-17T08:44:32.437

2

This simple function uses a mask to test each bit in the input byte and transfer it into a shifting output:

char Reverse_Bits(char input)
{    
    char output = 0;

    for (unsigned char mask = 1; mask > 0; mask <<= 1)
    {
        output <<= 1;

        if (input & mask)
            output |= 1;
    }

    return output;
}

edited Feb 17 '18 at 08:44

answered Feb 17 '18 at 05:47

luci88filter

103
8

Mask should be unsigned sorry. – luci88filter Feb 17 '18 at 08:45

score 2 · Answer 19 · answered Nov 01 '19 at 08:32

2

Assuming that your compiler allows unsigned long long:

unsigned char reverse(unsigned char b) {
  return (b * 0x0202020202ULL & 0x010884422010ULL) % 1023;
}

Discovered here

answered Nov 01 '19 at 08:32

mascIT

413
4
16

score 1 · Answer 20 · edited May 23 '17 at 12:02

This one is based on the one BobStein-VisiBone provided

#define reverse_1byte(b)    ( ((uint8_t)b & 0b00000001) ? 0b10000000 : 0 ) | \
                            ( ((uint8_t)b & 0b00000010) ? 0b01000000 : 0 ) | \
                            ( ((uint8_t)b & 0b00000100) ? 0b00100000 : 0 ) | \
                            ( ((uint8_t)b & 0b00001000) ? 0b00010000 : 0 ) | \
                            ( ((uint8_t)b & 0b00010000) ? 0b00001000 : 0 ) | \
                            ( ((uint8_t)b & 0b00100000) ? 0b00000100 : 0 ) | \
                            ( ((uint8_t)b & 0b01000000) ? 0b00000010 : 0 ) | \
                            ( ((uint8_t)b & 0b10000000) ? 0b00000001 : 0 )

I really like this one a lot because the compiler automatically handle the work for you, thus require no further resources.

this can also be extended to 16-Bits...

#define reverse_2byte(b)    ( ((uint16_t)b & 0b0000000000000001) ? 0b1000000000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000000000010) ? 0b0100000000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000000000100) ? 0b0010000000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000000001000) ? 0b0001000000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000000010000) ? 0b0000100000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000000100000) ? 0b0000010000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000001000000) ? 0b0000001000000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000010000000) ? 0b0000000100000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000000100000000) ? 0b0000000010000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000001000000000) ? 0b0000000001000000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000010000000000) ? 0b0000000000100000 : 0 ) | \
                            ( ((uint16_t)b & 0b0000100000000000) ? 0b0000000000010000 : 0 ) | \
                            ( ((uint16_t)b & 0b0001000000000000) ? 0b0000000000001000 : 0 ) | \
                            ( ((uint16_t)b & 0b0010000000000000) ? 0b0000000000000100 : 0 ) | \
                            ( ((uint16_t)b & 0b0100000000000000) ? 0b0000000000000010 : 0 ) | \
                            ( ((uint16_t)b & 0b1000000000000000) ? 0b0000000000000001 : 0 )

I would put the `b` in parentheses in case it is a more complex expression than a single number, and perhaps also rename the macro to `REVERSE_BYTE` as a hint that you probably don't want to have a more complex (runtime) expression there. Or make it an inline function. (But overall I like this as being simple enough that you could do it from memory easily with very little chance of error.) — Arkku, Aug 08 '18 at 14:55

score 1 · Answer 21 · answered May 04 '19 at 19:51

Here is a simple and readable solution, portable to all conformant platforms, including those with sizeof(char) == sizeof(int):

#include <limits.h>

unsigned char reverse(unsigned char c) {
    int shift;
    unsigned char result = 0;

    for (shift = 0; shift < CHAR_BIT; shift++) {
        result <<= 1;
        result |= c & 1;
        c >>= 1;
    }
    return result;
}

Josko Marsic · Answer 22 · 2020-06-17T13:09:23.673

If you using small microcontroller and need high speed solution with small footprint, this could be solutions. It is possible to use it for C project, but you need to add this file as assembler file *.asm, to your C project. Instructions: In C project add this declaration:

extern uint8_t byte_mirror(uint8_t);

Call this function from C

byteOutput= byte_mirror(byteInput);

This is the code, it is only suitable for 8051 core. In the CPU register r0 is data from byteInput. Code rotate right r0 cross carry and then rotate carry left to r1. Repeat this procedure 8 times, for every bit. Then the register r1 is returned to c function as byteOutput. In 8051 core is only posibble to rotate acumulator a.

NAME     BYTE_MIRROR
RSEG     RCODE
PUBLIC   byte_mirror              //8051 core        

byte_mirror
    mov r3,#8;
loop:   
    mov a,r0;
    rrc a;
    mov r0,a;    
    mov a,r1;
    rlc a;   
    mov r1,a;
    djnz r3,loop
    mov r0,a
    ret

PROS: It is small footprint, it is high speed CONS: It is not reusable code, it is only for 8051

011101101->carry

101101110<-carry

While this code may answer the question, it would be better to include some context, explaining how it works and when to use it. Code-only answers are not useful in the long run. — elaforma, Jun 16 '20 at 10:52

score 1 · Answer 23 · answered Apr 25 '21 at 08:32

1

It is simple and fast:

unsigned char reverse(unsigned char rv)
{
unsigned char tmp=0;
if( rv&0x01 ) tmp = 0x80;
if( rv&0x02 ) tmp |= 0x40;
if( rv&0x04 ) tmp |= 0x20;
if( rv&0x08 ) tmp |= 0x10;
if( rv&0x10 ) tmp |= 0x08;
if( rv&0x20 ) tmp |= 0x04;
if( rv&0x40 ) tmp |= 0x02;
if( rv&0x80 ) tmp |= 0x01;
return tmp;
}

answered Apr 25 '21 at 08:32

Mehrdad Rahaaei

21
1

This is slower and less elegant than the answer that received the most up-votes. – zkoza Apr 25 '21 at 09:51
Really unfortunate. Your information seems to be superficial as you commented on the code I wrote. If you had a little experience with code execution time and how code is assembled, you would know that my code is faster. The fastest code to use is the Look Up Table (LUT). But the problem is the increase in code size. The next step is to use assembly language (shift right/left w/cry) but it depends on the machine and it will be harder and more specific to write. On the other hand using the loop will increase the execution time of the code, but the size of the code in memory will be small. – Mehrdad Rahaaei Apr 26 '21 at 11:28
If you want to participate in the discussion, first go and do some research on the assembly and check to see how many CPU clock cycle takes to execute for these 2 programs (You could compile through assembly). I just became a member, it will be interesting to see how much information/experience other members of the StackOverflow site have. – Mehrdad Rahaaei Apr 26 '21 at 11:30
Welcome to SO. If you think it was me who down-voted your answer, you're wrong. The lookup table is fast only if it is already in the cache, preferably L1. The solution I mention is easily extensible to entities longer than a byte. The execution time depends on many factors, including the target architecture, the state of instruction pipeline and the cache. In terms of the number of assembly instructions, here we go: https://godbolt.org/z/sadrhjn9a See also https://www.aldeid.com/wiki/X86-assembly/Instructions/rol and benchmark: https://www.quick-bench.com/q/Hs10lka2Xj1Y9SyQkgwSkeJ1M44 – zkoza Apr 26 '21 at 12:48
Mr. zikoz In your link when optimization is on my code has twice speed!! [link](https://www.quick-bench.com/q/7QMtOAS1hOnYcXt2WGgdwvoWe0Q) Wouldn't it be better if you tested the code first and then commented? – Mehrdad Rahaaei Apr 26 '21 at 13:45
1

My benchmark has had the optimization turned on. You only removed `benchmark::DoNotOptimize`, which is essential. Microbenchmarking is not an easy stuff and cannot be done in 5 minutes. Your answer would benefit *a lot* if you added to it an elaborated benchmark analysis. It is also instructive to look at the assembly generated by the benchmark to see the actual optimization level. Also, pls. move the `volatile` declaration outside the loop in both cases (I carelessly put one outside and one inside). Also, don't forget: https://stackoverflow.com/conduct – zkoza Apr 26 '21 at 15:08
@MehrdadRahaaei There is no reason to insult others. BTW, your reasoning is wrong. When you remove `benchmark::DoNotOptimize`, then the benchmark no longer makes sense. Zkoza's benchmark shows your solution is actually slower. It seems you are the one who does not want to admit this. I even created a benchmark on my machine and your solution took 1.43x longer time than zkoza's. – Daniel Langr May 05 '22 at 08:43

score 1 · Answer 24 · answered Sep 14 '21 at 04:07

This is a similar method to sth's excellent answer, but with optimizations, support for up to 64-bit integers, and other small improvements.

I utilize a C++ template function reverse_bits() to let the compiler optimize for various word sizes of integers which might be passed to the function. The function should work correctly with any word size that is a multiple of 8 bits, up to a maximum of 64 bits. If your compiler supports words longer than 64 bits, the method is straightforward to extend.

This a complete, ready-to-compile example with the requisite headers. There is a convenient template function to_binary_str() for creating a std::string representation of binary numbers, along with a few calls with various word sizes to demonstrate everything.

If you remove the comments and blank lines, the function is quite compact and visually pleasing.

You can try out it on labstack here.

// this is the only header used by the reverse_bits() function
#include <type_traits>

// these headers are only used by demonstration code
#include <string>
#include <iostream>
#include <cstdint>


template<typename T>
T reverse_bits( T n ) {
    // we force the passed-in type to its unsigned equivalent, because C++ may
    // perform arithmetic right shift instead of logical right shift, depending
    // on the compiler implementation.
    typedef typename std::make_unsigned<T>::type unsigned_T;
    unsigned_T v = (unsigned_T)n;

    // swap every bit with its neighbor
    v = ((v & 0xAAAAAAAAAAAAAAAA) >> 1)  | ((v & 0x5555555555555555) << 1);

    // swap every pair of bits
    v = ((v & 0xCCCCCCCCCCCCCCCC) >> 2)  | ((v & 0x3333333333333333) << 2);

    // swap every nybble
    v = ((v & 0xF0F0F0F0F0F0F0F0) >> 4)  | ((v & 0x0F0F0F0F0F0F0F0F) << 4);
    // bail out if we've covered the word size already
    if( sizeof(T) == 1 ) return v;

    // swap every byte
    v = ((v & 0xFF00FF00FF00FF00) >> 8)  | ((v & 0x00FF00FF00FF00FF) << 8);
    if( sizeof(T) == 2 ) return v;

    // etc...
    v = ((v & 0xFFFF0000FFFF0000) >> 16) | ((v & 0x0000FFFF0000FFFF) << 16);
    if( sizeof(T) <= 4 ) return v;

    v = ((v & 0xFFFFFFFF00000000) >> 32) | ((v & 0x00000000FFFFFFFF) << 32);

    // explictly cast back to the original type just to be pedantic
    return (T)v;
}


template<typename T>
std::string to_binary_str( T n ) {
    const unsigned int bit_count = sizeof(T)*8;
    char s[bit_count+1];
    typedef typename std::make_unsigned<T>::type unsigned_T;
    unsigned_T v = (unsigned_T)n;
    for( int i = bit_count - 1; i >= 0; --i ) {
        if( v & 1 )
            s[i] = '1';
        else
            s[i] = '0';

        v >>= 1;
    }
    s[bit_count] = 0;  // string null terminator
    return s;
}


int main() {
    {
        char x = 0xBA;
        std::cout << to_binary_str( x ) << std::endl;

        char y = reverse_bits( x );
        std::cout << to_binary_str( y ) << std::endl;
    }
    {
        short x = 0xAB94;
        std::cout << to_binary_str( x ) << std::endl;

        short y = reverse_bits( x );
        std::cout << to_binary_str( y ) << std::endl;
    }
    {
        uint64_t x = 0xFEDCBA9876543210;
        std::cout << to_binary_str( x ) << std::endl;

        uint64_t y = reverse_bits( x );
        std::cout << to_binary_str( y ) << std::endl;
    }
    return 0;
}

score 1 · Answer 25 · answered Oct 05 '21 at 15:33

With the help of various online resources, i jotted these for myself (not sure if they're 100% accurate) :

#                 octal       hex

# bit-orig    : 01234567    01234567:89ABCDEF
# bit-invert  : 76543210    FEDCBA98:76543210
#
# clz         : 32110000    43221111:00000000
# clo/ffs     : 00001123    00000000:11112234

bit-reverse : [ 0 4 2 6 1 5 3 7 ] [ 0 8 4 C 2 A 6 E 1 9 5 D 3 B 7 F ]

# cto         : 01020103    01020103:01020104
# ctz         : 30102010    40102010:30102010

but this is mostly only convenient if your input is already either hex or octal.

In both formats (8 or 16), you'll notice that after the bit-reflections, all the even number indices are all on the first half. I've also highlighted the same 0-7 on the hex side to help with the visualization of it.

in fact, one doesn't even have to do a double substring. The lookup string can be either used as seeking the letter needed, or simply use it as an index lookup. this is how i reflect the CRC32 polynomial myself :

(z is the input polynomial (or just any hex string)

xn = 0 ^ (x = length(z));          # initialize to numeric 0,
                                   # foo^bar in awk means 
                                   # foo-to-bar-th-power. 
                                   # same as foo**bar in other langs

 y = substr(_REF_bitREV_hex, 2);   # by pre-trimming the lookup str,
                                   # it allows skipping the + 1 at            
                                   # every cycle of the loop
 do { 
     xn *= 16
     xn += index(y, substr(z,x,1)) # keep in mind that this is awk syntax, 
                                   # where strings start at index-1, not zero.
 } while ( 1 < x—- );

One advantage of using a hex- or octal- based approach is that it allows for inputs of any length, enabling arbitrary precision operation without having to use a proper BigInteger or BigFloat library. For that, you'll have to substring out the new digit/letter and do string concats instead of simply adding each time.

score 1 · Answer 26 · answered Jan 29 '23 at 08:10

If there is one wondering how to get uint16_t or uint32_t bit reversed by using the solution from Rich Schroeppel in the Programming Hacks section

unsigned char reverse_bits3(unsigned char b)
{
    return (b * 0x0202020202ULL & 0x010884422010ULL) % 0x3ff;
}

It can be done by following divide-and-conquer code on X86/X64 platform (little-endian):

uint16_t reverse_bits3(uint16_t s)
{
    uint8_t &b0 = ((uint8_t*)&s)[0];
    uint8_t &b1 = ((uint8_t*)&s)[1];
    return (((uint16_t)reverse_bits3(b0))<<8) + reverse_bits3(b1);
}

uint32_t reverse_bits3(uint32_t u)
{
    uint16_t &s0 = ((uint16_t*)&u)[0];
    uint16_t &s1 = ((uint16_t*)&u)[1];
    return (((uint16_t)reverse_bits3(s0))<<16) + reverse_bits3(s1);
}

score 0 · Answer 27 · answered Feb 15 '19 at 12:39

I'll chip in my solution, since i can't find anything like this in the answers so far. It is a bit overengineered maybe, but it generates the lookup table using C++14 std::index_sequence in compile time.

#include <array>
#include <utility>

constexpr unsigned long reverse(uint8_t value) {
    uint8_t result = 0;
    for (std::size_t i = 0, j = 7; i < 8; ++i, --j) {
        result |= ((value & (1 << j)) >> j) << i;
    }
    return result;
}

template<size_t... I>
constexpr auto make_lookup_table(std::index_sequence<I...>)
{
    return std::array<uint8_t, sizeof...(I)>{reverse(I)...};   
}

template<typename Indices = std::make_index_sequence<256>>
constexpr auto bit_reverse_lookup_table()
{
    return make_lookup_table(Indices{});
}

constexpr auto lookup = bit_reverse_lookup_table();

int main(int argc)
{
    return lookup[argc];
}

https://godbolt.org/z/cSuWhF

score 0 · Answer 28 · answered Oct 05 '19 at 08:56

I know that this question is dated but I still think that the topic is relevant for some purposes, and here is a version that works very well and is readable. I can not say that it is the fastest or the most efficient, but it ought to be one of the cleanest. I have also included a helper function for easily displaying the bit patterns. This function uses some of the standard library functions instead of writing your own bit manipulator.

#include <algorithm>
#include <bitset>
#include <exception>
#include <iostream>
#include <limits>
#include <string>

// helper lambda function template
template<typename T>
auto getBits = [](T value) {
    return std::bitset<sizeof(T) * CHAR_BIT>{value};
};

// Function template to flip the bits
// This will work on integral types such as int, unsigned int,
// std::uint8_t, 16_t etc. I did not test this with floating
// point types. I chose to use the `bitset` here to convert
// from T to string as I find it easier to use than some of the
// string to type or type to string conversion functions,
// especially when the bitset has a function to return a string. 
template<typename T>
T reverseBits(T& value) {
    static constexpr std::uint16_t bit_count = sizeof(T) * CHAR_BIT;

    // Do not use the helper function in this function!
    auto bits = std::bitset<bit_count>{value};
    auto str = bits.to_string();
    std::reverse(str.begin(), str.end());
    bits = std::bitset<bit_count>(str);
    return static_cast<T>( bits.to_ullong() );
}

// main program
int main() {
    try {
        std::uint8_t value = 0xE0; // 1110 0000;
        std::cout << +value << '\n'; // don't forget to promote unsigned char
        // Here is where I use the helper function to display the bit pattern
        auto bits = getBits<std::uint8_t>(value);
        std::cout << bits.to_string() << '\n';

        value = reverseBits(value);
        std::cout << +value << '\n'; // + for integer promotion

        // using helper function again...
        bits = getBits<std::uint8_t>(value);
        std::cout << bits.to_string() << '\n';

    } catch(const std::exception& e) {  
        std::cerr << e.what();
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

And it gives the following output.

score 0 · Answer 29 · edited Feb 23 '20 at 17:02

0

This one helped me with 8x8 dot matrix set of arrays.

uint8_t mirror_bits(uint8_t var)
{
    uint8_t temp = 0;
    if ((var & 0x01))temp |= 0x80;
    if ((var & 0x02))temp |= 0x40;
    if ((var & 0x04))temp |= 0x20;
    if ((var & 0x08))temp |= 0x10;

    if ((var & 0x10))temp |= 0x08;
    if ((var & 0x20))temp |= 0x04;
    if ((var & 0x40))temp |= 0x02;
    if ((var & 0x80))temp |= 0x01;

    return temp;
}

edited Feb 23 '20 at 17:02

Community

1
1

answered Jan 13 '20 at 22:30

R1S8K

365
1
3
18

1

This function doesn't actually work, the reverse of 0b11001111 should be 0b11110011, but fails with this function. The same testing method works for many of the other functions listed here. – Dan Feb 22 '20 at 03:03
Yep, thanks I corrected my answer. Thanks for the letting me know about my mistake :) – R1S8K Feb 23 '20 at 17:03

RARE Kpop Manifesto · Answer 30 · 2022-07-27T02:30:57.283

regarding the bit-reflected lookup table for all 256 bytes, with just a few loops, you can generate it from scratch on the fly very quickly (the mapping from hex to bytes should be trivial) :

 #  gawk profile,created Tue Jul 26 22:22:18 2022

 #  BEGIN rule(s)
    BEGIN {
 1      print initREF()
    }
 # Functions,listed alphabetically
 1  function initREF(_,__,___,____,_____,______,_______)
    {
 1     ______=(_+=_^=_<_)^++_-(_=\
           __=(+(___=____="."))(_~_))

 1     gsub(___,"&\\&",_)
 1     _____[_<_]
       _____[ +_]

 7     do {
 7         gsub(___,_,__)
 7         ___=___""____
       } while (—______)

 1     gsub("....","=&", __)
 1     _+=_^=_<_;_______=__;

 2     for(______ in _____) { ______*=_*_*_
 4        for(____ in _____) {  ____*=_+_
 8           for(___ in _____) { ___*= +_
16              for(__ in _____) {

16                    gsub("=" (_<______)(_<____) (_~___)__,
                   sprintf("%X", __+___+____+______),_______)
        } } } }

 1      __=_______
 1       _="[^ ]+[ ]"
 1      gsub(".",_,_)

 1      gsub("..","0x&, ",__)
 1      gsub((_)_,  "&\n",__)
 1      sub("[\1-@]+$","",__)
 1            gsub(" ","",__)

 1      return __
    }

|

0x00,0x80,0x40,0xC0,0x20,0xA0,0x60,0xE0,0x10,0x90,0x50,0xD0,0x30,0xB0,0x70,0xF0,
0x08,0x88,0x48,0xC8,0x28,0xA8,0x68,0xE8,0x18,0x98,0x58,0xD8,0x38,0xB8,0x78,0xF8,
0x04,0x84,0x44,0xC4,0x24,0xA4,0x64,0xE4,0x14,0x94,0x54,0xD4,0x34,0xB4,0x74,0xF4,
0x0C,0x8C,0x4C,0xCC,0x2C,0xAC,0x6C,0xEC,0x1C,0x9C,0x5C,0xDC,0x3C,0xBC,0x7C,0xFC,
0x02,0x82,0x42,0xC2,0x22,0xA2,0x62,0xE2,0x12,0x92,0x52,0xD2,0x32,0xB2,0x72,0xF2,
0x0A,0x8A,0x4A,0xCA,0x2A,0xAA,0x6A,0xEA,0x1A,0x9A,0x5A,0xDA,0x3A,0xBA,0x7A,0xFA,
0x06,0x86,0x46,0xC6,0x26,0xA6,0x66,0xE6,0x16,0x96,0x56,0xD6,0x36,0xB6,0x76,0xF6,
0x0E,0x8E,0x4E,0xCE,0x2E,0xAE,0x6E,0xEE,0x1E,0x9E,0x5E,0xDE,0x3E,0xBE,0x7E,0xFE,
0x01,0x81,0x41,0xC1,0x21,0xA1,0x61,0xE1,0x11,0x91,0x51,0xD1,0x31,0xB1,0x71,0xF1,
0x09,0x89,0x49,0xC9,0x29,0xA9,0x69,0xE9,0x19,0x99,0x59,0xD9,0x39,0xB9,0x79,0xF9,
0x05,0x85,0x45,0xC5,0x25,0xA5,0x65,0xE5,0x15,0x95,0x55,0xD5,0x35,0xB5,0x75,0xF5,
0x0D,0x8D,0x4D,0xCD,0x2D,0xAD,0x6D,0xED,0x1D,0x9D,0x5D,0xDD,0x3D,0xBD,0x7D,0xFD,
0x03,0x83,0x43,0xC3,0x23,0xA3,0x63,0xE3,0x13,0x93,0x53,0xD3,0x33,0xB3,0x73,0xF3,
0x0B,0x8B,0x4B,0xCB,0x2B,0xAB,0x6B,0xEB,0x1B,0x9B,0x5B,0xDB,0x3B,0xBB,0x7B,0xFB,
0x07,0x87,0x47,0xC7,0x27,0xA7,0x67,0xE7,0x17,0x97,0x57,0xD7,0x37,0xB7,0x77,0xF7,
0x0F,0x8F,0x4F,0xCF,0x2F,0xAF,0x6F,0xEF,0x1F,0x9F,0x5F,0xDF,0x3F,0xBF,0x7F,0xFF

score 0 · Answer 31 · answered Sep 10 '22 at 21:24

This is the easiest approach to remember to me as a developer:

unsigned char   reverse_bits(unsigned char octet)
    {
        return  (((octet >> 0) & 1) << 7) | \
                (((octet >> 1) & 1) << 6) | \
                (((octet >> 2) & 1) << 5) | \
                (((octet >> 3) & 1) << 4) | \
                (((octet >> 4) & 1) << 3) | \
                (((octet >> 5) & 1) << 2) | \
                (((octet >> 6) & 1) << 1) | \
                (((octet >> 7) & 1) << 0);
    }

AntLet · Answer 32 · 2023-05-30T11:52:58.737

0

The easiest and fast way is with while

#define REVERSE_NUMBER(n, res) while (n) \
    { res = (res << 1) + (n % 2); n >>= 1; }

unsigned int res; //res is result
unsigned int n = 0x135F; // n = 1001101011111

REVERSE_NUMBER(n, res);  // res = 1111101011001 (1F59)

or with function

unsigned int reverse_number(unsigned int n) {
    unsigned int res;
    while (n) {
        res = (res << 1) + (n % 2);
        n >>= 1;
    }
    return res;
}

even easier way and effective

// shift << and >> equivalent to multiplication and division by 2
// and bitwise & 1 equivalent mod by 2
unsigned int reverse_number(unsigned int n) {
    unsigned int res = 0;
    size_t i = sizeof(unsigned int) * 8; // 32 bit value
    while (i--) {
        res = (res << 1) + (n & 1); n >>= 1;
    }
    return res;
}

edited May 30 '23 at 11:52

answered May 29 '23 at 21:16

AntLet

11
2

This leaves the leading zeroes alone, while the question seems to ask for reversing all bits of a byte. – HolyBlackCat May 29 '23 at 21:29
@HolyBlackCat but you can use counter bits `int i = 32; while(i--) { /*...*/ } ` instead of n. – AntLet May 29 '23 at 22:17
Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community May 30 '23 at 07:24
This question already has 41 other answers. What does this answer add that isn't covered in the others? This looks very similar to the fifth or so most upvoted answer. – Isaiah May 31 '23 at 03:33

score -1 · Answer 33 · edited Nov 14 '17 at 07:02

-1

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int i;
    unsigned char rev = 0x70 ; // 0b01110000
    unsigned char tmp = 0;

    for(i=0;i<8;i++)
    {
    tmp |= ( ((rev & (1<<i))?1:0) << (7-i));
    }
    rev = tmp;

    printf("%x", rev);       //0b00001110 binary value of given number
    return 0;
}

edited Nov 14 '17 at 07:02

Moorthi Muthu

409
5
13

answered Apr 05 '16 at 12:28

AHMED ANWAR

9

Please add some explanation. – zc246 Apr 05 '16 at 12:33

score -1 · Answer 34 · answered Feb 02 '17 at 05:14

#define BITS_SIZE 8

int
reverseBits ( int a )
{
  int rev = 0;
  int i;

  /* scans each bit of the input number*/
  for ( i = 0; i < BITS_SIZE - 1; i++ )
  {
    /* checks if the bit is 1 */
    if ( a & ( 1 << i ) )
    {
      /* shifts the bit 1, starting from the MSB to LSB
       * to build the reverse number 
      */
      rev |= 1 << ( BITS_SIZE - 1 ) - i;
    }
  }

  return rev;
}

asm_fan · Answer 35 · 2017-05-31T17:12:13.783

-1

  xor ax,ax
  xor bx,bx
  mov cx,8
  mov al,original_byte!
cycle:   shr al,1
  jnc not_inc
  inc bl
not_inc: test cx,cx
  jz,end_cycle
  shl bl,1
  loop cycle
end_cycle:

reversed byte will be at bl register

edited May 31 '17 at 17:12

answered May 31 '17 at 16:09

asm_fan

15
2

3

In an other context that may be a fair answer but the question was about C or C++, not asm ... – jadsq May 31 '17 at 16:35

score -1 · Answer 36 · answered Jun 23 '17 at 19:12

typedef struct
{
    uint8_t b0:1;
    uint8_t b1:1;
    uint8_t b2:1;
    uint8_t b3:1;
    uint8_t b4:1;
    uint8_t b5:1;
    uint8_t b6:1;
    uint8_t b7:1;
} bits_t;

uint8_t reverse_bits(uint8_t src)
{
    uint8_t dst = 0x0;
    bits_t *src_bits = (bits_t *)&src;
    bits_t *dst_bits = (bits_t *)&dst;

    dst_bits->b0 = src_bits->b7;
    dst_bits->b1 = src_bits->b6;
    dst_bits->b2 = src_bits->b5;
    dst_bits->b3 = src_bits->b4;
    dst_bits->b4 = src_bits->b3;
    dst_bits->b5 = src_bits->b2;
    dst_bits->b6 = src_bits->b1;
    dst_bits->b7 = src_bits->b0;

    return dst;
}

As a stylistic note, I find the use of `uint8_t` for the 1-bit fields a bit ugly, since it seems to first say that it will take 8 bits but then at the end of the line defines it as only a single bit. I would use `unsigned b0:1` etc. — Arkku, Aug 08 '18 at 14:44

agbinfo · Answer 37 · 2018-10-15T14:53:24.150

-1

I think this is simple enough

uint8_t reverse(uint8_t a)
{
  unsigned w = ((a << 7) & 0x0880) | ((a << 5) & 0x0440) | ((a << 3) & 0x0220) | ((a << 1) & 0x0110);
  return static_cast<uint8_t>(w | (w>>8));
}

or

uint8_t reverse(uint8_t a)
{
  unsigned w = ((a & 0x11) << 7) | ((a & 0x22) << 5) | ((a & 0x44) << 3) | ((a & 0x88) << 1);
  return static_cast<uint8_t>(w | (w>>8));
}

edited Oct 15 '18 at 14:53

answered Oct 15 '18 at 04:28

agbinfo

793
5
17

Mahen · Answer 38 · 2018-11-11T22:32:30.517

-1

unsigned char c ; // the original
unsigned char u = // the reversed
c>>7&0b00000001 |
c<<7&0b10000000 |
c>>5&0b00000010 |
c<<5&0b01000000 |
c>>3&0b00000100 |
c<<3&0b00100000 |
c>>1&0b00001000 |
c<<1&0b00010000 ;

Explanation: exchanged bits as per the arrows below.
01234567
<------>
#<---->#
##<-->##
###<>###

edited Nov 11 '18 at 22:32

answered Nov 11 '18 at 22:11

Mahen

1
1

score -2 · Answer 39 · answered Dec 19 '18 at 09:25

#include <stdio.h>
#include <stdlib.h>

#define BIT0 (0x01)
#define BIT1 (0x02)
#define BIT2 (0x04)
#define BIT3 (0x08)
#define BIT4 (0x10)
#define BIT5 (0x20)
#define BIT6 (0x40)
#define BIT7 (0x80)

#define BYTE_TO_BINARY_PATTERN "%c%c%c%c%c%c%c%c\n"

#define BITETOBINARY(byte) \
(byte & BIT7 ? '1' : '0'), \
(byte & BIT6 ? '1' : '0'), \
(byte & BIT5 ? '1' : '0'), \
(byte & BIT4 ? '1' : '0'), \
(byte & BIT3 ? '1' : '0'), \
(byte & BIT2 ? '1' : '0'), \
(byte & BIT1 ? '1' : '0'), \
(byte & BIT0 ? '1' : '0') \

#define BITETOBINARYREVERSE(byte) \
(byte & BIT0 ? '1' : '0'), \
(byte & BIT1 ? '1' : '0'), \
(byte & BIT2 ? '1' : '0'), \
(byte & BIT3 ? '1' : '0'), \
(byte & BIT4 ? '1' : '0'), \
(byte & BIT5 ? '1' : '0'), \
(byte & BIT6 ? '1' : '0'), \
(byte & BIT7 ? '1' : '0') \



int main()
{

    int i,j,c;

    i |= BIT2|BIT7;

    printf("0x%02X\n",i);    

    printf(BYTE_TO_BINARY_PATTERN,BITETOBINARY(i));

    printf("Reverse");

    printf(BYTE_TO_BINARY_PATTERN,BITETOBINARYREVERSE(i));

   return 0;
}

this just prints the string in reverse and has nothing with reversing the bits in a byte — phuclv, Apr 14 '20 at 00:41

score -3 · Answer 40 · edited May 08 '19 at 14:10

-3

This is an old question, but nobody seems to have shown the clear easy way (the closest was edW). I used C# to test this, but there's nothing in this example that couldn't be done easily in C.

void PrintBinary(string prompt, int num, int pad = 8)
{
    Debug.WriteLine($"{prompt}: {Convert.ToString(num, 2).PadLeft(pad, '0')}");
}

int ReverseBits(int num)
{
    int result = 0;
    int saveBits = num;
    for (int i = 1; i <= 8; i++)
    {
        // Move the result one bit to the left
        result = result << 1;

        //PrintBinary("saveBits", saveBits);

        // Extract the right-most bit
        var nextBit = saveBits & 1;

        //PrintBinary("nextBit", nextBit, 1);

        // Add our extracted bit to the result
        result = result | nextBit;

        //PrintBinary("result", result);

        // We're done with that bit, rotate it off the right
        saveBits = saveBits >> 1;

        //Debug.WriteLine("");
    }

    return result;
}

void PrintTest(int nextNumber)
{
    var result = ReverseBits(nextNumber);

    Debug.WriteLine("---------------------------------------");
    PrintBinary("Original", nextNumber);
    PrintBinary("Reverse", result);
}

void Main()
{
    // Calculate the reverse for each number between 1 and 255
    for (int x = 250; x < 256; x++)
        PrintTest(x);
}

edited May 08 '19 at 14:10

Philipp Ludwig

3,758
3
30
48

answered Feb 01 '17 at 09:39

Scott Gartner

862
14
22

2

There are a couple of other answers (in C, no less) that implement the simple approach of iterating over the bits. I guess the point here was to advertise the online service to run C# code, but as an answer to this question this fails by being in a different language and including too much irrelevant stuff. (Even your test loop doesn't match the comment: the comment says "between 1 and 255" but your loop goes 250 to 255.) – Arkku Aug 08 '18 at 15:01
hi, thanks for the effort I don't know who's putting down votes here ! we are here to learn not to put down votes !! – R1S8K Mar 27 '21 at 15:27
@R1S8K You have clearly misunderstood the purpose of this website. It is _not_ a place for you to practice your skills, it's a place to create a repository of _high quality answers_. OP posting a useless answer just creates noise that _everyone_ has to filter through when searching for the answer to the original question. Those answers are marked by downvotes so they don't get in the way of helpful answers. – pipe Jul 20 '23 at 13:03
1

@pipe Absolutely, after getting 6 months ban several times I got the point. So the questions or the answers should be a very professional ones. Also I posted a very simple questions and after even simplifying it more, got more popular. – R1S8K Jul 27 '23 at 11:08

score -4 · Answer 41 · edited Sep 25 '13 at 16:31

-4

How about this one...

int value = 0xFACE;

value = ((0xFF & value << 8) | (val >> 8);

edited Sep 25 '13 at 16:31

Lodder

19,758
10
59
100

answered Sep 25 '13 at 16:11

Valdo

9
1
4

4

This reverses the bytes in a (16-bit) word, but doesn't alter the order of the bits within the byte, so not really a solution. – Eric Brown Sep 25 '13 at 16:38

score -7 · Answer 42 · answered Mar 07 '13 at 22:22

-7

How about just XOR the byte with 0xFF.

unsigned char reverse(unsigned char b) { b ^= 0xFF; return b; }

answered Mar 07 '13 at 22:22

JC9162

11
1

xor flips bits, not reverses them. – ZachB Jan 29 '20 at 03:50