Extracting bits from uint8_t type array

Question

I have the data ef324ad13255e219e8110044997cefaa43ff0954800000000000007 stored in a uint8_t array declared as lfsr[36].

I want to extract specific bits from the array, e.g. bit no. 96, bit no. 184 etc.

How can I perform this operation?

What do you mean by extract? Do you want to check the existence of those values, replace them with 0 or something, or shift the trailing characters? – t.m., Dec 12 '16 at 18:21

score 6 · Accepted Answer · edited May 23 '17 at 11:45

As noted by barak manos, the proper code is

(lfsr[bit / 8] >> (bit % 8)) & 1

To explain it:

bit / 8 chooses an element from your array. Each element contains 8 bits, so dividing by 8 is an easy way to convert a bit index to an element index.

bit % 8 chooses a bit inside the element. This is the most straightforward choice of indexing; it counts bits from the least significant bit to the most significant bit (little-endian). Another variant is

7 - bit % 8

This variant counts the bits in reverse order (big-endian). Sometimes you have to use it (e.g. in JPEG) for compatibility reasons; if you are free to decide which bit order to choose, use little-endian (because it's easier).

The syntax (... >> ...) & 1 extracts one bit from a number. See here for details.
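
A minimal sketch of the two variants above, wrapped in helper functions. The names `get_bit_lsb_first` and `get_bit_msb_first` are hypothetical (they are not from the question), and the caller is assumed to pass a bit index that lies inside the array.

```c
#include <stddef.h>
#include <stdint.h>

/* "Little-endian" bit order: bit 0 is the least significant bit of data[0]. */
static int get_bit_lsb_first(const uint8_t *data, size_t bit)
{
    return (data[bit / 8] >> (bit % 8)) & 1;
}

/* "Big-endian" bit order: bit 0 is the most significant bit of data[0]. */
static int get_bit_msb_first(const uint8_t *data, size_t bit)
{
    return (data[bit / 8] >> (7 - bit % 8)) & 1;
}
```

With the array from the question, `get_bit_lsb_first(lfsr, 96)` would read the least significant bit of `lfsr[12]`, while `get_bit_msb_first(lfsr, 96)` would read the most significant bit of the same element.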

answered Dec 12 '16 at 18:46

anatolyg

Also keep in mind the type used for indexing; signed modulo is very inefficient - either use unsigned index or >> and & bitwise operations. If the index is immediate then modulo is fine. https://godbolt.org/g/uxLk5g – t0rakka Dec 12 '16 at 19:13
@ikegami OK, but where this answer says "it counts bits from the least significant bit to most significant bit (little-endian)", that is wrong, because bits don't have endianness. http://stackoverflow.com/questions/16803397/can-endianness-refer-to-bits-order-in-a-byte – Stargateur Dec 13 '16 at 01:36
@Stargateur The concept of endianness is rarely used for bits. If you want, you can even say that "endianness" never refers to bits, but some people will disagree. It's the same concept, so it's convenient to call it the same name. See also [here](http://stackoverflow.com/a/16804704/509868) and [here](https://en.wikipedia.org/wiki/Endianness#Bit_endianness). – anatolyg Dec 13 '16 at 09:14
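
To illustrate the comment above about index types, here is a sketch that takes the index as an unsigned `size_t` and uses shift and mask in place of `/ 8` and `% 8`. For unsigned operands the two forms compute the same values. The helper name `get_bit_shift_mask` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* bit >> 3 equals bit / 8 and bit & 7 equals bit % 8 for unsigned bit. */
static int get_bit_shift_mask(const uint8_t *data, size_t bit)
{
    return (data[bit >> 3] >> (bit & 7)) & 1;
}
```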

ikegami · Answer 2 · 2016-12-12T20:02:51.813

The solution will take the following form:

( lfsr[byte_idx] >> bit_idx ) & 1

You didn't provide enough information to help us determine how to obtain the byte index and the bit index, though.

Are your indexes 0-based (A,C,E,G) or 1-based (B,D,F,H)?
Is the first bit in lfsr[0] (A,B,C,D) or in lfsr[35] (E,F,G,H)?
Are the bits numbered from the least-significant (C,D,G,H) or from the most-significant (A,B,E,F)?

All combinations of those are covered by the following chart:

                                   A     B     C     D     E     F     G     H
                         +---+                                                
 ( lfsr[ 0] >> 7 ) & 1   |   |     0     1     7     8   280   281   287   288
 ( lfsr[ 0] >> 6 ) & 1   |   |     1     2     6     7   281   282   286   287
 ( lfsr[ 0] >> 5 ) & 1   |   |     2     3     5     6   282   283   285   286
 ( lfsr[ 0] >> 4 ) & 1   |   |     3     4     4     5   283   284   284   285
 ( lfsr[ 0] >> 3 ) & 1   |   |     4     5     3     4   284   285   283   284
 ( lfsr[ 0] >> 2 ) & 1   |   |     5     6     2     3   285   286   282   283
 ( lfsr[ 0] >> 1 ) & 1   |   |     6     7     1     2   286   287   281   282
 ( lfsr[ 0] >> 0 ) & 1   |   |     7     8     0     1   287   288   280   281
                         +---+                                                
 ( lfsr[ 1] >> 7 ) & 1   |   |     8     9    15    16   272   273   279   280
 ( lfsr[ 1] >> 6 ) & 1   |   |     9    10    14    15   273   274   278   279
 ( lfsr[ 1] >> 5 ) & 1   |   |    10    11    13    14   274   275   277   278
 ( lfsr[ 1] >> 4 ) & 1   |   |    11    12    12    13   275   276   276   277
 ( lfsr[ 1] >> 3 ) & 1   |   |    12    13    11    12   276   277   275   276
 ( lfsr[ 1] >> 2 ) & 1   |   |    13    14    10    11   277   278   274   275
 ( lfsr[ 1] >> 1 ) & 1   |   |    14    15     9    10   278   279   273   274
 ( lfsr[ 1] >> 0 ) & 1   |   |    15    16     8     9   279   280   272   273
                         +---+                                                
                         | . |                                                
                           .                                                  
                         | . |                                                
                         +---+                                                
 ( lfsr[34] >> 7 ) & 1   |   |   272   273   279   280     8     9    15    16
 ( lfsr[34] >> 6 ) & 1   |   |   273   274   278   279     9    10    14    15
 ( lfsr[34] >> 5 ) & 1   |   |   274   275   277   278    10    11    13    14
 ( lfsr[34] >> 4 ) & 1   |   |   275   276   276   277    11    12    12    13
 ( lfsr[34] >> 3 ) & 1   |   |   276   277   275   276    12    13    11    12
 ( lfsr[34] >> 2 ) & 1   |   |   277   278   274   275    13    14    10    11
 ( lfsr[34] >> 1 ) & 1   |   |   278   279   273   274    14    15     9    10
 ( lfsr[34] >> 0 ) & 1   |   |   279   280   272   273    15    16     8     9
                         +---+                                                
 ( lfsr[35] >> 7 ) & 1   |   |   280   281   287   288     0     1     7     8
 ( lfsr[35] >> 6 ) & 1   |   |   281   282   286   287     1     2     6     7
 ( lfsr[35] >> 5 ) & 1   |   |   282   283   285   286     2     3     5     6
 ( lfsr[35] >> 4 ) & 1   |   |   283   284   284   285     3     4     4     5
 ( lfsr[35] >> 3 ) & 1   |   |   284   285   283   284     4     5     3     4
 ( lfsr[35] >> 2 ) & 1   |   |   285   286   282   283     5     6     2     3
 ( lfsr[35] >> 1 ) & 1   |   |   286   287   281   282     6     7     1     2
 ( lfsr[35] >> 0 ) & 1   |   |   287   288   280   281     7     8     0     1
                         +---+

Here's how to obtain the bit for each of the indexing methods:

A: int bit96 = ( lfsr[                   96    / 8       ] >> ( 7 - (  96    % 8 ) ) ) & 1;
B: int bit96 = ( lfsr[                  (96-1) / 8       ] >> ( 7 - ( (96-1) % 8 ) ) ) & 1;
C: int bit96 = ( lfsr[                   96    / 8       ] >> (        96    % 8   ) ) & 1;
D: int bit96 = ( lfsr[                  (96-1) / 8       ] >> (       (96-1) % 8   ) ) & 1;
E: int bit96 = ( lfsr[ sizeof(lfsr) - (  96    / 8 ) - 1 ] >> ( 7 - (  96    % 8 ) ) ) & 1;
F: int bit96 = ( lfsr[ sizeof(lfsr) - ( (96-1) / 8 ) - 1 ] >> ( 7 - ( (96-1) % 8 ) ) ) & 1;
G: int bit96 = ( lfsr[ sizeof(lfsr) - (  96    / 8 ) - 1 ] >> (        96    % 8   ) ) & 1;
H: int bit96 = ( lfsr[ sizeof(lfsr) - ( (96-1) / 8 ) - 1 ] >> (       (96-1) % 8   ) ) & 1;

G is most likely. A and B are the next most likely. E is extremely unlikely and F was only included for completeness.
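
The eight formulas above can also be folded into one function. This is only a sketch: `get_bit_scheme` and its parameters are hypothetical names, and it assumes the caller passes an index that is valid for the chosen scheme (1-based schemes start at 1).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fetch one bit from `data` (`size` bytes) under indexing scheme 'A'..'H'
 * as defined by the chart. The scheme letter encodes three yes/no choices.
 */
static int get_bit_scheme(const uint8_t *data, size_t size, char scheme, size_t index)
{
    int k = scheme - 'A';           /* 0..7 for 'A'..'H'                   */
    int one_based = k & 1;          /* B, D, F, H: indexes start at 1      */
    int lsb_first = (k >> 1) & 1;   /* C, D, G, H: count from the LSB      */
    int from_end  = (k >> 2) & 1;   /* E, F, G, H: first bit in last byte  */

    size_t n = one_based ? index - 1 : index;   /* normalise to 0-based */
    size_t byte = n / 8;
    unsigned shift = (unsigned)(n % 8);

    if (from_end)
        byte = size - 1 - byte;
    if (!lsb_first)
        shift = 7 - shift;

    return (data[byte] >> shift) & 1;
}
```

For example, `get_bit_scheme(lfsr, sizeof(lfsr), 'G', 96)` evaluates the same expression as line G above, which reads the least significant bit of `lfsr[23]`.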

There is no byte order when the value has only one byte... char and uint8_t have only one byte – Stargateur, Dec 12 '16 at 19:23
@Stargateur, What are you talking about? It's impossible to place 184 bits in one byte. According to the OP, the value has 36 bytes. BE = bit 0 is in lfsr[35]. LE = bit 0 is in lfsr[0]. – ikegami, Dec 12 '16 at 19:29
Absolutely wrong. `uint8_t x[23];` would be an example of one. The OP has an even larger one, so it makes complete sense for them to ask to get bit 184. – ikegami, Dec 12 '16 at 22:31
Ah, you assume that his array is "one" variable... Yes, in that way there is endianness indeed. But the OP doesn't say anything about that. – Stargateur, Dec 13 '16 at 01:33
I don't assume anything. A variable is an association between a name and a memory space, and that's what `x` is. Actually, the OP did say they had such a variable. `uint8_t lfsr[36]`, to be precise. – ikegami, Dec 13 '16 at 01:34
An array is a set of separate variables in memory; endianness is about how memory stores one basic unit. So unless you interpret an array as one big multi-byte variable, there is no endianness in an array: [Likewise, the order of the elements of a C array are unaffected by endianness.](http://stackoverflow.com/questions/26455843/how-are-array-values-stored-in-little-endian-vs-big-endian-architecture) – Stargateur, Dec 13 '16 at 01:40
Of course. And since that's exactly what the question asks us to do, it's perfectly fine here. – ikegami, Dec 13 '16 at 01:42

score 0 · Answer 3 · answered Dec 12 '16 at 18:26

You could try using a mask and bitwise ANDs. For example, you could grab the LSB with something like 0x1 & (number to extract the bit from). The mask for the second bit would be 0x2, for the third 0x4, etc. A nonzero result means the bit is set.
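
A small sketch of the masking idea, assuming bit positions 0 to 7 with 0 being the LSB. The helper name `bit_is_set` is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if bit n (0 = LSB, 7 = MSB) of value is set, otherwise 0. */
static int bit_is_set(uint8_t value, unsigned n)
{
    uint8_t mask = (uint8_t)(1u << n);   /* 0x1, 0x2, 0x4, ... */
    return (value & mask) != 0;
}
```

The `!= 0` comparison turns the masked value (for example 0x4) into a plain 0 or 1, which matches what the shift-based answers return.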

awerchniak

Stargateur · Answer 4 · 2016-12-12T19:09:01.113

-1

You can use the same array and test each bit against a shifting mask:

#include <stddef.h>
#include <stdio.h>
#include <stdint.h>

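/* Mask with only the most significant bit of a uint8_t set (0x80). */
#define MOST_LEFT_BIT_UINT8 (UINT8_MAX / 2 + 1)

int main(void) {
  /* 55 characters plus the terminating '\0'. Each element holds the
     character code of one digit, not the value of a hex digit. */
  uint8_t lfsr[56] = "ef324ad13255e219e8110044997cefaa43ff0954800000000000007";

  /* Visit every character, skipping the terminator. */
  for (size_t i = 0; i < sizeof lfsr - 1; i++) {
    /* Print the 8 bits of this element, most significant bit first. */
    for (size_t j = 0; j < 8; j++) {
      if (lfsr[i] & (MOST_LEFT_BIT_UINT8 >> j))
        printf("1");
      else
        printf("0");
    }
    printf(" ");
  }
  printf("\n");
}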

edited Dec 12 '16 at 19:09

answered Dec 12 '16 at 18:32

The question says "uint8_t type array called lfsr", so I guess that you should use `uint8_t` instead of `char`, and that you may as well use `8` instead of `CHAR_BIT`. By the way, I believe that the data is given as integral type, not as a null-terminated string. In any case, if it **IS** given as a null-terminated string, then you should use `lfsr[i]-'0'`, not `lfsr[i]`. – barak manos Dec 12 '16 at 18:36
BTW, you may also simplify the expression inside the `if` to `(lfsr[i] >> j) & 1`, which is much more readable IMO. – barak manos Dec 12 '16 at 18:39
@barakmanos Two answers are wrong about byte order; you're welcome – Stargateur Dec 12 '16 at 19:24
As I have already mentioned in my first comment, as far as I understand the question, the input is **not** given as a null-terminated string of characters. But if you insist on assuming that it is, then you should **at the very least** use `lfsr[i]-'0'` instead of `lfsr[i]`. – barak manos Dec 12 '16 at 19:32
@barakmanos The OP says 36 uint8_t and offers 56 characters; you assume that it is hex, but 56 / 2 = 28. Of course, I only assume that the OP is not clear, so I take the simple approach. – Stargateur Dec 12 '16 at 21:08
One last time (not gonna repeat this again): if you take the input as a string of characters, then you should use `lfsr[i]-'0'` instead of `lfsr[i]`!!! – barak manos Dec 13 '16 at 04:36
