3

I've come across this answer.

But I hit 2 issues, I'm not sure how to adapt it to pack 64 bits (see below), and I can't get my head around unpacking them.

Here is what I have for packing:

const int i = 1;
#define is_bigendian() ((*(char *) &i) == 0)

#define MAGIC (is_bigendian() ? 0x0102040810204080 : 0x8040201008040201)

inline uint8_t pack8b(bool *a) {
    uint64_t t = *((uint64_t *) a);
    return (MAGIC * t >> 56) & 0xFF;
}

uint32_t pack32b(bool *a) {
    return (pack8b(a + 0) << 24) | (pack8b(a + 8) << 16) |
           (pack8b(a + 16) << 8) | (pack8b(a + 24) << 0);
}

uint64_t pack64b(bool *a) {
    return ((uint64_t) pack32b(a) << 32) | pack32b(a + 32);
}

EDIT: Here is what I have so far that works like a charm, i'm open to suggestion for performance improvement:

uint64_t MAGIC() {
    static const int i = 1;
    // Takes of little / big endian
    bool bigEndian = ((*(char *) &i) == 0);
    return bigEndian ? 0x0102040810204080 : 0x8040201008040201;
}

uint8_t pack8b(bool *a) {
    uint64_t t = *((uint64_t *) a);
    return (MAGIC() * t >> 56) & 0xFF;
}

uint16_t pack16b(bool *a) {
    return ((uint16_t) pack8b(a + 0) << 8) | (pack8b(a + 8) << 0);
}

uint32_t pack32b(bool *a) {
    return ((uint32_t) pack16b(a + 0) << 16) | (pack16b(a + 16) << 0);
}

uint64_t pack64b(bool *a) {
    return ((uint64_t) pack32b(a + 0) << 32) | (pack32b(a + 32) << 0);
}

void unpack8b(bool *a, uint8_t v) {
    uint64_t mask = 0x8080808080808080ULL;
    *((uint64_t *) a) = ((MAGIC() * v) & mask) >> 7;
}

void unpack16b(bool *a, uint16_t v) {
    unpack8b(a + 0, v >> 8);
    unpack8b(a + 8, v >> 0);
}

void unpack32b(bool *a, uint32_t v) {
    unpack16b(a + 0, v >> 16);
    unpack16b(a + 16, v >> 0);
}

void unpack64b(bool *a, uint64_t v) {
    unpack32b(a + 0, v >> 32);
    unpack32b(a + 32, v >> 0);
}
Raphaël Vigée
  • 2,048
  • 14
  • 27
  • whats wrong with the code you've implemented already? Are you sure you need special tricks to make your code faster? Have you implemented your whole program and profiled it to find out its too slow? – Alan Birtles Apr 03 '20 at 16:19
  • I think the packing works, but I can't check until I can unpack it – Raphaël Vigée Apr 03 '20 at 16:20
  • 1
    you could print the resulting number formated as hex to stdout and check if that matches your input array – Ackdari Apr 03 '20 at 16:21
  • The endianness support seems bogus: how do you number the bits in the 64-bit value? – chqrlie Apr 03 '20 at 19:52
  • 1
    You should use `uint8_t` instead of `bool`: an array of `bool` is not necessarily an array of `uint8_t` in C. – chqrlie Apr 03 '20 at 20:51
  • @chqrlieforyellowblockquotes what line a re you referring to ? – Raphaël Vigée Apr 03 '20 at 21:40
  • The array of type `bool` is not necessarily an array of bytes. In C++ it is, but in C it could be an array of `int`. Your make the assumption that this array must be an array of bytes. – chqrlie Apr 04 '20 at 08:10

2 Answers2

3

Your solution works for me I tested it with the gcc on tio: Try it online!

The simplest function for unpacking the uint64_t would be:

void unpack64b(uint64_t num, bool* bit_field) {
    for (int i = 63; 0 <= i; i--) {
        bit_field[i] = num & 1;
        num = num >> 1;
    }
}

If bit_field is initialized to all zeros then early exiting is possible:

void unpack64b(uint64_t num, bool* bit_field) {
    for (int i = 63; num > 0; i--) {
        bit_field[i] = num & 1;
        num = num >> 1;
    }
}

To the packing for 32 and 64 bit, shouldn't this also be aware of the endian encoding?

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
Ackdari
  • 3,222
  • 1
  • 16
  • 33
  • if `bit_field` is uninitialized, it might still be more efficient to use the second version and add an extra loop to clear the remaining `i` bytes. It is non obvious that this early exit option be actually more efficient because the compiler will more likely unroll the first one and even produce vector code for it. – chqrlie Apr 03 '20 at 19:56
2

I can't get my head around unpacking them.

For starters, concentrate on just iterating the bits, do a simplest method that works. For me that was just simple array assignments with bit-and mask:

#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

static const int i = 1;
uint64_t MAGIC(void) {
    return ((*(const char *)&i) == 0) ? 0x0102040810204080ull : 0x8040201008040201ull;
}

void unpack8b(bool a[8], uint8_t v) {
    if (((*(const char *)&i) == 0)) {
        a[0] = v & 0x01;
        a[1] = v & 0x02;
        a[2] = v & 0x04;
        a[3] = v & 0x08;
        a[4] = v & 0x10;
        a[5] = v & 0x20;
        a[6] = v & 0x40;
        a[7] = v & 0x80;
    } else {
        a[7] = v & 0x01;
        a[6] = v & 0x02;
        a[5] = v & 0x04;
        a[4] = v & 0x08;
        a[3] = v & 0x10;
        a[2] = v & 0x20;
        a[1] = v & 0x40;
        a[0] = v & 0x80;
    }
}

void unpack16b(bool a[16], uint16_t v) {
    unpack8b(&a[0], v >> 8);
    unpack8b(&a[8], v);
}

void unpack32b(bool a[32], uint32_t v) {
    unpack16b(&a[0], v >> 16);
    unpack16b(&a[16], v);
}

void unpack64b(bool a[64], uint64_t v) {
    unpack32b(&a[0], v >> 32);
    unpack32b(&a[32], v);
}

uint8_t pack8b(bool a[8]) {
    static_assert(sizeof(bool) == 1, "");
    static_assert(sizeof(uint64_t) == 8, "");
    static_assert(CHAR_BIT == 8, "");
    uint64_t t;
    memcpy(&t, a, sizeof(t));
    return (MAGIC() * t >> 56) & 0xFF;
}

uint16_t pack16b(bool a[16]) {
    return pack8b(&a[0]) << 8 | pack8b(&a[8]);
}

uint32_t pack32b(bool a[32]) {
    return (uint32_t)pack16b(&a[0]) << 16 | pack16b(&a[16]);
}

uint64_t pack64b(bool a[64]) {
    return ((uint64_t)pack32b(&a[0]) << 32) | pack32b(&a[32]);
}

int main() {
    _Alignas(uint64_t) bool a[64];
    for (int i = 0; i < 64; ++i) {
        a[i] = rand() % 2;
    }

    for (int i = 0; i < 64; ++i) printf("%d%s", a[i], !((i+1)%8)?" ":"");
    printf("\n");

    uint64_t v = pack64b(a);
    printf("%llx\n", v);

    memset(a, -1, sizeof(a));
    unpack64b(a, v);

    for (int i = 0; i < 64; ++i) printf("%d%s", a[i], !((i+1)%8)?" ":"");
    printf("\n");
}

Outputs:

10111100 11010110 00001011 00011110 00111010 11110100 10101001 00011101 
bcd60b1e3af4a91d
10111100 11010110 00001011 00011110 00111010 11110100 10101001 00011101 

Note that when int on a platform has 16 bits, the (pack8b(a + 0) << 24) in your code is undefined behavior. Cast it to type with more bits then 24 (uint32_t)pack8b(a + 0) << 24 before shifting to be extra safe.

KamilCuk
  • 120,984
  • 8
  • 59
  • 111
  • That works, thank you. Now I face a second issue, making it fast... What are possibilities for doing so? – Raphaël Vigée Apr 03 '20 at 17:46
  • I advise, do not accept my answer yet, and let the question hold for some time (like week ~ two weeks), maybe someone with better understanding then me of bit twiddling hacks will come along. My understanding of bit twiddling hacks is pretty low.... Try adding more tags to the question, it will pop up more. – KamilCuk Apr 03 '20 at 19:23
  • I have updated my question with what I ended up doing – Raphaël Vigée Apr 03 '20 at 20:10
  • I really advise, do not use `*((uint64_t *) a)`. If your `bool` array will not be aligned to `uint64_t`, then you will get segmentation fault (and also you are braking strict aliasing and your compiler can (and will) mess up your code with optimizations). Use `memcpy`, it's always safe. Or, if you aim at speed, at least check if the pointer is aligned `assert((uintptr_t)b % _Alignof(uint64_t) == 0);` or something like that – KamilCuk Apr 03 '20 at 20:17
  • In my use case, the array will always be aligned to uint64, i will add an assert to ensure that – Raphaël Vigée Apr 03 '20 at 21:39