C variable smaller then 8-bit

Question

I'm writing C implementation of Conway's Game of Life and pretty much done with the code, but I'm wondering what is the most efficient way to storage the net in the program. The net is two dimensional and stores whether cell (x, y) is alive (1) or dead (0). Currently I'm doing it with unsigned char like that:

struct:

typedef struct {
    int rows;
    int cols;
    unsigned char *vec;
} net_t;

allocation:

n->vec = calloc( n->rows * n->cols, sizeof(unsigned char) );

filling:

i = ( n->cols * (x - 1) ) + (y - 1);
n->vec[i] = 1;

searching:

if( n->vec[i] == 1 )

but I don't really need 0-255 values - I only need 0 - 1, so I'm feeling that doing it like that is a waste of space, but as far as I know 8-bit char is the smallest type in C.

Is there any way to do it better?

Thanks!

8 bit chars are as small as you can get. You could use bitwise and `&` to store several values in one char. which may save you a bit of space. This may have a minor effect on the speed of your program, as you will need extra operations to get the data out. — James Snook, Apr 07 '14 at 16:52
As stated C doesn't have 1 bit variables so best way is bit shifting if you really need to conserve space. However if you are trying to save space I'd look at the easy to fix details first. Like changing the ints in the struct to something smaller. — Asthor, Apr 07 '14 at 16:59
I don't need to conserve memory, but I thought that id could be done better :) I'm thinking about Compressed Row Storage, but don't know how to implement it yet. — pawel.ad, Apr 07 '14 at 17:14
Compressing rows is great for saving and loading (and so you might want to look into this anyway), but in-game? Definitely a no. Think about it: every single state change you need to read, toggle, and write single bits, which has a *huge* impact on the compression state. The compressed size of each row will vary wildly. Even with a simple, fast algorithm such as RLE, you'll suffer a speed slowdown, a complexity speedup, and memory requirements all over the place. — Jongware, Apr 07 '14 at 17:33
@Jongware - So, no to bit operations and yes to CRS? Or no to CRS as well? — pawel.ad, Apr 07 '14 at 17:49
Yes, that's a no to CRS. Others have commented on pros and cons of using bits -- I can think of ways to get a max speed out of either, but live compression isn't on that list. On bits: you'll use 1/8th of the memory but loose a lot of speed (with any straightforward method, anyway). Ask yourself, what direction do you want to go in? — Jongware, Apr 07 '14 at 19:07

score 3 · Answer 1 · edited May 23 '17 at 12:05

3

The smallest declarable / addressable unit of memory you can address/use is a single byte, implemented as unsigned char in your case.

If you want to really save on space, you could make use of masking off individual bits in a character, or using bit fields via a union. The trade-off will be that your code will execute a bit slower, and will certainly be more complicated.

#include <stdio.h>
union both {
   struct { 
      unsigned char b0: 1;
      unsigned char b1: 1;
      unsigned char b2: 1;
      unsigned char b3: 1;
      unsigned char b4: 1;
      unsigned char b5: 1;
      unsigned char b6: 1;
      unsigned char b7: 1;
   } bits; 
   unsigned char byte;
};

int main ( ) {
   union both var;
   var.byte = 0xAA;
   if ( var.bits.b0 ) {
      printf("Yes\n");
   } else {
      printf("No\n");
   }
   return 0;
}

References

Union and Bit Fields, Accessed 2014-04-07, <http://www.rightcorner.com/code/CPP/Basic/union/sample.php>
Access Bits in a Char in C, Accessed 2014-04-07, <https://stackoverflow.com/questions/8584577/access-bits-in-a-char-in-c>
Struct - Bit Field, Accessed 2014-04-07, <http://cboard.cprogramming.com/c-programming/10029-struct-bit-fields.html>

edited May 23 '17 at 12:05

Community

1
1

answered Apr 07 '14 at 16:55

Cloud

18,753
15
79
153

1

@pawel.ad - bits do not have address, so you can not have a pointer to a bit. The smallest addressable unit (in most CPU architectures, and therefore) in C is a byte, hence why `char` is the smallest – Mike Apr 07 '14 at 17:21
1

I thought I remembered that you could declare multi-bit fields in a struct and access/assign the individual fields as integer/enum values. But it's been decades since I mucked with that sort of thing. – Hot Licks Apr 07 '14 at 17:40
@HotLicks You are correct, this can be done. It's also what I'm doing above. The advantage to declaring the `struct` within a `union` is that it allows you to safely address both individual bits and the entire byte directly. – Cloud Apr 07 '14 at 18:30
But you're only declaring single-bit fields. – Hot Licks Apr 07 '14 at 18:52
@HotLicks Regardless of whether I'm using single or multi-bit fields, or some mix of the two, using a `struct` alone means if you want all of the values, you have to address each element of the `struct`. With the `struct in a union` approach, you can use the `byte` member of the union to get all the values at once, which is not possible with a `struct` alone (you could backup a copy of the struct, but not get at its data directly, unless you did an explicit "union to int" cast, which I'm not sure is safe). – Cloud Apr 07 '14 at 19:21
I was just pointing out that one can conveniently fit, say, two 4-bit fields into a byte, and still address them (by name) as single entities, vs having to do mask/rotate gorp. – Hot Licks Apr 07 '14 at 19:24

score 1 · Answer 2 · answered Apr 07 '14 at 17:16

Unless you're working on an embedded platform, I wouldn't be too concerned about the size your net takes up by using an unsigned char to store only a 1 or 0.

To address your specific question: char is the smallest of the C data types. char, signed char, and unsigned char are all only going to take up 1 byte each.

If you want to make your code smaller you can use bitfields to decrees the amount of space you take up, but that will increase the complexity of your code.

For a simple exercise like this, I'd be more concerned about readability than size. One way you can make it more obvious what you're doing is switch to a bool instead of a char.

#include <stdbool.h>

typedef struct {
    int rows;
    int cols;
    bool *vec;
} net_t;

You can then use true and false which, IMO, will make your code much easier to read and understand when all you need is 1 and 0.

It will take up at least as much space as the way you're doing it now, but like I said, consider what's really important in the program you're writing for the platform you're writing it for... it's probably not the size.

score 0 · Answer 3 · answered Apr 07 '14 at 17:16

The smallest type on C as i know are the char (-128, 127), signed char (-128, 127), unsigned char (0, 255) types, all of them takes a whole byte, so if you are storing multiple bits values on different variables, you can instead use an unsigned char as a group of bits.

unsigned char lives = 128;

At this moment, lives have a 128 decimal value, which it's 10000000 in binary, so now you can use a bitwise operator to get a single value from this variable (like an array of bits)

if((lives >> 7) == 1) {

    //This code will run if the 8 bit from right to left (decimal 128) it's true
}

It's a little complex, but finally you'll end up with a bit array, so instead of using multiple variables to store single TRUE / FALSE values, you can use a single unsigned char variable to store 8 TRUE / FALSE values.

Note: As i have some time out of the C/C++ world, i'm not 100% sure that it's "lives >> 7", but it's with the '>' symbol, a little research on it and you'll be ready to go.

Brett Hale · Answer 4 · 2014-04-07T17:31:22.187

You're correct that a char is the smallest type - and it is typically (8) bits, though this is a minimum requirement. And sizeof(char) or (unsigned char) is (1). So, consider using an (unsigned) char to represent (8) columns.

How many char's are required per row? It's (cols / 8), but we have to round up for an integer value:

int byte_cols = (cols + 7) / 8;

or:

int byte_cols = (cols + 7) >> 3;

which you may wish to store with in the net_t data structure. Then:

calloc(n->rows * n->byte_cols, 1) is sufficient for a contiguous bit vector.

Address columns and rows by x and y respectively. Setting (x, y) (relative to 0) :

n->vec[y * byte_cols + (x >> 3)] |= (1 << (x & 0x7));

Clearing:

n->vec[y * byte_cols + (x >> 3)] &= ~(1 << (x & 0x7));

Searching:

if (n->vec[y * byte_cols + (x >> 3)] & (1 << (x & 0x7)))
    /* ... (x, y) is set... */
else
    /* ... (x, y) is clear... */

These are bit manipulation operations. And it's fundamentally important to learn how (and why) this works. Google the term for more resources. This uses an eighth of the memory of a char per cell, so I certainly wouldn't consider it premature optimization.

C variable smaller then 8-bit

4 Answers4