7

I write a program with highly memory cost requirement and I want save memory with no performance lost. So I want change every variable which has only two situations into bit.

But I can't find bit type in C++ and bitset in STL is always multiples of 4 byte in 32-bit machine. Writing a data struct to manage bits will cause performance lost.

Is there any way to declare a bit value just like bit a;?

Thanks everyone. At last the answer I want is:"you can't buy half bytes in C++".

zzy
  • 1,771
  • 1
  • 13
  • 48
  • *"Writing a data struct to manage bits will cause performance lost."* How exactly? – Nawaz May 08 '14 at 06:35
  • std::bitset is your way to go. If you have 1000 bits (variables), they will be stored in ~125 bytes – kaspersky May 08 '14 at 06:42
  • @Nawaz Changing bit's value need find it at first.It will cost many time in so many many such operations. – zzy May 08 '14 at 06:45
  • @zzy: Have you measured the performance of your app? How are you so sure that *changing bit's value* is the bottleneck? – Nawaz May 08 '14 at 06:52
  • @Nawaz Actually,the performance lost doesn't matter me more than the memory cost.But the data should be construct is really large.To change the byte to bit is obviously the best way to save memory compare to change the complicated algorithm . – zzy May 08 '14 at 06:56
  • @zzy: then what is the problem? you seem to have confusing problem. Is it performance-problem or memory-problem? – Nawaz May 08 '14 at 06:57
  • "I want save memory with no performance lost" - CPUs generally lack functions to operate on bits at the same speed as words, so C++ can't do better. – Tony Delroy May 08 '14 at 06:59
  • @TonyD what's the words stand for? – zzy May 08 '14 at 07:09
  • 1
    @zzy, word is a data type, see http://www.makelinux.net/books/lkd2/ch19lev1sec2 – kaspersky May 08 '14 at 07:25

5 Answers5

7

There is none. The smallest addressable entity is a byte. This is the char or unsigned char type. (The best type is the integer because it is aligned to the width of your processor and thus fastest to fetch and work on)

To work with bits you need to use boolean operators and mask/shift your data in the larger types. Or work with STL bitsets.

meandbug
  • 202
  • 2
  • 5
  • So bitset will be the one cost smallest memory? – zzy May 08 '14 at 06:42
  • No, bitset is likely to use at least exactly as much as integer types. Read the accepted answer here http://stackoverflow.com/questions/12459563/what-is-the-size-of-bitset-in-c – meandbug May 08 '14 at 06:45
  • The smallest way is using bytes. (But due to memory alignment etc, it is unlikely to save data, unless you want to do network transfers of this data or write it to a file) what do you want to do with the data? – meandbug May 08 '14 at 06:49
  • I construct a very large tree to do some complicated algorithm using this data. The data is so large that even 16GB memory can't hold it. – zzy May 08 '14 at 06:52
  • 2
    @zzy, sorry, it seems to me you don't know understand how bitset works. If you have 1 variable (1 bit), it will use 32 bits. If you have 32 variables, it will use 32 bits. If you use 33 variables, it will use 64 bits. If you have a large number N of variables, it will use ~N/8 bits. – kaspersky May 08 '14 at 06:52
  • How do you navigate your tree? – meandbug May 08 '14 at 06:58
  • @gg.kaspersky,It's clear to me now,thanks.I haven't use bit before :).Now I will heading another way to solve my problem. – zzy May 08 '14 at 06:59
4

source: http://www.learncpp.com/cpp-tutorial/3-8a-bit-flags-and-bit-masks/

If you are using simple booleans, the above example displays how you can adress them to seperate bit values inside of bytes.

C++14 Define 8 separate bit flags (these can represent whatever you want)

const unsigned char option1 = 0b0000'0001;    
const unsigned char option2 = 0b0000'0010;    
const unsigned char option3 = 0b0000'0100;    
const unsigned char option4 = 0b0000'1000;    
const unsigned char option5 = 0b0001'0000;    
const unsigned char option6 = 0b0010'0000;    
const unsigned char option7 = 0b0100'0000;    
const unsigned char option8 = 0b1000'0000;

C++11 or earlier
Define 8 separate bit flags (these can represent whatever you want)

const unsigned char option1 = 0x1; // hex for 0000 0001     
const unsigned char option2 = 0x2; // hex for 0000 0010    
const unsigned char option3 = 0x4; // hex for 0000 0100   
const unsigned char option4 = 0x8; // hex for 0000 1000    
const unsigned char option5 = 0x10; // hex for 0001 0000    
const unsigned char option6 = 0x20; // hex for 0010 0000    
const unsigned char option7 = 0x40; // hex for 0100 0000    
const unsigned char option8 = 0x80; // hex for 1000 0000

We use a byte-size value to hold our options Each bit in myflags corresponds to one of the options defined above

unsigned char myflags = 0; -- all options turned off to start

To query a bit state, we use bitwise AND ('&' operator):

if (myflags & option4) ... -- if option4 is set, do something
if !(myflags & option5) ... -- if option5 is not set, do something

To set a bit (turn on), we use bitwise OR('|' operator):

myflags |= option4; -- turn option 4 on.
myflags |= (option4 | option5); -- turn options 4 and 5 on.

To clear a bit (turn off), we use bitwise AND with an inverse(~) bit pattern:

myflags &= ~option4; -- turn option 4 off
myflags &= ~(option4 | option5); -- turn options 4 and 5 off

To toggle a bit state, we use bitwise XOR:

myflags ^= option4; -- flip option4 from on to off, or vice versa
myflags ^= (option4 | option5); -- flip options 4 and 5

You can use: static_cast(value) to turn said value into a bool.

Sotem
  • 79
  • 7
2

There is no such data type as a "bit" specifically. The a practice is to use a standard uint8_t (or uint16, 32) and use the individual bits for different values. E.g.:

#define BIT1 0x01
#define BIT2 0x02
#define BIT3 0x04
#define BIT4 0x08

uint8_t bit_vars;

// Make a function to access a particular bit
uint8_t get_bitx(int x)
{
    switch (x)
    {
    case 1:
        return bit_vars & BIT1;
        break;
    case 2:
        return bit_vars & BIT2;
        break;
    case 3:
        return bit_vars & BIT3;
        break;
    case 4:
        return bit_vars & BIT4;
        break;
}

// Make a function to set/storea particular bit
void set_bitx(int x, bool set_flag)
{
    switch (x)
    {
    case 1:
        if (set_flag) {bit_vars |= 1 << (BIT1 - 1);}
        break;
    case 2:
        if (set_flag) {bit_vars |= 1 << (BIT2 - 1);}
        break;
    case 3:
        if (set_flag) {bit_vars |= 1 << (BIT3 - 1);}
        break;
    case 4:
        if (set_flag) {bit_vars |= 1 << (BIT4 - 1);}
        break;
}

Note: This is just a rough example, not compilable.

You can also use bit-fields, I personally tend to stay away from them, as they are not always portable across different processors / compilers.

code_fodder
  • 15,263
  • 17
  • 90
  • 167
  • 1
    Why no use std::bitset? – kaspersky May 08 '14 at 06:46
  • yeah, that's good also, plenty of options :), but none that are actually just 1 bit long :( – code_fodder May 08 '14 at 06:49
  • @code_fodder: can you support "bit fields are not always portable"? They should work in-process on any C++ system, though the memory layout/padding can vary, requiring proper serialisaton/deserialisation rather than a naive memory dump/load, but even integers suffers from endian-ness and every type has *potential* alignment/padding portability issues. – Tony Delroy May 08 '14 at 07:12
  • @TonyD You are right in that endianness can be an issue, but its usually pretty straight forward to deal with (you pack/unpack, or define up-front the order of the bits and extract them accordingly). But with bitfields, they are a strange item in that the implementation (as far as I remember) is almost completely up to the compiler. One compiler may pack into a single byte and another may use a word (2-bytes) for the same information. The alignment and ordering/location is also left to the compiler (one of those being endianness). So you have no clue how the data looks, there4 not portable :( – code_fodder May 08 '14 at 07:21
  • @core_fodder: you're right that layout is more complex, but it's often practical to `#ifdef` up a binary-layout-compatible cross-platform header for a few specific platforms, and failing that it's not that much harder for bitfields compared to handling endianness - e.g. you can "manually" extract the bitfield something ala `uint16_t bf = x.mybitfield;` then piggyback on whatever you're doing for `uint16_t` (or 8/32 etc) - typically using `ntoh`/`hton`. For tighter storage, you can bitshift/OR bitfields into predictable positions. Perhaps it's fairest to say "source portable / data not". – Tony Delroy May 08 '14 at 08:34
  • @TonyD mmm... so, it "may be possible to port the source", but you still need to know how the compiler is storing the data (which "could" change with a different version of the same compiler if they wanted), more then just endianness. For my money, its easier just to use the bits of standard "more easily portable" types :), but if you are a bitfield master, you may put your money into them :)... anyway, for c++ it looks like std::bitset probably better then both options! – code_fodder May 08 '14 at 09:26
  • @code_fodder: yeah - not unreasonable to consider avoiding bitfields if there's a need for persistence or cross process/system usage, though nothing in this question hints at that. Good concerns to have enumerated. Cheers. – Tony Delroy May 08 '14 at 13:40
1

You can use bit fields. Or use std::vector with bool type, which has template specialization.

  • 1
    +1 from me, do not understand the downvote. But [bitfields](http://en.cppreference.com/w/cpp/language/bit_field) seem the most appropriate here. – Bgie May 08 '14 at 06:42
  • It was not me who downvoted, but std::vector is not appropriate. It will use 8 times the needed memory. – kaspersky May 08 '14 at 06:45
  • @gg.kaspersky not me downvoted,but it really need more memory. – zzy May 08 '14 at 06:49
  • 1
    @gg.kaspersky vector stores 8 boolean values per byte, so it won't use 8 times the needed memory. (Not that that is necessarily a good thing: http://isocpp.org/blog/2012/11/on-vectorbool ) – Jeremy Friesner May 08 '14 at 06:51
  • @JeremyFriesner, I didn't know, thanks for the link :) – kaspersky May 08 '14 at 06:55
  • @JeremyFriesner: that's true for the data itself on the heap and becomes meaningful as the number of bits gets large, but if you're just storing a few bits then vector's overheads for a pointer to heap, size and capacity members etc. may give much worse than 8x overheads. (There could be an in-object optimisation for small size, but may not be, and the general concern stands). Anyway, this answer gets my +1 for mentioning bitfields. – Tony Delroy May 08 '14 at 06:56
0

use an integer storage (32 bits) where bit represent 1 variable. indeed this makes your code ugly but if you wish to have memory optimization, you have to pay somewhere else.

Accessing each variable's "bit" should be done by bit-wise operations on that integer.

NirMH
  • 4,769
  • 3
  • 44
  • 69