I need to create a very large array of bits/boolean values. How would I do this in C/C++?

Question

Is it even possible to create an array of bits with more than 100000000 elements? If it is, how would I go about doing this? I know that for a char array I can do this:

char* array;

array = (char*)malloc(100000000 * sizeof(char));

If I was to declare the array by char array[100000000] then I would get a segmentation fault, since the maximum number of elements has been exceeded, which is why I use malloc.

Is there something similar I can do for an array of bits?

Under what circumstances would you actually need that many bits in a single location? — Dennis Zickefoose, Mar 27 '10 at 18:04
Exact duplicate - same user - yesterday: http://stackoverflow.com/questions/2525310 — Paul R, Mar 27 '10 at 19:45
Sorry if the question seems similar, but it is in fact NOT exactly the same question. Previously I asked how to define an array of bits in general. Now I'm asking how to define one that is very large (i.e, more elements than is usually possible in a standard way of defining the array). — Eddy, Mar 28 '10 at 13:34
You're probably trying to create that on the stack. If you made a global that size it would probably work. If you need to make a data structure larger than will fit in size_t for your target you'll need to investigate storing parts of your data on secondary storage (in files). Depending on your usage pattern(s) there are several file structures that may work well for you. Look up "b tree". — nategoose, Mar 29 '10 at 01:55

score 12 · Accepted Answer · edited Oct 28 '12 at 15:56

12

If you are using C++, std::vector<bool> is specialized to pack elements into a bit map. Of course, if you are using C++, you need to stop using malloc.

edited Oct 28 '12 at 15:56

answered Mar 27 '10 at 18:00

Dennis Zickefoose

10,791
3
29
38

2

When using `vector` be aware that it does not staisfy the container concept of the STL, which may cause problems when using STL-algorithms. See http://www.sgi.com/tech/stl/bit_vector.html – Björn Pollex Mar 27 '10 at 18:05
I had forgotten std::vector was specialized. Good point. – cpalmer Mar 27 '10 at 18:06
What do you mean by specialized? – Eddy Mar 28 '10 at 14:05
@Eddy: I'd have to explain what a template is first. Have you used templates before? – Ben Voigt Mar 28 '10 at 17:39
@Eddy: Specialized (in this case) just means that for the datatype bool, vector doesn't use the generic vector implementation. This that would waste anywhere from 7 to 63 bits of data per entry on common architectures with common compilers. vector is implemented the same way I told you to do this the other day when you asked how to do it in C. – nategoose Mar 29 '10 at 02:00

score 8 · Answer 2 · answered Mar 27 '10 at 18:03

8

You could try looking at boost::dynamic_bitset. Then you could do something like the following (taken from Boost's example page):

boost::dynamic_bitset<> x(100000000); // all 0's by default
x[0] = 1;
x[1] = 1;
x[4] = 1;

The bitset will use a single bit for each element so you can store 32 items in the space of 4 bytes, decreasing the amount of memory required considerably.

answered Mar 27 '10 at 18:03

cpalmer

1,107
6
10

Will this work? Aren't most C++ compilers going to try to allocate x on a stack, which won't have room for such a large object? The stack is typically only 1MB or so. I think you'll probably have to use new to allocate it from the heap. – Charles E. Grant Mar 27 '10 at 18:11
3

@Charles: `x` will be on the stack, but the memory to store the bits will not be. Like `vector`, `boost::dynamic_bitset` takes an allocator template parameter, and by default allocates with `new`. There is no way in standard C++ to put anything on the stack whose size isn't known at compile time. – Steve Jessop Mar 27 '10 at 18:32
x is on the stack, but that doesn't mean x's members are. – Dennis Zickefoose Mar 27 '10 at 18:32
1

@Dennis: x's members are also on the stack... but this doesn't mean some of the members couldn't be pointers to dynamically allocated memory. – UncleBens Mar 27 '10 at 21:29

score 5 · Answer 3 · answered Mar 27 '10 at 18:00

In C and C++, char is the smallest type. You can't directly declare an array of bits. However, since an array of any basic type is fundamentally made of bits, you can emulate them, something like this (code untested):

unsigned *array;
array = (unsigned *) malloc(100000000 / sizeof(unsigned) + 1);

/* Retrieves the value in bit i */
#define GET_BIT(array, i) (array[i / sizeof(unsigned)] & (1 << (i % sizeof(unsigned))))

/* Sets bit i to true*/
#define SET_BIT(array, i) (array[i / sizeof(unsigned)] |= (1 << (i % sizeof(unsigned))))

/* Sets bit i to false */
#define CLEAR_BIT(array, i) (array[i / sizeof(unsigned)] &= ~(1 << (i % sizeof(unsigned))))

score 4 · Answer 4 · answered Mar 27 '10 at 18:04

The segmentation fault you noticed is due to running out of stack space. Of course you can't declare a local variable that is 12.5 MB in size (100 million bits), let alone 100MB in size (100 million bytes) in a thread with a stack of ~ 4 MB. Should work as a global variable, although then you may end up with a 12 or 100 MB executable file -- still not a good idea. Dynamic allocation is definitely the right thing to do for large buffers like that.

score 3 · Answer 5 · edited Oct 28 '12 at 15:55

3

If it is allowed to use STL, then I would use std::bitset.

(For 100,000,000 bits, it would use 100000000 / 32 unsigned int underneath, each storing 32 bits.)

std::vector<bool>, already mentioned, is another good solution.

edited Oct 28 '12 at 15:55

answered Mar 27 '10 at 18:55

Arun

19,750
10
51
60

vincentleest · Answer 6 · 2014-04-21T01:55:05.250

1

There are a few approaches to creating a bitmap in C++.

If you already know the size of bitmap at compile time, you can use the STL, std::bitset template.

This is how you would do it with bitset std::bitset<100000000> array

Otherwise, if the size of the bitmap changes dynamically during runtime, you can use std::vector<bool> or boost::dynamic_bitset as recommended here http://en.cppreference.com/w/cpp/utility/bitset (See note at the bottom)

edited Apr 21 '14 at 01:55

answered Apr 21 '14 at 01:20

vincentleest

925
1
8
18

Your answer doesn't contain any code and the link has no context. Please see [http://stackoverflow.com/help/how-to-answer](http://stackoverflow.com/help/how-to-answer). – A.L Apr 21 '14 at 01:46

score 0 · Answer 7 · answered Mar 27 '10 at 18:00

0

Yes but it's going to be a little bit more complicated !

The better way to store bits is to use the bits into the char itself !

So you can store 8 bits in a char !

Which will "only" require 12'500'000 octets !

Here is some documentation about binaries : http://www.somacon.com/p125.php

You should look on google :)

answered Mar 27 '10 at 18:00

Nicolas Guillaume

8,160
6
35
44

6

To you, everything is exciting! :) – John Dibling Mar 27 '10 at 18:11

score 0 · Answer 8 · answered Mar 27 '10 at 18:08

I need to create a very large array of bits/boolean values. How would I do this in C/C++?

9 Answers9