I'm trying to define a matrix in C with 100,000 columns and 100,000 lines of boolean type (bool matrix[100000][100000]), how I do that?
-
8`static bool matix[100000][100000];` – BLUEPIXY Jun 01 '15 at 13:32
-
2And you intend to use every element? – Joe Jun 01 '15 at 13:32
-
How much memory do you have? – unwind Jun 01 '15 at 13:32
-
1Note that this will take more than a GB of memory – aochagavia Jun 01 '15 at 13:32
-
2Assuming one byte per `bool`, this will occupy around 9.3 GB of memory, which is quite a lot. You can't expect to put that much data on the stack (where `auto` variables typically live), which is why `static` helps. – unwind Jun 01 '15 at 13:35
-
@aochagavia That's true, but not very precise. :) – unwind Jun 01 '15 at 13:36
-
1We'd really need to know *why*. Something that big could have performance issues as well as pure storage problems. – Joe Jun 01 '15 at 13:36
-
1one `byte` per `bool` is madness. You have (usually) 8 `bits` per-`byte` which allows for 8 booleans in one single `byte`. – Eregrith Jun 01 '15 at 13:37
-
3Many applications that use such large data structures typically only use a fraction of the elements, with the rest remaining set to zero. If this is the case for you, I strongly recommend you look into [sparse matrix](https://en.wikipedia.org/wiki/Sparse_matrix) techniques. – rojomoke Jun 01 '15 at 13:39
-
1Brute force, use a machine with lots of gigs of memory and 64bit architecture, and then do as BLUPEPIXY suggests. Where resources are limited, consider a memory mapped file. Performance can be an issue. Usually very large matrices are implemented as sparce matrix – Les Jun 01 '15 at 13:42
3 Answers
Dynamic allocation is needed here. So you need 100000*100000 = 10 billion bits. That's a lot.
10,000,000,000 bits == 1,250,000,000 bytes == 1.164 gigabytes.
So a bit over a gigabyte.. It isn't impossible, but you have to be careful. The best way to do it would be to allocate it in a single array, like this:
uint8_t *matrix = malloc(1250000000);
if (matrix == NULL) // This will very likely fail, so be sure to handle it..
return ENOMEM;
You now only have a single array. To access an element [i][j]
, you'd do i*10000+j
. However, since we have allocated an array of uint8_t
, you need to extract the bits you want.
To do that, you could have accessor macros, like these:
// You probably want to add matrix as a param for these macros
#define index(i, j) (i*10000+j)
#define elem(i,j) (matrix[index(i,j)/8])
#define get(i, j) (elem(i,j) & (1 << index(i,j)%8))
#define set(i,j,v) (v ? elem(i,j) |= (1 << index(i,j)%8) : elem(i,j) &= ~(1 << index(i,j)%8))
set(5000, 5000, TRUE);
if (get(5000, 5000))
doStuff();

- 2,872
- 11
- 26
-
@undefinedbehaviour 100k row and 100k columns is 10B total. Since he only uses bools, that is one bit per entry. 10B/8 is 1.25B. If we allocated a full byte for every bool, it would be almost 10 gigabytes, which is too much. This way it is just a bit over a gigabyte. – mtijanic Jun 01 '15 at 13:48
-
@undefinedbehaviour Technically, I used `uint8_t` which is 8. :P I only assume that a *byte* is 8 bits, when I did the calculations. – mtijanic Jun 01 '15 at 14:11
-
-
I just noticed an error in your `elem` macro... You're missing a `]`. – autistic Jun 01 '15 at 14:16
-
On a technical note, a definition is quite different to a declaration. I figure you're asking for a type synonym.
typedef bool matrix[100000][100000];
The type matrix
is now defined as a type synonym for bool[100000][100000]
.
matrix *m = malloc(sizeof *m);
m
is declared to be a pointer to matrix
, and initialised to the return value of a suitable malloc
call.
static matrix s;
s
is declared to be a matrix
with static storage duration, as suggested within the comments.
This is really quite inefficient. If you don't plan on using all of those elements, perhaps you could try using an ordered map of some kind. I've written one you could use, called a PATRICIA trie, which you can find here. If you do plan on using all of those elements, then you could save a lot of memory (and time, in the form of cache misses) by making use of all of the bits within the array, rather than typically 1/8th or worse of them, the waste being in the form of padding. This would require some slightly more complex logic.
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef unsigned char matrix[100000][100000 / CHAR_BIT + (100000 % CHAR_BIT > 0)];
int matrix_get(matrix *m, size_t x, size_t y) {
return ((*m)[x][y / CHAR_BIT] >> (y % CHAR_BIT)) & 1;
}
void matrix_set(matrix *m, size_t x, size_t y, int value) {
(*m)[x][y / CHAR_BIT] &= ~(1U << (y % CHAR_BIT));
(*m)[x][y / CHAR_BIT] |= value << (y % CHAR_BIT) ;
}
int main(void) {
matrix *m = malloc(sizeof *m);
memset(m, 0, sizeof *m);
matrix_set(m, 0x1337, 0xC0DE, 1);
printf("%d\n", matrix_get(m, 0x1337, 0xC0DE));
}

- 1
- 3
- 35
- 80
If you are willing to do some bit masking, you can allocate int matrix[100000][1000000/(sizeof(int)*CHAR_BIT)+1]
words. Your program will be slower, but you would allocate 1/8th the amount of memory. See How do you set, clear, and toggle a single bit? for bit masking.
If the size of memory is too large for the stack or heap, you might consider memory mapped IO.

- 1
- 1

- 3,266
- 1
- 20
- 30
-
-
-
That's a bit better... Now, supposing `sizeof(int)` is some odd number like 3, you might be missing a few bits... – autistic Jun 01 '15 at 14:31
-
Changed answer based on undefined behavior's comment. 8 bits per byte. – Robert Jacobs Jun 01 '15 at 14:31
-
`100000/3 == 33333`... `333333/8 == 4166`... `4166+1 == 4167`... So there are 4167 `int`s in each row/column/whatever you want to call it. If we multiply that by 8, then by 3 again we should reach *at least* `100000`, but alas, you're not quite there yet :( – autistic Jun 01 '15 at 14:37