Large matrix creation in C++

Question

I have to create the following global matrix

float m[20][2000][1024][200];

that will occupy approximately 26 GB of RAM and this is not a problem.

I'm able to do it in GNU/Linux operating systems. How can I do it in Windows? Why does Windows impose this very annoying array dimension limitation?

One solution could be to allocate a single array on the heap and compute the 1D index, but I would prefer the compiler to do it.

It is a dense matrix used in a dynamic programming algorithm and for performance purposes I would prefer contiguous memory for caching.

Any idea?

== UPDATE ==

I'd like to post a solution that I eventually found. A possibility is as follows:

float (*m)[2000][1024][200];
/* needs <stdlib.h>; the cast is required in C++ and optional in C */
m = (float (*)[2000][1024][200]) malloc(20 * sizeof *m);

The memory is contiguous and can be accessed with ordinary matrix syntax, e.g. m[i][j][k][l].
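
In C++ the same layout can also be obtained without malloc. The following is only a sketch of one possible variant, not the code from the update: it relies on std::make_unique with an array type whose inner extents are compile-time constants. The names D0..D3, Slab, make_table and table_is_contiguous are made up for the example, and the extents are deliberately tiny so that it runs anywhere (the question's sizes would be 20, 2000, 1024 and 200).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical small extents for illustration; the question uses 20, 2000, 1024, 200.
constexpr std::size_t D0 = 4, D1 = 5, D2 = 6, D3 = 7;

// One "slab" is everything behind the first index.
using Slab = float[D1][D2][D3];

// Arrays have no padding between elements, so a slab is exactly this big.
static_assert(sizeof(Slab) == D1 * D2 * D3 * sizeof(float), "unexpected padding");

// Allocates D0 slabs in one contiguous heap block, zero-initialized.
inline std::unique_ptr<Slab[]> make_table() {
    return std::make_unique<Slab[]>(D0);
}

// Usage sketch: index with m[i][j][k][l] and check where the element lives.
inline bool table_is_contiguous() {
    auto m = make_table();
    m[1][2][3][4] = 42.0f;

    // Row-major layout: expected flat position of element (1, 2, 3, 4).
    const std::size_t offset = ((1 * D1 + 2) * D2 + 3) * D3 + 4;

    // Compare byte addresses inside the single allocation.
    const char* first = reinterpret_cast<const char*>(&m[0][0][0][0]);
    const char* elem  = reinterpret_cast<const char*>(&m[1][2][3][4]);
    return static_cast<std::size_t>(elem - first) == offset * sizeof(float);
}
```

The unique_ptr releases the block automatically, which the malloc version leaves to an explicit free.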

If I were you I would get a real matrix library and use that instead. It should use dynamic memory so you can use as much RAM as the system allows. – NathanOliver, Aug 29 '17 at 15:37
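You might use `std::vector<std::vector<std::vector<std::vector<float>>>> m{20, std::vector<std::vector<std::vector<float>>>(2000, std::vector<std::vector<float>>(1024, std::vector<float>(200)))};` as "transparent" replacement (if you don't need contiguous storage). – Jarod42, Aug 29 '17 at 15:46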
Statically allocated arrays are limited to `0x7fffffff` bytes (2GB) with the Windows Portable Executable (PE) object file format. On Linux64 I think the limit is `0x7fffffffffffffff` with ELF. — Z boson, Aug 30 '17 at 10:12

Basile Starynkevitch · Accepted Answer · 2017-08-29T16:04:32.860

4

"One solution could be to allocate a single array and compute the 1D index, but I would prefer the compiler to do it."

That is actually a good idea. Of course, the array data should be allocated on the heap (using operator new in C++, or malloc or calloc in C), and computing the offset from the various indexes is easy.
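
For a row-major layout (the one built-in arrays use) the offset computation is a one-liner. A minimal sketch, assuming hypothetical names (Dims, flat_index, store_and_load) and small demo extents:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Extents of the four dimensions (hypothetical helper type).
struct Dims {
    std::size_t d0, d1, d2, d3;
    constexpr std::size_t count() const { return d0 * d1 * d2 * d3; }
};

// Row-major offset of element (i, j, k, l): the last index varies fastest,
// exactly as in a built-in float m[d0][d1][d2][d3].
constexpr std::size_t flat_index(const Dims& d, std::size_t i, std::size_t j,
                                 std::size_t k, std::size_t l) {
    return ((i * d.d1 + j) * d.d2 + k) * d.d3 + l;
}

// Usage sketch: one heap allocation, indexed through flat_index.
inline float store_and_load() {
    constexpr Dims d{3, 4, 5, 6};  // the question would use {20, 2000, 1024, 200}
    std::unique_ptr<float[]> data(new float[d.count()]());  // zero-initialized
    data[flat_index(d, 2, 3, 4, 5)] = 1.5f;  // highest valid index in every dimension
    return data[d.count() - 1];              // same element, addressed as the last float
}
```

Using std::size_t for the arithmetic matters here: the question's array has 8,192,000,000 elements, which does not fit in a 32-bit integer.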

You would wrap that array in an abstract data type: in C++ you might define your own class MyMatrix (following the rule of five), and you might build it on top of existing containers.

You probably should find a good existing matrix library. Some of them might even have optimizations that take advantage of specific hardware (e.g. based on OpenMP or OpenCL).
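
One possible shape for such a class, as a sketch only. Because it stores its elements in a std::vector, the compiler-generated copy, move and destructor are already correct, so none of the five special members has to be written by hand. The interface shown (operator() and size()) is an assumption made for the example, not something this answer prescribes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

class MyMatrix {
public:
    // Allocates d0*d1*d2*d3 floats in one contiguous block, all 0.0f.
    MyMatrix(std::size_t d0, std::size_t d1, std::size_t d2, std::size_t d3)
        : dims_{{d0, d1, d2, d3}}, data_(d0 * d1 * d2 * d3) {}

    // Element access: m(i, j, k, l) instead of m[i][j][k][l].
    float& operator()(std::size_t i, std::size_t j, std::size_t k, std::size_t l) {
        return data_[index(i, j, k, l)];
    }
    float operator()(std::size_t i, std::size_t j, std::size_t k, std::size_t l) const {
        return data_[index(i, j, k, l)];
    }

    std::size_t size() const { return data_.size(); }

private:
    // Row-major offset; bounds are checked only in debug builds.
    std::size_t index(std::size_t i, std::size_t j, std::size_t k, std::size_t l) const {
        assert(i < dims_[0] && j < dims_[1] && k < dims_[2] && l < dims_[3]);
        return ((i * dims_[1] + j) * dims_[2] + k) * dims_[3] + l;
    }

    std::array<std::size_t, 4> dims_;
    std::vector<float> data_;
};
```

With this, `MyMatrix m(20, 2000, 1024, 200);` followed by `m(i, j, k, l) = value;` would replace the global array, provided the machine has enough memory for the allocation.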

See also this answer for a C approach.

"for performance purposes I would prefer contiguous memory for caching."

Such caching considerations only matter for the innermost loops of your computation. See also this.
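
For a row-major array this means the innermost loop should run over the last index, so that consecutive iterations touch adjacent floats. A small sketch with hypothetical extents (N0..N3) and a function name (sum_row_major) invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical small extents; the data is stored flat, in row-major order.
constexpr std::size_t N0 = 3, N1 = 4, N2 = 5, N3 = 6;

// Sums every element. The loop nest follows the index order, so the
// innermost loop (over l) walks memory with stride 1.
inline double sum_row_major(const std::vector<float>& data) {
    assert(data.size() == N0 * N1 * N2 * N3);
    double total = 0.0;
    for (std::size_t i = 0; i < N0; ++i)
        for (std::size_t j = 0; j < N1; ++j)
            for (std::size_t k = 0; k < N2; ++k) {
                // Start of the contiguous run of N3 floats for this (i, j, k).
                const float* row = data.data() + ((i * N1 + j) * N2 + k) * N3;
                for (std::size_t l = 0; l < N3; ++l)
                    total += row[l];
            }
    return total;
}
```

If the loops were reordered so that i became the innermost index, each iteration would jump N1 * N2 * N3 floats ahead, which is the access pattern to avoid in hot loops.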

edited Aug 29 '17 at 16:04

answered Aug 29 '17 at 15:37

Basile Starynkevitch



It would be interesting to answer this: "I'm able to do it in GNU/Linux operating systems. How can I do it in Windows? Why does Windows impose this very annoying array dimension limitation?" Is there a limit on the size of statically allocated memory on Windows compared to Linux? That's an interesting question. – Z boson Aug 30 '17 at 08:54
I have no idea about Windows. I never used it. I have been using Linux since 1993 (both at home and at work). And I don't know if it is a limitation of the Windows OS, or of the particular compiler the OP is using on it. – Basile Starynkevitch Aug 30 '17 at 08:55
What about Linux? The memory is defined in the BSS section of the object file. It's not allocated on the heap. Are these sections limited in size? I thought maybe ELF limited them to 2 GB or something. – Z boson Aug 30 '17 at 08:57
The OP should not allocate such big data in BSS. It should be heap allocated. – Basile Starynkevitch Aug 30 '17 at 08:58
I agree it should be done on the heap but I still think it's an interesting question about possible limitations allocating in BSS. BSS allows you to use addressing modes (on Linux) that are not possible with the heap. – Z boson Aug 30 '17 at 08:59
With static arrays you can do `[absolute 32-bit address + index]`. This is not possible with the heap. But then I think this limits the size of the static array. – Z boson Aug 30 '17 at 09:02
Since the OP wants 26 GB of data, he has to use a 64-bit processor with a 64-bit virtual address space, on a fairly big computer (big desktop, server or supercomputer). – Basile Starynkevitch Aug 30 '17 at 09:06
`[absolute 32-bit address + index]` is actually how GCC and Intel address static arrays on 64-bit Linux. See the section "Addressing static arrays in 64 bit mode" in Agner's manual http://www.agner.org/optimize/optimizing_assembly.pdf So I'm still wondering what the size limits might be. I found this https://stackoverflow.com/a/18372236/2542702 but it does not give any hard numbers. – Z boson Aug 30 '17 at 09:13
According to https://software.intel.com/en-us/articles/memory-limits-applications-windows static arrays on Windows are limited to 2 GB. I don't know what limit there is on Linux. – Z boson Aug 30 '17 at 09:17
