
I have to create the following global matrix

float m[20][2000][1024][200];

that will occupy approximately 30.5 GiB of RAM (20 × 2000 × 1024 × 200 floats at 4 bytes each is 32,768,000,000 bytes), and this is not a problem.

I'm able to do it on GNU/Linux operating systems. How can I do it on Windows? Why does Windows impose this very annoying array dimension limitation?

One solution could be to allocate a single array on the heap and compute the 1D index myself, but I would prefer the compiler to do it.

It is a dense matrix used in a dynamic programming algorithm and for performance purposes I would prefer contiguous memory for caching.

Any idea?

== UPDATE ==

I'd like to post a solution that I eventually found. A possibility is as follows:

float (*m)[2000][1024][200];
m = malloc (20 * sizeof *m);

The memory is contiguous and can be accessed with ordinary matrix indexing, e.g. m[i][j][k][l].
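The snippet above is C; in C++ the `void*` returned by `malloc` would need a cast. A minimal C++ sketch of the same pointer-to-array idea, using a `new` expression instead, might look like the following. The names `Slab` and `make_matrix` and the tiny extents `D1`, `D2`, `D3` are placeholders chosen so the example runs on any machine; the question's real extents are 2000, 1024 and 200.

```cpp
#include <cassert>
#include <cstddef>

// Placeholder extents (assumption): small so the sketch runs anywhere.
// The question uses 2000, 1024 and 200 for these three dimensions.
constexpr std::size_t D1 = 4, D2 = 3, D3 = 2;

// One "slab" is everything behind a single value of the first index.
using Slab = float[D1][D2][D3];

// Arrays have no padding between elements, so each slab is one dense block.
static_assert(sizeof(Slab) == D1 * D2 * D3 * sizeof(float),
              "slab must be densely packed");

// Allocates n0 slabs in one contiguous heap block, zero-initialised.
// Only the first extent may be a run-time value. Release with delete[].
Slab* make_matrix(std::size_t n0) {
    return new Slab[n0]();
}
```

The returned pointer is indexed exactly like the static array, `m[i][j][k][l]`, and the compiler generates the offset arithmetic.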

acco93
  • Why such a big matrix? Is it sparse? – Ed Heal Aug 29 '17 at 15:37
  • If I were you I would get a real matrix library and use that instead. It should use dynamic memory so you can use as much ram as the system allows. – NathanOliver Aug 29 '17 at 15:37
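  • You might use `std::vector<std::vector<std::vector<std::vector<float>>>> m{20, std::vector<std::vector<std::vector<float>>>(2000, std::vector<std::vector<float>>(1024, std::vector<float>(200)))};` as "transparent" replacement (if you don't need contiguous storage). – Jarod42 Aug 29 '17 at 15:46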
  • @Jarod42: in that case I think it is a bad idea. – Basile Starynkevitch Aug 29 '17 at 15:50
  • Statically allocated arrays are limited to `0x7fffffff` bytes (2GB) with the Windows Portable Executable (PE) object file format. On Linux64 I think the limit is `0x7fffffffffffffff` with ELF. – Z boson Aug 30 '17 at 10:12
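To put that limit next to the array in the question, here is a rough compile-time check. It is only a sketch: `kStaticLimit` simply encodes the `0x7fffffff` figure quoted in the comment above, and `bytes_needed` and `fits_in_static_storage` are hypothetical helper names. Assuming a 4-byte `float`, the array needs 32,768,000,000 bytes, far beyond that limit.

```cpp
#include <cassert>
#include <cstdint>

// Limit for statically allocated data as quoted in the comment above
// (taken from the thread, not verified here).
constexpr std::uint64_t kStaticLimit = 0x7fffffffULL;

// Total size in bytes of a dense 4-D array with the given extents.
constexpr std::uint64_t bytes_needed(std::uint64_t d0, std::uint64_t d1,
                                     std::uint64_t d2, std::uint64_t d3,
                                     std::uint64_t elem_size) {
    return d0 * d1 * d2 * d3 * elem_size;
}

// Size of the array from the question: float m[20][2000][1024][200].
constexpr std::uint64_t kMatrixBytes =
    bytes_needed(20, 2000, 1024, 200, sizeof(float));

// True if an object of this size stays within the quoted static limit.
constexpr bool fits_in_static_storage(std::uint64_t bytes) {
    return bytes <= kStaticLimit;
}
```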

1 Answer


One solution could be to allocate a single array and compute the 1d-index but I would prefer if it was the compiler to do it.

That is actually a good idea. Of course, the array data should be allocated on the heap (using operator new in C++, or malloc or calloc in C), and computing the offset from the various indexes is easy.

You would wrap that array in an abstract data type; in C++ you might define your own class MyMatrix (but follow the rule of five), possibly built on top of existing containers.
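As a rough illustration of such a class, here is a minimal sketch. The name `Matrix4D` and its interface are invented for this example, and a row-major layout is assumed. Because the storage is a `std::vector<float>`, the compiler-generated copy and move operations are already correct, so no special member functions have to be written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-D matrix backed by one contiguous heap block (row-major).
class Matrix4D {
public:
    Matrix4D(std::size_t d0, std::size_t d1, std::size_t d2, std::size_t d3)
        : d0_(d0), d1_(d1), d2_(d2), d3_(d3), data_(d0 * d1 * d2 * d3) {}

    // Row-major offset of element (i, j, k, l) in the flat storage.
    std::size_t offset(std::size_t i, std::size_t j,
                       std::size_t k, std::size_t l) const {
        assert(i < d0_ && j < d1_ && k < d2_ && l < d3_);
        return ((i * d1_ + j) * d2_ + k) * d3_ + l;
    }

    // Element access, written m(i, j, k, l) instead of m[i][j][k][l].
    float& operator()(std::size_t i, std::size_t j,
                      std::size_t k, std::size_t l) {
        return data_[offset(i, j, k, l)];
    }
    float operator()(std::size_t i, std::size_t j,
                     std::size_t k, std::size_t l) const {
        return data_[offset(i, j, k, l)];
    }

    // Total number of elements.
    std::size_t size() const { return data_.size(); }

private:
    std::size_t d0_, d1_, d2_, d3_;
    std::vector<float> data_;  // zero-initialised by the constructor
};
```

With the question's extents this would be constructed as `Matrix4D m(20, 2000, 1024, 200);`. In a 64-bit build `std::size_t` is 64 bits wide, so the element count of about 8.2 billion does not overflow.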

You should probably find a good existing matrix library. Some of them might even have optimizations that take advantage of specific hardware (e.g. OpenMP or OpenCL based).

See also this answer for a C approach.

for performance purposes I would prefer contiguous memory for caching.

Such caching considerations only matter for the innermost loops of your computation. See also this.

Basile Starynkevitch
  • It would be interesting to answer this "I'm able to do it in GNU\Linux operating systems. How can I do it in Windows? Why does Windows impose this very annoying array dimension limitation?" Is there a limit to the size of statically allocated memory with Windows compared to Linux? That's an interesting question. – Z boson Aug 30 '17 at 08:54
  • I have no idea about Windows. I never used it. I have been using Linux since 1993 (both at home and at work). And I don't know if it is a limitation of the Windows OS, or of the particular compiler the OP is using on it. – Basile Starynkevitch Aug 30 '17 at 08:55
  • What about Linux? The memory is defined in the BSS section of the object file. It's not allocated on the heap. Is this section limited in size? I thought maybe ELF limited it to 2 GB or something? – Z boson Aug 30 '17 at 08:57
  • OP should not allocate such big data in BSS. It should be heap allocated. – Basile Starynkevitch Aug 30 '17 at 08:58
  • I agree it should be done on the heap but I still think it's an interesting question about possible limitations allocating in BSS. BSS allows you to use addressing modes (on Linux) that are not possible with the heap. – Z boson Aug 30 '17 at 08:59
  • With static arrays you can do `[absolute 32-bit address + index]`. This is not possible with the heap. But then I think this limits the size of the static array. – Z boson Aug 30 '17 at 09:02
  • Since OP wants a 26 GB array, he has to use a 64-bit processor with a 64-bit virtual address space on a fairly big computer (big desktop, server or supercomputer). – Basile Starynkevitch Aug 30 '17 at 09:06
  • `[absolute 32-bit address + index]` is actually how GCC and Intel address static arrays on 64-bit Linux. See the section "Addressing static arrays in 64 bit mode" in Agner's manual http://www.agner.org/optimize/optimizing_assembly.pdf So I'm still wondering what the size limits might be. I found this https://stackoverflow.com/a/18372236/2542702 but it does not give any hard numbers. – Z boson Aug 30 '17 at 09:13
  • According to this static arrays on Windows are limited to 2 GB https://software.intel.com/en-us/articles/memory-limits-applications-windows. I don't know what limit there is on Linux. – Z boson Aug 30 '17 at 09:17