1

How much memory can NASM allocate on a 64-bit system (Windows 7)? It seems FASM can only allocate about 1.5 GB of memory, so I'm searching for a more capable assembler.

DSblizzard
  • 4,007
  • 7
  • 48
  • 76
  • 1
    What are you writing that requires the *assembler* to need a gig and a half? – paxdiablo Nov 16 '19 at 09:33
  • NP-complete problems and natural language processing easily can eat terabytes of RAM. – DSblizzard Nov 16 '19 at 09:34
  • 1
    Yes, but that's something you need to do at runtime, not assembly time. Surely that's limited only by the OS API calls you use, no? – paxdiablo Nov 16 '19 at 09:38
  • Indeed. I used a static array, but overlooked this simple solution. Thank you, this solved my problem, so you can turn the comment into an answer if you want. – DSblizzard Nov 16 '19 at 09:51
  • 4
    Unless you allocate things dynamically at run time, you'll run into the 2GB limit of the PECOFF executable format that Windows uses regardless of the assembler you use. – Ross Ridge Nov 16 '19 at 16:05

2 Answers

5

I'm having a hard time figuring out what you could be doing that would require that much memory at assembly time. I can only assume you're allocating a large chunk of initialised data in your code.

If it's something that's only needed at runtime, the OS you're using should provide some form of dynamic allocation which won't affect your code/data size.

An example is the Windows HeapAlloc function; you just have to ensure you follow the calling convention when calling it from assembly.
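
For example, here's a minimal sketch (mine, not part of the answer) of calling it from NASM on 64-bit Windows. It assumes you link against kernel32, and it follows the Win64 convention: 16-byte stack alignment and 32 bytes of shadow space before each call.

; Hedged sketch: reserving 4 GiB at run time with HeapAlloc.
default rel
extern GetProcessHeap, HeapAlloc

section .text
global alloc_big
alloc_big:
  sub   rsp, 40                ; 32 bytes shadow space + realign RSP to 16
  call  GetProcessHeap         ; default process heap handle -> RAX
  mov   rcx, rax               ; hHeap
  xor   edx, edx               ; dwFlags = 0 (8 would be HEAP_ZERO_MEMORY)
  mov   r8, 4*1024*1024*1024   ; dwBytes = 4 GiB
  call  HeapAlloc              ; RAX = pointer, or NULL on failure
  add   rsp, 40
  ret

HeapAlloc returns NULL on failure; for multi-gigabyte blocks you might prefer VirtualAlloc instead, but the calling pattern from assembly is the same.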

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
2

NASM has no problem with gigantic arrays in the BSS (zero-initialized memory that doesn't take space in the executable itself).

Dynamic allocation isn't better than static allocation for a tiny program, but keep in mind that unless you put this array at the end of the BSS, you won't be able to use RIP-relative addressing for other variables that end up after it; they'll be more than ±2 GiB away from your code.
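
A hedged sketch of that layout (my own symbol names, not from the answer):

section .bss
counter:  resq 1                 ; still reachable with [rel counter]
scratch:  resb 4096              ; ditto
big:      resb 1024*1024*1024    ; the huge array goes last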

I think the BSS can still use transparent hugepages on Linux the same way dynamic allocation (mmap / VirtualAlloc) can, but this is something to double-check. You definitely want that for huge arrays.

There can be a small efficiency boost from having the base of your array at a statically-known address: it allows addressing modes like [array + rdi] instead of [rsi + rdi], which ties up another register. Indexed addressing modes also defeat micro-fusion in some cases on Sandybridge-family CPUs (including almost all AVX ALU+load instructions that can micro-fuse a load in the first place, on Haswell/Skylake). See Micro fusion and addressing modes.
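
A rough illustration of the two forms (my sketch; it assumes a non-PIE build where the address of the hypothetical `table` fits in a sign-extended 32-bit displacement):

section .bss
table:  resd 1000000

section .text
  ; base folded into the addressing mode: only RDI is needed for the index
  mov   eax, [table + rdi*4]
  ; run-time base pointer in RSI: the indexed mode ties up a second register
  mov   eax, [rsi + rdi*4]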


This does seem to be a problem for FASM, which I ran into recently (probably when looking at one of your earlier questions). According to strace output, x86-64 Linux FASM insists on using mmap(MAP_32BIT), and it did seem to fail when trying to use more than about 1 GiB.

It also appears that FASM wants to map the executable's layout into memory, including the BSS, making it impossible to use arrays bigger than the virtual address space it has available at assemble time.

Test case:


global _start
section .text
_start:
  lea  rdi, [rel big]
  mov  eax, 231
  syscall                 ; exit_group(low byte of address)

section .bss
big: resb 1024*1024*1024*40      ; 40GiB

NASM + ld assemble this without complaint on GNU/Linux. But readelf -a on the resulting static executable says the "MemSiz" of the BSS segment is only 0x0000000100000000 (4 GiB), not 40 GiB.

This might be NASM's fault: readelf -a on the .o produced by NASM shows

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
    ...
  [ 6] .bss              NOBITS           0000000000000000  00000000
       00000000ffffffff  0000000000000000  WA       0     0     4
    ...

That's UINT32_MAX for the BSS size, much smaller than 40 GiB.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847