-1

I built a C project with the x86 (32-bit) configuration. When I set the stack size to more than about 3 GB, the program malfunctions or produces incorrect results.

What is the maximum usable memory size in the x86 configuration?

(The development environment has enough memory: 64-bit Windows, 16 GB RAM.)

Marco Bonelli
  • 1
    According to [WOW64 documentation](https://learn.microsoft.com/en-us/windows/win32/winprog64/memory-management): "x64 WOW64 supports a 4 GB virtual address space". In particular, 64-bit Windows running a 32-bit application should not require precious virtual address space for kernel structures, since the kernel runs in 64-bit mode through 64-bit thunks. – nanofarad Apr 24 '20 at 04:18
  • *Development Environment has enough memory; Windows 64bit, 16GB RAM* -- That extra RAM does nothing if your application is 32-bit. A 64-bit app can use all of that memory. – PaulMcKenzie Apr 24 '20 at 04:34
  • 1
    @PaulMcKenzie: Pretty sure the OP's point is just that there is enough physical memory to back the full 4GiB of a 32-bit virtual address space that a 32-bit process could even possibly use, so that wouldn't be a concern. The question's not very clear about what compiler or what target OS they're compiling for. Presumably also Windows. – Peter Cordes Apr 24 '20 at 04:36
  • If you actually need a 3GB stack there is something wrong anyway. – Jabberwocky Apr 24 '20 at 06:50
  • 2
    @PeterCordes to the best of my understanding, all of the ntdll system calls on wow64 involve a call to 32-bit ntdll, a transition into long mode into the code of the 64-bit ntdll, and then a system call. [source](https://medium.com/@fsx30/hooking-heavens-gate-a-wow64-hooking-technique-5235e1aeed73) – nanofarad Apr 24 '20 at 13:36
  • Note also that the stack size applies to every thread. So two threads will use 6GB stack, etc. Setting the stack to 3GB means you'll never be able to create a second thread, which is a problem since the system is probably going to want to create threads. – Raymond Chen Apr 24 '20 at 18:53
  • You need to tell us why you need a 3GB stack. This is really strange. Or are you mixing up stack, memory, dynamic allocation, or something else? Please [edit] your question and clarify. Also read about the [XY Problem](http://xyproblem.info/) – Jabberwocky Apr 24 '20 at 19:14
  • 1
    @nanofarad: Thanks, I hadn't realized Windows used that clunky (thunky?) design. Linux has the 32-bit ABI directly supported by the kernel so 32-bit user-space can use `int 0x80` or `sysenter` directly, without having to do a slow far jmp in user-space first. Which is ironic because 32-bit code is much more widely used on Windows than Linux. (So I expected a full-efficiency way for 32-bit code to make WinAPI system calls.) – Peter Cordes Apr 24 '20 at 19:54
  • Thanks a lot, I now clearly understand the limitation on this system. I actually use several threads, and each needs around 500MB of stack to run (I can't change this design because it came from another developer...). In this case up to 6 threads are available. – yangcooler Apr 27 '20 at 08:11
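
The per-thread arithmetic from the comments above can be sketched in a few lines of C. This is only a rough upper bound: it assumes binary units (MiB, GiB) for the figures quoted in the comments, and it ignores the address space taken by the executable, DLLs, and heap. The helper name `max_thread_stacks` is made up for illustration.

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)
#define GIB (1024ULL * 1024ULL * 1024ULL)

/* Upper bound on how many thread stacks of stack_bytes each fit into
 * usable_bytes of address space. Returns 0 when stack_bytes is 0.
 * Hypothetical helper; real limits are lower because other mappings
 * (code, DLLs, heap) share the same address space. */
uint64_t max_thread_stacks(uint64_t usable_bytes, uint64_t stack_bytes)
{
    if (stack_bytes == 0)
        return 0;
    return usable_bytes / stack_bytes;
}
```

For example, `max_thread_stacks(3 * GIB, 500 * MIB)` gives 6, which matches the figure mentioned above, while a single 3 GiB stack leaves no room for a second one.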

2 Answers

1

Under a 64-bit kernel, a 32-bit process can use the entire 4GiB virtual address space, minus overhead, if it's built as a "large-address aware" executable. Otherwise only 2GiB.

This is not on by default, for compatibility with code that makes unsafe assumptions. The developer must pass a flag to the linker when building the executable: /LARGEADDRESSAWARE with MSVC, or --large-address-aware with MinGW (when linking through the gcc driver, -Wl,--large-address-aware).

When a large-address-aware program is running on a 32-bit edition of Windows with the /3GB switch enabled, it will be able to map at most 3 GiB of virtual address space (the remaining high 1 GiB is reserved for the kernel). However, on a 64-bit Windows system, it should be able to map all 4 GiB, less some overhead.

The size of one single contiguous allocation will be limited by where the main executable and DLLs get mapped into memory (and the stack and any other random allocations), because of course it has to go between any pages that are already in use.
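
To illustrate why existing mappings cap the size of a single contiguous allocation, here is a minimal sketch that finds the largest free span between already-used regions. It is a toy model, not a Windows API: `struct region`, `largest_free_gap`, and the example addresses are all hypothetical, and the regions are assumed to be sorted by start address and non-overlapping.

```c
#include <stddef.h>
#include <stdint.h>

/* A mapped region [start, start + size). Hypothetical model type. */
struct region {
    uint64_t start;
    uint64_t size;
};

/* Largest contiguous free span in [0, limit), given `count` used regions
 * sorted by start address and not overlapping each other. */
uint64_t largest_free_gap(const struct region *used, size_t count, uint64_t limit)
{
    uint64_t best = 0;
    uint64_t cursor = 0; /* end of the mapped area seen so far */

    for (size_t i = 0; i < count; i++) {
        if (used[i].start >= limit)
            break;
        /* Free space between the previous region and this one. */
        if (used[i].start > cursor && used[i].start - cursor > best)
            best = used[i].start - cursor;
        uint64_t end = used[i].start + used[i].size;
        if (end > cursor)
            cursor = end;
    }
    /* Free space between the last region and the top of the range. */
    if (limit > cursor && limit - cursor > best)
        best = limit - cursor;
    return best;
}
```

With an executable mapped low and a DLL mapped near the 2 GiB mark, the largest gap is noticeably smaller than the total free space, which is why one huge allocation can fail even when the sum of the free pieces would be enough.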


Without large-address-aware, a 32-bit program running on Windows (whether 32-bit or 64-bit) will only have 2 GB of virtual address space by default, no matter how much virtual address space Windows has available after subtracting kernel address usage and miscellaneous overhead.

In particular, your program will never receive a user-mode virtual address mapping above the 2 GB mark unless it opts itself into receiving such high addresses by declaring itself as large-address aware.
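
The cases described so far can be summarized as a small decision function. This is only a sketch of the rules as stated in this answer, with "less some overhead" left out; the function name `user_va_limit` and its parameters are invented for illustration and do not correspond to any Windows API.

```c
#include <stdbool.h>
#include <stdint.h>

#define GIB (1024ULL * 1024ULL * 1024ULL)

/* Nominal user-mode virtual address space for a 32-bit process,
 * following the cases described above. Overhead is not subtracted. */
uint64_t user_va_limit(bool large_address_aware, bool os_is_64bit,
                       bool boot_3gb_switch)
{
    if (!large_address_aware)
        return 2 * GIB;            /* default, on 32-bit or 64-bit Windows */
    if (os_is_64bit)
        return 4 * GIB;            /* LAA under a 64-bit kernel */
    return boot_3gb_switch ? 3 * GIB : 2 * GIB; /* LAA on 32-bit Windows */
}
```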

Historically, 32-bit Windows used a 2G:2G split of virtual address space between kernel and user-space. Some programs might depend on the difference between two pointers to different objects fitting in a signed positive integer, or on other assumptions that ISO C doesn't guarantee and LAA would break. Leaving a program non-large-address-aware ensures backwards compatibility with such code. (See also the question "Drawbacks of using /LARGEADDRESSAWARE for 32 bit Windows executables?")
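
As a hedged illustration of the kind of assumption meant here, the sketch below models 32-bit addresses as `uint32_t` values (so it behaves the same on any host) and checks two properties that hold for every address below 2 GiB but can fail above it. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a 32-bit address has its top bit set, i.e. it would read as a
 * negative number if code stored it in a signed 32-bit integer. */
bool address_has_high_bit(uint32_t addr)
{
    return addr >= 0x80000000u;
}

/* True if the distance hi - lo fits in a positive signed 32-bit integer. */
bool distance_fits_in_int32(uint32_t lo, uint32_t hi)
{
    return hi >= lo && (hi - lo) <= (uint32_t)INT32_MAX;
}
```

With only 2 GiB of user address space, no address has the high bit set and no two addresses are more than `INT32_MAX` bytes apart, so code relying on either property happens to work. With large addresses enabled, a pair such as 0x10000000 and 0xA0000000 breaks both.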

Peter Cordes
nanofarad
  • Your answer starts off apparently saying that *all* programs are limited to 2GiB. But that's only true without `/LARGEADDRESSAWARE`. Some programs *are* built that way. It's eventually clear what you're saying, but there's probably a better way to phrase this. – Peter Cordes Apr 24 '20 at 22:16
  • @PeterCordes revised. – nanofarad Apr 24 '20 at 22:18
  • 1
    I reordered your paragraphs and added a new summary paragraph at the top. I think that gets the key points across sooner, then fills in the details. Feel free to revert or edit further, this was just the easiest way to show what I had in mind for presenting the facts. – Peter Cordes Apr 24 '20 at 22:50
  • 1
    Thanks, much appreciated. I've been tied up with other matters that arose suddenly so I did not have time to make such a detailed edit. – nanofarad Apr 24 '20 at 22:51
  • Yes, I had already set /LARGEADDRESSAWARE to use more than 2GiB of memory. Thanks to your detailed explanation, I finally understand the limitation on my system. Have a nice day :) – yangcooler Apr 27 '20 at 08:22
0

Your problem is not the amount of memory usable in Windows. You have requested 3 GB for the stack alone (not memory in general, just stack). That's very uncommon.

For example, the normal limits for a process on FreeBSD (not Windows) look like this:

$ ulimit -a
number of pseudoterminals            (-P) unlimited
socket buffer size       (bytes, -b) unlimited
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) 33554432
file size               (blocks, -f) unlimited
max kqueues                     (-k) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 230121
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 524288     <<<<<<<<<<<<<<<
cpu time               (seconds, -t) unlimited
max user processes              (-u) 12042
virtual memory          (kbytes, -v) unlimited
swap size               (kbytes, -w) unlimited

As you can see, the system limits the stack to 524288 kilobytes (512 MiB), so your program would probably fail on this system as well.
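
On POSIX systems such as FreeBSD or Linux (not Windows), the same stack limit can also be read from inside a C program with `getrlimit` and `RLIMIT_STACK`. The following is a minimal sketch; the wrapper names `query_stack_limit` and `kib_to_mib` are made up for this example, and the values returned by `getrlimit` are in bytes, whereas `ulimit -s` prints kilobytes.

```c
#include <stdint.h>
#include <sys/resource.h>

/* Reads the soft and hard stack limits of the calling process, in bytes.
 * Either value may be RLIM_INFINITY.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit). */
int query_stack_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Converts a size in kilobytes (1024-byte units, as printed by
 * `ulimit -s`) to whole MiB. */
uint64_t kib_to_mib(uint64_t kib)
{
    return kib / 1024;
}
```

For the listing above, `kib_to_mib(524288)` is 512, so a request for a 3 GB stack is several times over that limit.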

It's common for a system to have trouble giving that much memory to the stack. All operating systems limit the stack size.

Keep in mind that a per-process stack limit on the order of 10 MB is common, so you probably have a design issue.

What are you using that much stack for in your program?

(Sorry, I neither have a Windows machine at hand to check this, nor do I know how to check the maximum size allowed for the stack segment.)

Luis Colorado
  • Thank you for your answer. I also knew that a 3GB stack is uncommon, but in this case I need around 3GiB of stack, and now I understand the limitation. – yangcooler Apr 27 '20 at 08:25
  • What are you using that much stack space for? I have never reached that amount. I've used over 17GB of virtual memory but never that much stack. – Luis Colorado Apr 27 '20 at 16:37
  • @Luis Colorado In my app I use the pthread library, and each thread needs up to 500MB of independent memory for its variables. As you know, the x86 configuration limits the image size to under 0x80000000, so I used the thread attributes to give each thread its own memory for internal variables rather than creating global variables (with global variables and 6 threads, the image size would exceed 0x80000000). – yangcooler Apr 28 '20 at 01:16
  • Stacks for threads are normally allocated from the heap, so they are not part of the stack segment... Anyway, 500MB for each thread's stack is also a lot of stack. – Luis Colorado Apr 29 '20 at 13:03