You're doing signed compares on pointers; don't do that. Use jne in this case, since you will always reach exact equality at the exit point.
Or if you want relational compares with pointers, unsigned conditions like jb and jae usually make the most sense. (It's normal to think of virtual address space as a flat linear 4GiB with the lowest address being 0, so you need increments across the middle of that range to work.)
With arrays larger than your ~300MiB size, and the default linker script for PIE executables, apparently one of them will span the 2GiB boundary between signed-positive and signed-negative (see footnote 1). So the end-pointer you calculate will be "negative" if you treat it as a signed integer. (Unlike on x86-64, where the non-canonical "hole" spanning the middle of virtual address-space means that an array can never span the signed-wraparound boundary; see "Should pointer comparisons be signed or unsigned in 64-bit x86?". Sometimes it does make sense to use signed compares there.)
You should see this with a debugger if you single-step and look at the pointer values, and the memory value you create with size += dest (add [esp + 12], eax). As a signed operation, that overflows to create a negative end_pointer, while the start pointer is still positive. pos < neg is false on the first iteration, so your loop exits.
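The difference between the two kinds of compare can be sketched in C by modelling 32-bit pointers as uint32_t values. This is only an illustration, not your actual loop: the function names and the 4-byte step are assumptions, and the cast to int32_t relies on the two's-complement wraparound that GCC and Clang provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: 32-bit pointers are represented as uint32_t. */

/* Roughly what cmp + jl tests: both operands read as signed integers.
   The cast assumes two's-complement conversion (true on GCC/Clang). */
bool below_signed(uint32_t p, uint32_t end) {
    return (int32_t)p < (int32_t)end;
}

/* Roughly what cmp + jb tests: both operands read as unsigned integers. */
bool below_unsigned(uint32_t p, uint32_t end) {
    return p < end;
}

/* Count iterations of a loop that walks from start towards start + size
   in 4-byte steps (an assumed element size), using either kind of compare. */
uint32_t count_steps(uint32_t start, uint32_t size, bool use_signed) {
    uint32_t end = start + size;  /* wraps modulo 2^32, like the add instruction */
    uint32_t steps = 0;
    for (uint32_t p = start;
         use_signed ? below_signed(p, end) : below_unsigned(p, end);
         p += 4) {
        steps++;
    }
    return steps;
}
```

With a start of 0x7ffffff0 and a size of 0x20, the end lands at 0x80000010: the unsigned version runs 8 steps, while the signed version runs none because the end looks negative.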
Footnote 1: On my system, under GDB (which disables ASLR), after running start to get the executable mapped to Linux's default base address for PIEs (2/3 of the way into the low half of the address space, i.e. 0x5555...), I checked the addresses with your test case:
- sr at 0x56559040
- ds at 0x6a998d40
- end of ds at 0x7edd8a40 (from p /x sizeof(ds) + ds)
So if it were much bigger, it would cross 0x80000000. That's why 340000000 avoids your bug but larger sizes reveal it.
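The footnote's arithmetic can be checked with a small C sketch. The helper names are hypothetical, and the addresses are the ones observed on one system with ASLR disabled, so treat the numbers as an example rather than a fixed threshold. In particular, if both arrays grow with the size, the start of ds moves up too, and the boundary is reached sooner than the headroom alone suggests.

```c
#include <stdint.h>

/* First address that is negative when read as a signed 32-bit integer. */
#define SIGN_BOUNDARY 0x80000000u

/* One-past-end address of an array, computed modulo 2^32 like the add. */
uint32_t end_address(uint32_t array_start, uint32_t size_bytes) {
    return array_start + size_bytes;
}

/* Number of bytes between an address in the low half and the boundary. */
uint32_t bytes_below_boundary(uint32_t addr) {
    return SIGN_BOUNDARY - addr;
}
```

For ds at 0x6a998d40 with 340000000 bytes, end_address gives 0x7edd8a40, and bytes_below_boundary of that result is 19035584, i.e. roughly 18 MiB of headroom in this particular layout.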
BTW, under a 32-bit kernel, Linux defaults to a 3:1 split of address space between user-space and kernel, so even there it's possible for this to happen. But under a 64-bit kernel, 32-bit processes can have the entire 4 GiB address space to themselves. (Except for a page or two reserved by the kernel: see also "Why can't I mmap(MAP_FIXED) the highest virtual page in a 32-bit Linux process on a 64-bit kernel?". That also means that forming a pointer to one-past-end of any array like you're doing, which ISO C promises is valid to do, won't wrap around and will still compare above a pointer into the object.)
This won't happen in 64-bit mode: there's enough address space to just divide it evenly between user and kernel, as well as there being a giant non-canonical hole between high and low ranges.