-1

On Linux, is the range of user-space virtual addresses (in other words, the range of values that malloc can return) the same as the entire 64-bit virtual address space? Or is there a sub-range of the 64-bit virtual address space that is guaranteed never to be seen in user space?

Answers for other UNIX systems or for Windows are also welcome.

Of course I don't intend to put buggy code that depends on this into production. I just imagine that if there are spare bits, we could use them to store flags, for example an is_constructed flag for lazy construction, and save a lot of space. Usually a 1-byte (stack) allocation is needed even if we use only 1 bit. On the other hand, encoding such values into a memory address may cause wrong memory prefetches or branch mispredictions. So I want to check which effect is larger: the memory bandwidth saved or the cost of the CPU mispredictions.

akakatak

1 Answer

3

The possible range of virtual addresses is processor-, kernel-, and ABI-specific. I strongly advise against coding tricks that depend on particular bits in addresses and pointers, since your code might work on some processors but not on others.

These days, some x86-64 processors apparently implement only 48 bits of the virtual address space, but I don't recommend relying on that knowledge in your code (it might stop being true within a few years, or on some higher-end models). See also the x86-64 ABI.

If a pointer is outside that 48-bit range, dereferencing it faults and the process gets a SIGSEGV; the processor does not ignore the unimplemented bits. So the upper bits of a pointer or address must be all zeros or all ones (that is, copies of bit 47).
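The "all zeros or all ones" rule can be written down as a small check. This is only a sketch under the assumption of 48 implemented address bits; `VA_BITS`, `is_canonical`, and `sign_extend` are names invented for this illustration, not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual-address bits. Other CPUs may differ. */
#define VA_BITS 48
#define LOW_MASK ((UINT64_C(1) << VA_BITS) - 1)

/* True when bits 63..47 are all equal (all zeros or all ones). */
static bool is_canonical(uint64_t addr)
{
    uint64_t upper = addr >> (VA_BITS - 1);   /* keeps bits 63..47 */
    return upper == 0 || upper == (UINT64_MAX >> (VA_BITS - 1));
}

/* Rebuild a canonical address from its low 48 bits by copying bit 47 upward. */
static uint64_t sign_extend(uint64_t addr)
{
    uint64_t low = addr & LOW_MASK;
    if (low & (UINT64_C(1) << (VA_BITS - 1)))
        low |= ~LOW_MASK;
    assert(is_canonical(low));
    return low;
}
```

Under this assumption, any value for which `is_canonical` returns false cannot be a valid pointer, which is exactly why the hardware refuses to dereference it.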

On Linux, you might play with cat /proc/self/maps and cat /proc/$$/maps to get more clues.
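The same information can be read from inside a program. The following is a rough, Linux-only sketch that scans /proc/self/maps and returns the highest end address it finds; `highest_mapped_address` is a hypothetical helper name, and the parsing assumes each line starts with a `start-end` pair in hexadecimal.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Returns the highest end address listed in /proc/self/maps,
   or 0 if the file cannot be read (for example on a non-Linux system). */
static uint64_t highest_mapped_address(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return 0;

    char line[512];
    uint64_t highest = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        uint64_t start, end;
        /* Each mapping line begins with "start-end" in hex. */
        if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) == 2
            && end > highest)
            highest = end;
    }
    fclose(f);
    return highest;
}
```

Printing the result with `printf("%#" PRIx64 "\n", highest_mapped_address());` shows how far up the address space the current mappings reach on a given machine; some kernels also list a special mapping in the upper (all-ones) half.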

BTW, on Linux, you could reserve some large address range using mmap(2) with MAP_ANONYMOUS | MAP_NORESERVE (and either call mmap with MAP_FIXED inside it later, or never use it) so that it is not used later by your process. Or, as Damon commented, use MAP_32BIT, which is x86-64 specific. This is probably more sensible than relying on the address bits being restricted to 48 bits.
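A minimal sketch of that reservation approach follows. The helper names `reserve_range` and `commit_pages` are made up for this example, and mapping the reservation with PROT_NONE is my own assumption (it makes accidental accesses fault); only the mmap flags themselves come from the text above.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve len bytes of address space without committing memory.
   Returns NULL on failure. */
static void *reserve_range(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make n_pages pages, starting at page index first_page inside the
   reservation, readable and writable by mapping over them with MAP_FIXED.
   The caller must keep the range inside the reservation. */
static void *commit_pages(void *reserved, size_t first_page, size_t n_pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap((char *)reserved + first_page * page, n_pages * page,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Because the process itself owns the reserved range, no later malloc or mmap call will hand out an address inside it, so values in that range can serve as sentinels without any assumption about how many address bits the CPU implements.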

Also, catching SIGSEGV on Linux is tricky (you'll need some processor-specific code) and costly. Perhaps you want some external pager mechanism (which exists on GNU Hurd, but not on Linux), or you could mmap a pseudo-file on a FUSE filesystem.

NB: most x86-64 processors implement only 48 bits of virtual address, but I don't recommend relying on that.

Note 2: Processor makers remember what IBM/360 did: it ignored the upper address bits (addresses were originally 24 bits). When IBM had to extend addresses to 31 bits, it was a nightmare for the software industry. Hardware makers learned the lesson, and today's hardware disallows playing naughty tricks with unused address bits.

Basile Starynkevitch
  • 2
    Note: In addition to using `MAP_FIXED` you could also use `MAP_32BIT`, which will guarantee that the upper 32 bits will be zero (effectively returning a 32 bit pointer). – Damon Jun 17 '15 at 08:34
  • 1
    Basile: your comments on the question, and things in this answer, presume encoding extra data in the high-order bits of a pointer that's then "used", in the sense of dereferencing, pointer arithmetic, etc. I agree that's bad. It's still plausibly useful to know that pointers vary in only 48 of their 64 bits, so you can store a pointer value alongside 16 other bits of information in a `uint64_t`, masking as appropriate to extract the original pointer value before use. If feeling lucky, you could even experiment with storing such values in a pointer type, though that's considerably riskier. – Tony Delroy Jun 17 '15 at 10:04
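
The packing described in the last comment could look roughly like the sketch below. It assumes user-space pointers whose upper 16 bits are zero, which is exactly the non-portable assumption the answer warns about; `pack`, `unpack_ptr`, and `unpack_tag` are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: user-space pointers fit in the low 48 bits. */
#define PTR_BITS 48
#define PTR_MASK ((UINT64_C(1) << PTR_BITS) - 1)

/* Store a pointer and a 16-bit tag side by side in one 64-bit word. */
static uint64_t pack(const void *p, uint16_t tag)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert((bits & ~PTR_MASK) == 0);   /* fails if the assumption is wrong */
    return bits | ((uint64_t)tag << PTR_BITS);
}

/* Mask the tag off to recover the original pointer before any use. */
static void *unpack_ptr(uint64_t packed)
{
    return (void *)(uintptr_t)(packed & PTR_MASK);
}

/* Recover the 16-bit tag from the top of the word. */
static uint16_t unpack_tag(uint64_t packed)
{
    return (uint16_t)(packed >> PTR_BITS);
}
```

The packed word is kept in a `uint64_t` and never in a pointer type, so nothing dereferences the tagged value by accident, and the assert turns a violated assumption into an immediate failure instead of silent corruption.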