20

Most programs fit comfortably in a <4 GB address space but still need features that are available only on the x64 architecture.

Are there compilers/platforms where I can use the x64 registers and x64-specific instructions while keeping 32-bit pointers to save memory?

Is it possible to do that transparently on legacy code? Which compiler switch enables it?

OR

What code changes are necessary to get 64-bit features while keeping 32-bit pointers?

Peter Cordes
Maniero

10 Answers

18

A simple way around this works if you have only a few types for the structures you point to. In that case you can allocate big arrays for your data and index them with uint32_t.

So a "pointer" in such a model is just an index into a global array. Addressing through it should usually be efficient enough with a decent compiler, and it saves you some space. You'd lose other things that you might care about, dynamic allocation for instance.
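
A rough sketch of the index-as-pointer idea, in C. The names `node_ref`, `pool`, `node_new`, and friends are made up for illustration, and the fixed-size pool with a bump allocator is an assumption, not something taken from a particular library:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "pointer" type: a 32-bit index into one global array. */
typedef uint32_t node_ref;
#define NODE_NIL UINT32_MAX          /* reserved value meaning "no node" */

struct node {
    int32_t  value;
    node_ref next;                   /* 4 bytes instead of an 8-byte pointer */
};

#define POOL_CAPACITY 1024u          /* assumed fixed capacity for the sketch */
static struct node pool[POOL_CAPACITY];
static uint32_t pool_used = 0;

/* Bump-allocate one node from the pool; returns NODE_NIL when it is full. */
node_ref node_new(int32_t value, node_ref next)
{
    if (pool_used >= POOL_CAPACITY)
        return NODE_NIL;
    node_ref r = pool_used++;
    pool[r].value = value;
    pool[r].next = next;
    return r;
}

/* Turn an index back into a real (64-bit) pointer. */
struct node *node_ptr(node_ref r)
{
    assert(r < pool_used);
    return &pool[r];
}

/* Walk a list by following the 32-bit links and sum the values. */
int64_t list_sum(node_ref head)
{
    int64_t sum = 0;
    for (node_ref r = head; r != NODE_NIL; r = node_ptr(r)->next)
        sum += node_ptr(r)->value;
    return sum;
}
```

With this layout `struct node` takes 8 bytes, where the same struct with a native `struct node *next` would typically take 16 bytes on an LP64 platform once padding is counted.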

Another way to achieve something similar is to encode a pointer as the difference between its target and its own location. If you can ensure that this difference always fits into 32 bits, you get the same space savings.
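
The difference-based encoding could look roughly like the following. This is only a sketch: `relptr32`, `relptr_set`, and `relptr_get` are hypothetical names, offset 0 is arbitrarily reserved for "null", and the code assumes the field and its target live in the same allocation (for example one arena), so the pointer subtraction is well defined:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical self-relative pointer: the signed byte distance from the
   field itself to its target. An offset of 0 is reserved to mean "null". */
typedef struct { int32_t off; } relptr32;

/* Example element type that links to other elements of the same arena. */
struct cell {
    relptr32 next;
    int32_t  value;
};

/* Store `target` relative to the location of `p`.
   Assumes both addresses belong to the same allocation. */
void relptr_set(relptr32 *p, const void *target)
{
    if (target == NULL) {
        p->off = 0;
        return;
    }
    ptrdiff_t d = (const char *)target - (const char *)p;
    assert(d != 0 && d >= INT32_MIN && d <= INT32_MAX);
    p->off = (int32_t)d;
}

/* Recover the full-width pointer from the stored offset. */
void *relptr_get(const relptr32 *p)
{
    return p->off ? (void *)((char *)p + p->off) : NULL;
}
```

Unlike the index scheme, this works for mixed types and survives moving the whole arena, but every dereference needs the address of the field it was loaded from.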

Jens Gustedt
  • 3
    Note that with these solutions, you get to access up to 2^32 words of memory, which is more than the 2^31 bytes (2^29 32-bit words) that you typically are able to allocate in a 32-bit process on a modern OS. So it's also very attractive to take full advantage of the 8 or 16 GiB of memory typically found in workstations as of 2010 while keeping 32-bit "pointers". – Pascal Cuoq Nov 07 '10 at 17:33
7

It's worth noting that there is an ABI in development for Linux, x32, that lets you build an x86_64 binary that uses 32-bit indices and addresses.

It is still relatively new, but interesting nonetheless.

http://en.wikipedia.org/wiki/X32_ABI
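
For what it's worth, here is a small C sketch for checking which ABI a translation unit was built for. It relies on the predefined macros `__x86_64__`, `__ILP32__`, and `__i386__` as GCC and Clang define them; the function names are made up. Building for x32 with GCC uses the `-mx32` option and, as far as I know, only works when the kernel and the C library were built with x32 support:

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) && defined(__ILP32__)
/* x32: 64-bit registers and instructions, but 32-bit pointers and long. */
static_assert(sizeof(void *) == 4, "x32 is expected to have 32-bit pointers");
static_assert(sizeof(long) == 4, "x32 is expected to have 32-bit long");
#endif

/* Name of the ABI this file was compiled for, based on predefined macros. */
const char *abi_name(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
    return "x32 (x86-64 with ILP32)";
#elif defined(__x86_64__)
    return "x86-64 with 64-bit pointers";
#elif defined(__i386__)
    return "i386";
#else
    return "other";
#endif
}

/* Pointer width in bits for the current build. */
size_t pointer_bits(void)
{
    return sizeof(void *) * 8;
}
```

Compiling the same file with `-m64`, `-mx32`, and `-m32` and printing `abi_name()` and `pointer_bits()` is a quick way to see the difference.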

jsimmons
5

Technically, it is possible for a compiler to do so. AFAIK, in practice it isn't done. It has been proposed for gcc (even with a patch here: http://gcc.gnu.org/ml/gcc/2007-10/msg00156.html) but never integrated (at least, it was not documented the last time I checked). My understanding is that it also needs support from the kernel and the standard library to work (i.e. the kernel would need to set things up in a way that is not currently possible, and the existing 32-bit or 64-bit ABI could not be used to communicate with the kernel).

AProgrammer
  • For those interested, there seem to be movements on that front: http://gcc.gnu.org/ml/gcc/2010-12/msg00480.html – AProgrammer Jan 01 '11 at 11:54
4

What exactly are the "64-bit features" you need? Isn't that a little vague?

Found this while searching myself for an answer: http://www.codeproject.com/KB/cpp/smallptr.aspx

Also pick up the discussion at the bottom...

Never had any need to think about this, but it is interesting to realize that one can be concerned with how much space pointers need...

steabert
  • It seems very interesting, although it's probably not what I'm looking for. I don't need to use more than 4 GB. I need more and wider registers and, if possible, the advanced instructions only available on 64-bit systems. Anyway, it's a useful answer. – Maniero Nov 07 '10 at 22:54
  • @bigown: but then again, if you say that you only need a few million pointers, does it really matter if they occupy 40 MB or 80 MB? – steabert Nov 08 '10 at 10:58
  • No idea actually, I don't deal with pointer-based programs. But why would the size of the pointer affect the locality of the data they reference? (well, my understanding of the concept of locality is very basic, so if it's obvious you can enlighten me) – steabert Nov 08 '10 at 17:09
3

It depends on the platform. On Mac OS X, the first 4 GB of a 64-bit process' address space is reserved and unmapped, presumably as a safety feature so no 32-bit value is ever mistaken for a pointer. If you try, there may be a way to defeat this. I worked around it once by writing a C++ "pointer" class which adds 0x100000000 to the stored value. (This was significantly faster than indexing into an array, which also requires finding the array-base address and multiplying before the addition.)

On the ISA level, you can certainly choose to load and zero-extend a 32-bit value and then use it as a 64-bit pointer. It's a good feature for a platform to have.
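
A C sketch of that kind of compressed pointer, under stated assumptions: `cptr32`, `cptr_set_base`, `cptr_encode`, and `cptr_decode` are invented names, and the base is a runtime value here. With a fixed base such as 0x100000000 the decode step is just a zero-extending load plus one add:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compressed pointer: a 32-bit offset from a known base. */
typedef uint32_t cptr32;

/* Assumed: every compressed pointer refers to memory within 4 GiB above
   this base. The answer above effectively used a fixed 0x100000000. */
static uintptr_t heap_base;

void cptr_set_base(void *base)
{
    heap_base = (uintptr_t)base;
}

/* Compress: keep only the distance from the base. */
cptr32 cptr_encode(const void *p)
{
    uintptr_t delta = (uintptr_t)p - heap_base;
    assert(delta <= UINT32_MAX);
    return (cptr32)delta;
}

/* Decompress: zero-extend the 32-bit value, then add the base. */
void *cptr_decode(cptr32 c)
{
    return (void *)(heap_base + (uintptr_t)c);
}
```

There is no scaling step, which is why this tends to be cheaper than indexing into an array of larger elements.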

No change should be necessary to a program unless you wish to use 64-bit and 32-bit pointers simultaneously. In that case you are back to the bad old days of having near and far pointers.

Also, you will certainly break ABI compatibility with APIs that take pointers to pointers.

Potatoswatter
3

I think this would be similar to the MIPS n32 ABI: 64-bit registers with 32-bit pointers.

In the n32 ABI, all registers are 64-bit (so it requires a MIPS64 processor), but addresses and pointers are only 32 bits wide when stored in memory, which decreases the memory footprint. When a 32-bit value (such as a pointer) is loaded into a register, it is sign-extended to 64 bits. When the processor uses the pointer/address for a load or store, all 64 bits are used (the processor is not aware of the n32-ness of the software). If your OS supports n32 programs (maybe the OS also follows the n32 model, or it may be a proper 64-bit OS with added n32 support), it can place all memory used by the n32 application at suitable virtual addresses (e.g. the lower 2 GB and the upper 2 GB). The only glitch with this model is that when registers are saved on the stack (function calls etc.), all 64 bits are stored; there is no 32-bit data model in the n32 ABI.

Probably such an ABI could be implemented for x86-64 as well.

  • Indeed, Linux x32 is an ILP32 ABI for x86-64. See other answers and https://en.wikipedia.org/wiki/X32_ABI. There's also an ILP32 ABI for AArch64. – Peter Cordes Oct 24 '20 at 03:05
2

On x86, no. On other processors, such as PowerPC it is quite common - 64 bit registers and instructions are available in 32 bit mode, whereas with x86 it tends to be "all or nothing".

Paul R
  • 1
    Probably no one was thinking about this at the time, but the answer is actually yes: [32-bit pointers with the x86-64 ISA: why not?](https://stackoverflow.com/q/9233306/995714) – phuclv Oct 24 '20 at 01:15
1

Linux now has fairly comprehensive support for the x32 ABI, which does exactly what the asker wants; in fact, it is partially supported as a configuration under the Gentoo operating system. I think this question needs to be reviewed in light of recent developments.

Vality
1

I'm afraid that if you are concerned about the size of pointers you might have bigger problems to deal with. If the number of pointers is going to be in the millions or billions, you will probably run into limitations within the Windows OS before you actually run out of physical or virtual memory.

Mark Russinovich has written a great article relating to this, named Pushing the Limits of Windows: Virtual Memory.

jveazey
  • I will read it. Billions of pointers, certainly not (the pointers alone would fill the address space), but a few million is possible. – Maniero Nov 07 '10 at 10:44
0

The second part of your question is easily answered. It is entirely possible; in fact, many C implementations support 64-bit operations in 32-bit code. The C type often used for this is long long (but check with your compiler and architecture).
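
As a small illustration (the function names are made up), the following C compiles unchanged for a 32-bit target, e.g. with `gcc -m32`. There the compiler has to build the 64-bit operations out of several 32-bit instructions, so the result is the same but it is not as cheap as on a 64-bit target:

```c
#include <assert.h>
#include <stdint.h>

/* C99 guarantees that long long is at least 64 bits wide. */
static_assert(sizeof(long long) >= 8, "long long must hold 64 bits");

/* Full 32x32 -> 64-bit multiply: the cast widens before multiplying. */
uint64_t mul_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* 64-bit addition; on 32-bit x86 this typically becomes an add/adc pair. */
uint64_t add64(uint64_t a, uint64_t b)
{
    return a + b;
}
```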

As far as I know it is not possible to have 32-bit pointers in 64-bit native code.

PP.
  • 1
    "64 bit features" on x86-64 means fully *efficient* support for `int64_t` (single instruction), a newer calling convention, and twice as many integer registers as 32-bit mode. Also guaranteed availability of SSE2. Of course it's possible to compute anything, even on an 8-bit machine, it just takes more instructions to do extended-precision stuff. Addition is still easy, just add/adc, but multiplication and especially division take more code. – Peter Cordes Oct 24 '20 at 03:04
  • 1
    Another 64-bit feature is convenient atomicity for 64-bit types, like `atomic>` (note that pointer + counter only fits in 64 bits total with 32-bit pointers, so the best case for this is an ILP32 ABI for 64-bit mode). In 32-bit mode x86, you need SSE or MMX to do a 64-bit load or store, or `lock cmpxchg8b`. If we're talking non-x86, e.g. AArch64 vs. 32-bit ARM, then AArch64 again has more registers (integer and SIMD), and some different choices of instructions. – Peter Cordes Oct 24 '20 at 03:08