Can we change virtual memory address of a block of data without accessing the values in it?

Question

In a high-level language like C++ we can use pointers and references to inspect the addresses of variables, but can we actually change them? E.g. let's say int A has address 1000h and int B has address 1004h, and the following is their representation in memory:

1000h	1004h
A	B
100	104

Can I just interchange their addresses? (Which would look like this.)

1004h	1000h
A	B
100	104

Is this even possible?

Note: Please consider 1000h as virtual address and 100 as actual address and you can use whatever programming language you want.

If I'm getting it wrong then please let me know.

score 2 · Answer 1 · answered Jun 29 '21 at 07:58

No, you cannot change an address of an object. Each object has the same region of storage during its entire lifetime.

Daniel Langr

Virtual memory means you can plug different storage into the same region, though. It won't change the variable name -> address mapping, but it will change the contents of memory at that address. (See my answer.) – Peter Cordes Jun 29 '21 at 18:28
@PeterCordes The content of memory can be changed by many ways even in C++ itself, such as with casting to `std::byte` array or using `memcpy` (for supported types). It doesn't seem to be OP was asking about this problem. – Daniel Langr Jun 30 '21 at 03:58
Agreed, the question seems confused, but there is interesting stuff you can describe as having the contents of `a` be different memory, specifically the previous contents of a different array. – Peter Cordes Jun 30 '21 at 04:05

Peter Cordes · Accepted Answer · 2021-06-29T18:25:10.160

You can't change &a, but you can change the contents of a by playing around with virtual memory. But not for just a single int a: without copying actual memory contents around, you can only change whole page-sized chunks.

You could potentially swap these two arrays with virtual memory trickery, as an alternate way of doing std::swap_ranges(a, a+1024, b) which may or may not be faster.

alignas(4096) int32_t a[1024];     // assuming 4k page size and
alignas(4096) int32_t b[1024];     // CHAR_BIT=8 so sizeof(int32_t) = 4

Maybe only faster for much larger arrays, since copying is O(N), while manipulating page tables has large fixed cost (system call, TLB shootdown across cores) but only a small cost per page touched, like 8 / 4096 of the amount of data actually manipulated. (8 bytes of page-table-entry per 4096 bytes of data, on x86-64 for example.) Or less with large/hugepages.
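To put rough numbers on that per-page cost, here is a small sketch. The helper name pte_bytes_for is made up for this illustration, and it assumes 8-byte page-table entries (as on x86-64) while ignoring the upper levels of the page-table tree and the fixed syscall / TLB-shootdown cost.

```c
#include <assert.h>
#include <stddef.h>

// Bytes of leaf page-table entries needed to cover `data_bytes` of memory.
// Assumption: each entry is 8 bytes, as on x86-64.
size_t pte_bytes_for(size_t data_bytes, size_t page_size)
{
    assert(page_size != 0);
    size_t pages = (data_bytes + page_size - 1) / page_size;  // round up
    return pages * 8;
}
```

For 1 GiB of data this gives 2 MiB of entries with 4 KiB pages, but only 4 KiB of entries with 2 MiB large pages.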

The page size is (much) larger than 4 bytes on every real-world system, so both those objects are in the same virtual page in your example. 4-byte page size would be completely impractical, taking about as much space for page-tables as for actual data, and needing a TLB larger than the caches. (A ~40-bit physical address for every 48-2 = 46-bit virtual page number, for every 4 bytes of address-space you want to cover. With Accessed, Dirty, and R/W/X permissions.)

Common page sizes range from 4kiB (x86-64) to 16k or 64k, with 4k being uncomfortably small (too many TLB entries needed to cover the large working-sets modern software often uses). Some systems support largepages / hugepages using a page-directory entry (higher up in the radix-tree) as one contiguous large page, e.g. x86-64's 2M / 1G large/hugepages.
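Since the page size varies between systems, code that plays these games should ask the OS rather than hard-code 4096. A minimal sketch using POSIX sysconf; the helper names (page_size_bytes, is_page_aligned, round_up_to_page) are hypothetical, and this reports only the base page size, not large/hugepage sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

// Base page size in bytes as reported by the OS, or 0 if the query fails.
size_t page_size_bytes(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    return sz > 0 ? (size_t)sz : 0;
}

// Nonzero if p sits on a page boundary.
int is_page_aligned(const void *p)
{
    size_t ps = page_size_bytes();
    return ps != 0 && (uintptr_t)p % ps == 0;
}

// Round a byte count up to a whole number of pages.
size_t round_up_to_page(size_t n)
{
    size_t ps = page_size_bytes();
    assert(ps != 0);
    return (n + ps - 1) / ps * ps;
}
```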

It is in theory possible to ask an OS to re-map your virtual address space differently onto the same data in physical memory, e.g. to swap the contents of two whole virtual pages by just updating their page-table-entries (PTEs) to swap the physical addresses. (And invalidating the TLB entries on the current and every other core: TLB shootdown.)

Linux does not AFAIK have an API to ask for a mapping-swap of two virtual pages, but it does have mremap(2). (mremap is Linux-specific. Other OSes may have something similar. ISO C++ doesn't require virtual memory, so doesn't have any functions to portably manipulate it).

With three mremap(MREMAP_FIXED) calls and a temporary virtual page (that you weren't using or that you know is unallocated), you can do a tmp=a / a=b / b=tmp swap, where a and b are the contents of whole pages (or ranges of pages).

#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

// swap contents of pa[0..size] with pb[0..size]
// tmp: any mapping already there is replaced (as with mmap(tmp, MAP_FIXED)),
// and tmp ends up unmapped (as after munmap(tmp, size))
// size must be a multiple of system page size, and pointers must be page-aligned
void swap_page_contents(void *pa, void *pb, void *tmp, size_t size)
{
    // need to force moving, otherwise kernel will leave it in place because we aren't growing.
    void *ret = mremap(pa, size, size, MREMAP_MAYMOVE|MREMAP_FIXED, tmp);
    assert(ret == tmp);  // implies ret != MAP_FAILED
    ret = mremap(pb, size, size, MREMAP_MAYMOVE|MREMAP_FIXED, pa);
    assert(ret != MAP_FAILED);
    ret = mremap(tmp, size, size, MREMAP_MAYMOVE|MREMAP_FIXED, pb);
    assert(ret != MAP_FAILED);
}

You might allocate tmp with mmap(MAP_PRIVATE|MAP_ANONYMOUS). Lazy allocation means a physical page would never get allocated to back that mapping, and Linux will put it somewhere unused in your virtual address space. This swap ends up unmapping it, so maybe I should have put that inside this function. But if you can be sure your process hasn't mapped any new memory since the last swap, you can reuse the same tmp. It doesn't need to be mapped, you just need to know it's not in use for anything else.
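Putting that together, a hedged end-to-end sketch (Linux only): two anonymous one-page buffers are filled with different bytes, tmp is reserved with a PROT_NONE anonymous mapping, and the three-mremap rotation swaps the contents. demo_swap is a made-up name for this example, and the swap helper is repeated so the block compiles on its own.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

// Three-step rotation: tmp = a, a = b, b = tmp (whole pages only).
static void swap_page_contents(void *pa, void *pb, void *tmp, size_t size)
{
    const int fl = MREMAP_MAYMOVE | MREMAP_FIXED;
    void *ret = mremap(pa, size, size, fl, tmp);
    assert(ret == tmp);
    ret = mremap(pb, size, size, fl, pa);
    assert(ret == pa);
    ret = mremap(tmp, size, size, fl, pb);
    assert(ret == pb);
    (void)ret;  // keeps NDEBUG builds warning-free
}

// Returns 1 if the contents were swapped, 0 if setup failed or the check fails.
int demo_swap(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return 0;
    size_t size = (size_t)ps;
    const int anon = MAP_PRIVATE | MAP_ANONYMOUS;

    char *a = mmap(NULL, size, PROT_READ | PROT_WRITE, anon, -1, 0);
    char *b = mmap(NULL, size, PROT_READ | PROT_WRITE, anon, -1, 0);
    // tmp only reserves address space; it is never touched,
    // so no physical page gets allocated for it.
    void *tmp = mmap(NULL, size, PROT_NONE, anon, -1, 0);
    if (a == MAP_FAILED || b == MAP_FAILED || tmp == MAP_FAILED)
        return 0;

    memset(a, 'A', size);
    memset(b, 'B', size);
    swap_page_contents(a, b, tmp, size);  // no bytes copied, only mappings moved

    int ok = a[0] == 'B' && a[size - 1] == 'B' &&
             b[0] == 'A' && b[size - 1] == 'A';
    munmap(a, size);
    munmap(b, size);  // tmp was already unmapped by the swap
    return ok;
}
```

Note that a and b keep their addresses throughout; only the physical pages behind them change places.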

This can fail with EINVAL if you pass bad args (not page-aligned, or overlapping ranges). So perhaps have it return an error instead of asserting, although if pb isn't aligned then it will fail after already moving pa to tmp.
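One possible shape for an error-returning variant, as a sketch rather than a hardened implementation: it checks alignment and overlap for all three ranges before the first mremap, so those argument errors are rejected while nothing has moved. The names swap_page_contents_checked and ranges_overlap are invented here, and the roll-back after a failed second step is only best effort.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

// Nonzero if [x, x+size) and [y, y+size) share any byte.
static int ranges_overlap(uintptr_t x, uintptr_t y, size_t size)
{
    return x < y + size && y < x + size;
}

// Returns 0 on success, -1 with errno set on failure.
int swap_page_contents_checked(void *pa, void *pb, void *tmp, size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    uintptr_t a = (uintptr_t)pa, b = (uintptr_t)pb, t = (uintptr_t)tmp;

    // Validate everything up front, before any mapping has been moved.
    if (ps <= 0 || size == 0 || size % (size_t)ps ||
        a % (size_t)ps || b % (size_t)ps || t % (size_t)ps ||
        ranges_overlap(a, b, size) || ranges_overlap(a, t, size) ||
        ranges_overlap(b, t, size)) {
        errno = EINVAL;
        return -1;
    }

    const int fl = MREMAP_MAYMOVE | MREMAP_FIXED;
    if (mremap(pa, size, size, fl, tmp) == MAP_FAILED)
        return -1;                        // nothing moved yet
    if (mremap(pb, size, size, fl, pa) == MAP_FAILED) {
        int saved = errno;
        mremap(tmp, size, size, fl, pa);  // best effort: put pa's pages back
        errno = saved;
        return -1;
    }
    if (mremap(tmp, size, size, fl, pb) == MAP_FAILED)
        return -1;                        // pa's old pages are still at tmp
    return 0;
}
```

The kernel can still reject a call for other reasons (e.g. a range that isn't mapped), so callers should treat -1 as "mappings may be in an intermediate state" unless errno is EINVAL from the up-front check.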

This is also not atomic or thread-safe: pa is temporarily unmapped, and temporarily we have pa and pb both pointing to the original contents of pb. MREMAP_DONTUNMAP doesn't really help with that; it only works on MAP_PRIVATE|MAP_ANONYMOUS mappings (e.g. like malloc would allocate, but of course you'll probably break malloc's bookkeeping if you round down to the start of a page and swap its metadata.) Also, DONTUNMAP makes the old mapping read as zeros, although the man page says you can install a handler with userfaultfd(2) to do something else (e.g. to assist garbage collection).

Apparently you can pass old_size=0 to get it to make another virtual mapping for the data, but only if the original mapping was a MAP_SHARED mapping. So you can't do this to make the kernel pick an unused page-range for tmp for arbitrary mappings, only shared (probably file-backed) mappings.
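A minimal sketch of that old_size=0 form, going by the mremap(2) man page: it needs MREMAP_MAYMOVE, and the source must be a shareable mapping. The wrapper name alias_shared_mapping is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

// Ask the kernel for a second virtual mapping of the same pages as `shared`.
// The kernel picks the new address. Returns MAP_FAILED on error, which is
// what you should expect if `shared` is a MAP_PRIVATE mapping.
void *alias_shared_mapping(void *shared, size_t size)
{
    return mremap(shared, 0, size, MREMAP_MAYMOVE);
}
```

With a MAP_SHARED|MAP_ANONYMOUS region p, a store through p should then be visible through the returned alias, because both addresses map the same physical pages.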

Linux also has remap_file_pages(2) which can duplicate a page mapping within a tmpfs file-backed mmap, although that syscall is deprecated and apparently always uses a "slower in-kernel emulation" instead of whatever it used to do. Regardless, I think it still can't swap, only create a 2nd mapping for one part of a file, within a larger mapping.

score 1 · Answer 3 · answered Jun 29 '21 at 08:07

Generally, no. The variable in a running process has no name. When you compile the source code, the compiler recognizes declarations of variables and plans their allocation in memory. Then each variable gets assigned its address (which can be constant if a variable is static, i.e. allocated once for the whole time of the process execution, or relative if the variable is automatic, i.e. created on each entry to some block of code and destroyed on exit). Then those addresses get built into the code of instructions which use the variables.

As a result, the name A or B no longer exists once the program gets built. When you run your program it internally uses for example an address 0x1004 to access a four-byte piece of data, but it doesn't know of any int B.

And swapping addresses of memory is impossible, besides making no sense. An address is more or less an ordinal number of a memory cell. You can't change the order of cells, just like you can't make the third house along the street become the first one. Of course you could (in theory) swap the buildings, just like you can swap the memory cells' content, but the ordinal numbers of the cells will remain the same, similar to ordinal numbers of buildings (or their parcels).
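In code, "swapping the buildings" looks like this small sketch (swap_values is just an illustrative name): the values change places while each variable's address stays where it was.

```c
#include <assert.h>

// Exchange the contents of two ints; the objects themselves do not move.
void swap_values(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}
```

If you record &A and &B before calling swap_values(&A, &B), they compare equal to &A and &B afterwards; only the stored values differ.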

You can change how virtual addresses map to data in physical memory. Linux has a system call for that, `mremap`. See my answer. But yes, +1 for addressing the question's idea of actually changing `&a`. – Peter Cordes Jun 29 '21 at 18:26
