
Any suggestions/discussions are welcome!

The question is as brief as the title, but I'll explain why I need the physical address.


Background:

These days I'm fascinated by caches and multi-core architecture, and I'm curious how the cache influences our programs in a parallel environment.

In some CPU models (for example, my Intel Core 2 Duo T5800), the L2 cache is shared among cores. So, suppose program A accesses memory at physical addresses like

0x00000000, 0x20000000, 0x40000000...

and program B accesses data at

0x10000000, 0x30000000, 0x50000000...

Since these addresses share the same low-order bits, they map to the same set in the L2 cache, and the lines in that set will be evicted frequently. So we'd expect to see the two programs fighting each other, reading data slowly from memory instead of from the cache, even though they run on different cores.

I want to verify this in practice. For the experiment, I have to know the physical address rather than the virtual address. How can I get it?


The first attempt:

Grab a large block of memory, mask the address, and pick out the addresses I want.

My CPU has an L2 cache with size=2048KB and associativity=8, so physical addresses like 0x12340000, 0x12380000, 0x123c0000 all map to the first set in the L2 cache.
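The arithmetic behind that claim can be sketched as follows. The cache size and associativity come from the post; the 64-byte line size and the idea that the L2 is indexed purely by physical address bits are assumptions, and `l2_set_index` is a hypothetical helper name.

```c
#include <stdint.h>

/* Cache geometry. Size and associativity are from the post;
 * the 64-byte line size is an assumption. */
enum {
    CACHE_BYTES = 2048 * 1024,
    WAYS        = 8,
    LINE_BYTES  = 64,                                 /* assumed */
    NUM_SETS    = CACHE_BYTES / (WAYS * LINE_BYTES)   /* 4096 sets */
};

/* Set index of a physical address, assuming a physically indexed cache:
 * drop the line-offset bits, then keep log2(NUM_SETS) bits. */
static unsigned l2_set_index(uint64_t phys_addr)
{
    return (unsigned)((phys_addr / LINE_BYTES) % NUM_SETS);
}
```

Under these assumptions one way covers 64 * 4096 = 0x40000 bytes, so any physical address that is a multiple of 0x40000 yields set index 0, which matches the three example addresses.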

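#include <stdint.h>

int HEAP[200000000] = {0};   /* roughly 800 MB */
int *v[2];

int main(int argc, char **argv) {
    /* Round the (virtual) address of HEAP up to the next 0x40000 boundary. */
    v[0] = (int *)(((uintptr_t)HEAP + 0x3ffff) & ~(uintptr_t)0x3ffff);
    /* The next 0x40000-aligned address after v[0]. */
    v[1] = (int *)((uintptr_t)v[0] + 0x40000);

    // one program pollutes v[0], another pollutes v[1]
    return 0;
}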

Sadly, with the "help" of virtual memory, the variable HEAP is not necessarily contiguous in physical memory, so v[0] and v[1] might map to different cache sets.


The second attempt:

Access /proc/self/mem and try to get memory information from it.

Hmm... it seems that the results are still about virtual memory.

sleepsort
  • I don't think you can specify physical address, at least not without kernel changes. But you can KNOW physical addresses at a bulk level, using /proc/self/maps – anishsane Dec 19 '12 at 06:49
  • @anishsane hmm, I dumped `/proc/self/maps`, but how is it related to physical addresses? The first field seems to be a virtual address, so is the `dev` field related to the memory device? – sleepsort Dec 19 '12 at 06:57
  • oh sorry, yes, the first 2 fields are indeed virtual addresses... I had used it to get some virtual addresses corresponding to the physical BAR addresses of a PCI device I had opened... sorry for misleading you... – anishsane Dec 19 '12 at 07:22
  • Operating at the physical address level is not easy on modern desktop CPUs. It is easy on small microcontrollers when programming in assembler, but you will not observe cache effects there (the ones I know do not have caches). – Peter G. Dec 19 '12 at 08:26

1 Answer


Your understanding of memory and these addresses is incomplete/incorrect. Essentially, what you're trying to test is futile.

In the context of user-mode processes, pretty much every single address you see is a virtual address. That is, an address that makes sense only in the context of that process. The OS manages the mapping of where this virtual memory space (unique to a process) maps to memory pages. These memory pages at any given time may map to pages that are paged-in (i.e. reside in physical RAM) - or they may be paged-out, and exist only in the swap file on disk.

So to address the Background example, those addresses are from two different processes - it means absolutely nothing to try and compare them. Whether or not their code is present in any of the caches depends on a number of things, including the cache-replacement strategy of the processor, the caching policies enabled by the OS, the number of other processes (including kernel-mode threads), etc.

In your first attempt, again you aren't going to get anywhere as far as actually testing CPU cache directly. First of all, your large buffer is not going to be on the heap. It is going to be part of the data section (specifically the .bss) of the executable. The heap is used for the malloc() family of memory allocations. Secondly, it doesn't really matter if you allocate some huge 1GB region, because although it is contiguous in the virtual address space of your process, it is up to the OS to allocate pages of virtual memory wherever it deems fit - which may not actually be contiguous. Again, you have pretty much no control over memory allocation from userspace. "Is there a way to allocate contiguous physical memory from userspace in linux?" The short answer is No.
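To make the point about non-contiguous physical pages concrete, here is a rough sketch of how little a virtual address reveals about the cache set. It assumes 4 KB pages, 64-byte lines, and 4096 physically indexed sets (2048 KB, 8-way); none of these constants are confirmed by the post beyond the cache size and associativity, and both function names are hypothetical.

```c
#include <stdint.h>

enum {
    PAGE_BYTES = 4096,   /* assumed 4 KB pages */
    LINE_BYTES = 64,     /* assumed line size */
    NUM_SETS   = 4096    /* 2048 KB / (8 ways * 64 B), assumed geometry */
};

/* Address translation preserves only the page offset, so only the
 * set-index bits that fall inside the page offset are known. */
static unsigned known_set_bits(uintptr_t virt_addr)
{
    return (unsigned)((virt_addr % PAGE_BYTES) / LINE_BYTES);
}

/* How many different sets one virtual address could end up in,
 * depending on which physical page the OS happens to pick. */
static unsigned candidate_sets(void)
{
    unsigned way_bytes = LINE_BYTES * NUM_SETS;   /* 256 KB per way */
    return way_bytes > PAGE_BYTES ? way_bytes / PAGE_BYTES : 1;
}
```

With these numbers, `candidate_sets()` evaluates to 64: aligning a virtual address to 0x40000 fixes the low six index bits, but the upper six depend on the physical page chosen by the OS.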

/proc/$pid/maps isn't going to get you anywhere either. Yes there are plenty of addresses listed in there, but again, they are all in the virtual address space of process $pid. Some more information on these: How do I read from /proc/$pid/mem under Linux?

Jonathon Reinhart
  • You're right, if we simply try to access a single address, it might belong to another process. However, what I'm trying to access is a certain type of physical address (like 0x????0000). It would help to know which positions in the big array match that condition. – sleepsort Dec 19 '12 at 07:08
  • "I try to access is a certain type of physical address" Nope. Whatever address you try to access from a usermode program is **not** a physical address. It is a **virtual address** and you have **no idea** where that's going to map to in the physical address space. – Jonathon Reinhart Dec 19 '12 at 07:10
  • Hmm, when we have enough resources (like more than 2 cores and plenty of memory), can we assume that the OS will not page out or re-map the pages, and that the environment will be stable enough? – sleepsort Dec 19 '12 at 07:13
  • The number of cores has nothing to do with paging in/out of memory. In fact, more CPU cores might make things *more* likely to page a process out. Certainly, if you have infinite RAM then your swap partition utilization will be virtually zero. But what does this gain you? If you want to look at cache performance, you can simply time the execution of some code in a loop the first time through, and compare that to subsequent iterations. – Jonathon Reinhart Dec 19 '12 at 07:16
  • This question is becoming more and more off-topic. These are theoretical questions really, that could be answered much better by some research and reading up on how virtual memory and caches work etc. There doesn't appear to be an actual programming problem here. – Jonathon Reinhart Dec 19 '12 at 07:17
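
The timing approach suggested in the comments (time a loop the first time through, then compare with later iterations) might be sketched like this. It is only an illustration: `timed_pass` is a hypothetical helper, and `clock()` measures CPU time at a fairly coarse resolution, so the buffer has to be large enough for a pass to take a measurable amount of time.

```c
#include <stddef.h>
#include <time.h>

/* Sum every element of buf once and return the CPU time the pass took,
 * in seconds. The sum is written to *sum_out so the compiler cannot
 * discard the loop. */
static double timed_pass(const int *buf, size_t n, long *sum_out)
{
    long sum = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    clock_t t1 = clock();
    *sum_out = sum;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `timed_pass` several times on the same buffer and comparing the first result with the later ones gives a rough view of cold versus warm behaviour, provided the buffer fits in the cache being studied.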