
I am interested in memory bottlenecks and am trying to figure out when variables are loaded into the L1 data cache.

Is there a tool that can monitor the contents of the L1 cache? I want to know whether a variable has been cached or not.

For example, here is a simple function:

void fun(){
    char c1_loc='a';
    int i_loc=1;

    char c2_loc='b';
    double d_loc=2;

    char c3_loc='c';
    float f_loc=3;

    char c4_loc='d';
    short s_loc=4;
}

Memory layout of this function:
[image: memory layout of this function]

I can get the memory layout of the function with GDB, but how can I tell which variables are in the same cache line? In other words, I want to know at which address each cache line starts.

My machine has an Intel Core i7-10710U; its L1 data cache is 32 KB, with 64-byte cache lines.

Peter Cordes
Roilio
  • This is highly platform-dependent, so you should add a tag for the CPU you use. I think the answer can be found in the datasheet of that CPU. (Try to download the largest one: the small ones, ~2 MByte, are usually just electrical info sheets, while the big ones, ~13 MByte and more, usually also contain the instruction set and a programming reference.) – Spektre May 05 '22 at 07:34
  • Thanks for your advice. My CPU is Intel-Core-I7-10710U, and the L1 data cache is 32KB, cacheline is 64B. – Roilio May 05 '22 at 07:49
  • From a quick look, `10th Generation Intel® Core™ Processors Datasheet Volume 2 of 2` from [here](https://www.intel.com/content/www/us/en/products/sku/196448/intel-core-i710710u-processor-12m-cache-up-to-4-70-ghz/docs.html?s=Newest&p=2) holds some explanation of the memory controller registers (just search the text for "cache" in lowercase), so study that and test. Once you have the memory ranges in use, just check whether your variable's pointer is within the range or not. If the datasheet does not help, IIRC there was also a universal Intel datasheet for the IA64 architecture (that one was huge). – Spektre May 05 '22 at 09:33
  • However, I am afraid that accessing those registers would need privileged access. There are drivers for that, like DLLPORTIO, but you would have to find something newer, as that one was for W9x. Beware that such drivers are a huge security risk unless password-protected, so do not use them with internet / LAN on, as they really allow anything to be done. IIRC the more recent drivers were used for accessing LPT (password-protected), as MS screwed that up decades ago and has not remedied it to this day. – Spektre May 05 '22 at 09:34
  • **Cache lines are 64B wide and *naturally aligned***, so every byte with the same `addr / 64` is in the same line. So for a given invocation of a function where you can see those addresses, it's trivial: just check if the addresses are the same outside the low 6 bits. But unless your function aligns the stack by 64 (e.g. for an `alignas(64)` local that doesn't get optimized away), you don't know where cache-line boundaries are going to be in general. And of course with optimization enabled, none of those local vars would get stored to memory at all. – Peter Cordes May 05 '22 at 10:31
  • Related: [What Every Programmer Should Know About Memory?](https://stackoverflow.com/a/8126441) – Peter Cordes May 05 '22 at 10:31
  • Also, I *hope* ESP and EBP don't have odd addresses like you're showing, pointing at `0x...01`. Compilers typically keep ESP aligned by 16, or at least by 4, so a dword push never splits between two cache lines or pages. And EBP, if used as a traditional frame-pointer, will also be at least 4-byte aligned, not an odd address. – Peter Cordes May 05 '22 at 10:34
  • You are right, I made a mistake with the addresses. Thanks for your comment. – Roilio May 06 '22 at 05:37
