I'm trying to understand the meaning of the perf events dTLB-loads and dTLB-stores.

Hadi Brais
agood
  • Do you know the difference between a load vs. a store operation? e.g. `add [rdi], eax` is both a load and a store. (And each part executes separately, so both would be separate TLB references). – Peter Cordes May 16 '19 at 16:12
  • I do understand in general that load fetches from memory and store writes back to memory, but I'm confused about the meaning of a TLB-store. From what I've read TLB entry can get modified (esp. dirty bit or valid bit changed), but where does the store happen to and why? – agood May 16 '19 at 16:34

2 Answers

A TLB-store isn't a write to the TLB; it's a write to a virtual address in memory, which has to read a TLB entry to translate that address.

So a TLB-store is a TLB-reference that's done by a store operation.
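To make that concrete, here is a toy Python model of a data TLB. It is only a sketch: the `ToyTLB` class, the 4 KiB page size, the LRU policy, and the fake "page walk" are all assumptions made up for illustration, not how any real CPU or perf works. The point is just that loads and stores both *look up* the TLB, and the store never writes into it.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages


class ToyTLB:
    """Tiny fully associative LRU TLB; purely illustrative."""

    def __init__(self, entries=4):
        self.entries = entries
        self.map = OrderedDict()  # virtual page number -> physical frame number
        self.lookups = {"load": 0, "store": 0}
        self.misses = {"load": 0, "store": 0}

    def translate(self, vaddr, kind):
        """Translate vaddr for a 'load' or 'store' access and count the lookup."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        self.lookups[kind] += 1
        if vpn in self.map:
            self.map.move_to_end(vpn)  # refresh LRU position on a hit
        else:
            self.misses[kind] += 1
            if len(self.map) >= self.entries:
                self.map.popitem(last=False)  # evict the least recently used entry
            self.map[vpn] = vpn + 0x1000  # hypothetical stand-in for a page walk
        return self.map[vpn] * PAGE_SIZE + offset


def run_demo():
    tlb = ToyTLB()
    memory = {}
    addr = 0x7000_0010
    # Like `add [rdi], eax`: one load and one store to the same address,
    # each of which needs its own translation.
    paddr = tlb.translate(addr, "load")
    value = memory.get(paddr, 0)
    paddr = tlb.translate(addr, "store")
    memory[paddr] = value + 5
    return tlb


if __name__ == "__main__":
    tlb = run_demo()
    print("lookups:", tlb.lookups)
    print("misses: ", tlb.misses)
```

In this model the read-modify-write produces one load lookup and one store lookup; the store hits because the preceding load already brought the translation in.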

Peter Cordes

When virtual memory is enabled, the virtual address of every single memory access needs to be looked up in the TLB to obtain the corresponding physical address and determine access permissions and privileges (or raise an exception in case of an invalid mapping). The dTLB-loads and dTLB-stores events represent a TLB lookup for a data memory load or store access, respectively. This is the perf definition of these events, but the exact meaning depends on the microarchitecture.
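As a rough illustration of the generic (microarchitecture-independent) side of this, the sketch below computes the `config` value that the `perf_event_open` interface uses for `PERF_TYPE_HW_CACHE` events, which is `cache_id | (op_id << 8) | (result_id << 16)`. The numeric constants are copied from my reading of `<linux/perf_event.h>`; treat them as assumptions and check them against your own kernel headers. The helper name `hw_cache_config` is made up for this example.

```python
# Enum values as I understand them from <linux/perf_event.h>; verify locally.
PERF_TYPE_HW_CACHE = 3
PERF_COUNT_HW_CACHE_DTLB = 3
PERF_COUNT_HW_CACHE_OP_READ = 0
PERF_COUNT_HW_CACHE_OP_WRITE = 1
PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0
PERF_COUNT_HW_CACHE_RESULT_MISS = 1


def hw_cache_config(cache_id, op_id, result_id):
    """Pack a generic cache event as cache_id | (op_id << 8) | (result_id << 16)."""
    return cache_id | (op_id << 8) | (result_id << 16)


# The four generic dTLB events: loads/stores are "accesses", i.e. lookups.
GENERIC_DTLB_EVENTS = {
    "dTLB-loads": hw_cache_config(
        PERF_COUNT_HW_CACHE_DTLB, PERF_COUNT_HW_CACHE_OP_READ,
        PERF_COUNT_HW_CACHE_RESULT_ACCESS),
    "dTLB-load-misses": hw_cache_config(
        PERF_COUNT_HW_CACHE_DTLB, PERF_COUNT_HW_CACHE_OP_READ,
        PERF_COUNT_HW_CACHE_RESULT_MISS),
    "dTLB-stores": hw_cache_config(
        PERF_COUNT_HW_CACHE_DTLB, PERF_COUNT_HW_CACHE_OP_WRITE,
        PERF_COUNT_HW_CACHE_RESULT_ACCESS),
    "dTLB-store-misses": hw_cache_config(
        PERF_COUNT_HW_CACHE_DTLB, PERF_COUNT_HW_CACHE_OP_WRITE,
        PERF_COUNT_HW_CACHE_RESULT_MISS),
}

if __name__ == "__main__":
    for name, config in GENERIC_DTLB_EVENTS.items():
        print(f"{name}: type={PERF_TYPE_HW_CACHE} config={config:#x}")
```

The kernel then translates each generic (cache, operation, result) triple into whatever native event the running CPU provides, which is where the microarchitecture-specific meaning comes in.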

On Westmere, Skylake, Kaby Lake, Coffee Lake, Cannon Lake (and probably Ice Lake), dTLB-loads and dTLB-stores are mapped to MEM_INST_RETIRED.ALL_LOADS and MEM_INST_RETIRED.ALL_STORES, respectively. On Sandy Bridge, Ivy Bridge, Haswell, Broadwell, Goldmont, Goldmont Plus, they are mapped to MEM_UOP_RETIRED.ALL_LOADS and MEM_UOP_RETIRED.ALL_STORES, respectively. On Core2, Nehalem, Bonnell, Saltwell, they are mapped to L1D_CACHE_LD.MESI and L1D_CACHE_ST.MESI, respectively. (Note that on Bonnell and Saltwell, the official names of the events are L1D_CACHE.LD and L1D_CACHE.ST and the event codes used by perf are only documented in the Intel manual Volume 3 and not in other Intel sources on performance events.) The dTLB-loads and dTLB-stores events are not supported on Silvermont and Airmont.

On all current AMD processors, dTLB-loads is mapped to LsDcAccesses and dTLB-stores is not supported. However, LsDcAccesses counts TLB lookups for both loads and stores. On processors from other vendors, dTLB-loads and dTLB-stores are not supported.
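The mappings listed above can be collected into a small lookup table. This is only a convenience sketch that restates the two paragraphs above; the names `NATIVE_EVENTS` and `native_events` and the single `"AMD"` key are made up for illustration, and Ice Lake is left out because its mapping is only stated as probable.

```python
# (microarchitectures, (native event for dTLB-loads, native event for dTLB-stores))
# None means perf does not support that generic event on those processors.
_EVENT_TABLE = [
    (("Westmere", "Skylake", "Kaby Lake", "Coffee Lake", "Cannon Lake"),
     ("MEM_INST_RETIRED.ALL_LOADS", "MEM_INST_RETIRED.ALL_STORES")),
    (("Sandy Bridge", "Ivy Bridge", "Haswell", "Broadwell",
      "Goldmont", "Goldmont Plus"),
     ("MEM_UOP_RETIRED.ALL_LOADS", "MEM_UOP_RETIRED.ALL_STORES")),
    (("Core2", "Nehalem", "Bonnell", "Saltwell"),
     ("L1D_CACHE_LD.MESI", "L1D_CACHE_ST.MESI")),
    (("Silvermont", "Airmont"), (None, None)),
    # Hypothetical catch-all key for "all current AMD processors";
    # LsDcAccesses counts lookups for both loads and stores.
    (("AMD",), ("LsDcAccesses", None)),
]

# Flatten the table into {microarchitecture: (loads_event, stores_event)}.
NATIVE_EVENTS = {
    uarch: events for uarchs, events in _EVENT_TABLE for uarch in uarchs
}


def native_events(uarch):
    """Return (dTLB-loads event, dTLB-stores event); raises KeyError if unlisted."""
    return NATIVE_EVENTS[uarch]


if __name__ == "__main__":
    for uarch in ("Skylake", "Haswell", "Nehalem", "Airmont", "AMD"):
        print(uarch, native_events(uarch))
```

Looking at the table this way makes it obvious that three different kinds of native events (retired instructions, retired uops, and L1D cache accesses) hide behind the same two generic names.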

See Hardware cache events and perf for how to map perf core events to native events.

The dTLB-loads and dTLB-stores event counts for the same program can differ across microarchitectures, not only because the microarchitectures themselves differ but also because the events mean different things on each. Therefore, even if the microarchitectural behavior of the program turned out to be the same on those microarchitectures, the event counts can still be different. A brief description of the native events on all Intel microarchitectures can be found here and a more detailed description on some of the microarchitectures can be found here.

Related: how to interpret perf iTLB-loads,iTLB-load-misses.

Hadi Brais