I'm trying to understand the meaning of the perf events `dTLB-loads` and `dTLB-stores`.
Do you know the difference between a load vs. a store operation? e.g. `add [rdi], eax` is both a load and a store. (And each part executes separately, so both would be separate TLB references). – Peter Cordes May 16 '19 at 16:12
I do understand in general that load fetches from memory and store writes back to memory, but I'm confused about the meaning of a TLB-store. From what I've read TLB entry can get modified (esp. dirty bit or valid bit changed), but where does the store happen to and why? – agood May 16 '19 at 16:34
2 Answers
A TLB-store isn't a write to the TLB; it's a write to a virtual address in memory, which has to read a TLB entry to translate that address.
So a TLB-store is a TLB-reference that's done by a store operation.
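To make the counting concrete, here is a toy model (not a real decoder or a hardware simulation). It assumes each memory operand access performs exactly one dTLB lookup, classified by whether the access is a load or a store. The `ACCESSES` table and `count_tlb_references` are names made up for this illustration.

```python
from collections import Counter

# Toy model only: each memory access is assumed to perform one dTLB lookup,
# classified by the kind of access. The table is illustrative, not a decoder.
ACCESSES = {
    "mov eax, [rdi]": ("load",),          # reads memory
    "mov [rdi], eax": ("store",),         # writes memory
    "add [rdi], eax": ("load", "store"),  # read-modify-write: both
    "add eax, ebx": (),                   # registers only, no TLB reference
}


def count_tlb_references(instructions):
    """Count TLB references by access kind for a list of instructions."""
    counts = Counter({"dTLB-loads": 0, "dTLB-stores": 0})
    for insn in instructions:
        for kind in ACCESSES[insn]:
            counts[f"dTLB-{kind}s"] += 1
    return dict(counts)


print(count_tlb_references(["add [rdi], eax", "mov eax, [rdi]", "add eax, ebx"]))
```

In this model `add [rdi], eax` contributes one reference of each kind, matching the comment above that its load and store parts execute separately.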

When virtual memory is enabled, the virtual address of every single memory access needs to be looked up in the TLB to obtain the corresponding physical address and determine access permissions and privileges (or raise an exception in case of an invalid mapping). The `dTLB-loads` and `dTLB-stores` events represent a TLB lookup for a data memory load or store access, respectively. This is the `perf` definition of these events, but the exact meaning depends on the microarchitecture.
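As a minimal sketch of how both events might be requested together, the helper below builds a `perf stat -e` command line. The function name `perf_stat_command` and the program path `./my_program` are hypothetical; the sketch only constructs the command and does not run it, since whether the events are supported depends on the machine.

```python
import shlex


def perf_stat_command(program_argv, events=("dTLB-loads", "dTLB-stores")):
    """Build the argv for `perf stat` counting the given events for a program."""
    # `-e` takes a comma-separated event list; `--` ends perf's own options.
    return ["perf", "stat", "-e", ",".join(events), "--", *program_argv]


# ./my_program is a placeholder for whatever is being measured.
cmd = perf_stat_command(["./my_program", "arg"])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where `perf` is installed.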
On Westmere, Skylake, Kaby Lake, Coffee Lake, Cannon Lake (and probably Ice Lake), `dTLB-loads` and `dTLB-stores` are mapped to `MEM_INST_RETIRED.ALL_LOADS` and `MEM_INST_RETIRED.ALL_STORES`, respectively. On Sandy Bridge, Ivy Bridge, Haswell, Broadwell, Goldmont, Goldmont Plus, they are mapped to `MEM_UOP_RETIRED.ALL_LOADS` and `MEM_UOP_RETIRED.ALL_STORES`, respectively. On Core2, Nehalem, Bonnell, Saltwell, they are mapped to `L1D_CACHE_LD.MESI` and `L1D_CACHE_ST.MESI`, respectively. (Note that on Bonnell and Saltwell, the official names of the events are `L1D_CACHE.LD` and `L1D_CACHE.ST`, and the event codes used by `perf` are only documented in the Intel manual Volume 3 and not in other Intel sources on performance events.) The `dTLB-loads` and `dTLB-stores` events are not supported on Silvermont and Airmont.
On all current AMD processors, `dTLB-loads` is mapped to `LsDcAccesses` and `dTLB-stores` is not supported. However, `LsDcAccesses` counts TLB lookups for both loads and stores. On processors from other vendors, `dTLB-loads` and `dTLB-stores` are not supported.
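The mappings listed in this answer can be collected into a lookup table, sketched below. It only restates the text (it is not extracted from the `perf` source), `None` marks an unsupported event, Ice Lake is left out because its mapping is only described as probable, and the names `NATIVE_DTLB_EVENTS` and `native_events` are made up for this sketch.

```python
# (dTLB-loads, dTLB-stores) native event per family, as listed in this answer.
# None means the generic perf event is not supported there.
_MEM_INST = ("MEM_INST_RETIRED.ALL_LOADS", "MEM_INST_RETIRED.ALL_STORES")
_MEM_UOP = ("MEM_UOP_RETIRED.ALL_LOADS", "MEM_UOP_RETIRED.ALL_STORES")
_L1D = ("L1D_CACHE_LD.MESI", "L1D_CACHE_ST.MESI")

NATIVE_DTLB_EVENTS = {}
for name in ("Westmere", "Skylake", "Kaby Lake", "Coffee Lake", "Cannon Lake"):
    NATIVE_DTLB_EVENTS[name] = _MEM_INST
for name in ("Sandy Bridge", "Ivy Bridge", "Haswell", "Broadwell",
             "Goldmont", "Goldmont Plus"):
    NATIVE_DTLB_EVENTS[name] = _MEM_UOP
for name in ("Core2", "Nehalem", "Bonnell", "Saltwell"):
    NATIVE_DTLB_EVENTS[name] = _L1D
for name in ("Silvermont", "Airmont"):
    NATIVE_DTLB_EVENTS[name] = (None, None)
# AMD: LsDcAccesses counts lookups for both loads and stores.
NATIVE_DTLB_EVENTS["AMD"] = ("LsDcAccesses", None)


def native_events(uarch):
    """Return the native events behind dTLB-loads and dTLB-stores.

    Raises KeyError for a family that is not in the table.
    """
    loads, stores = NATIVE_DTLB_EVENTS[uarch]
    return {"dTLB-loads": loads, "dTLB-stores": stores}


print(native_events("Haswell"))
print(native_events("AMD"))
```

Keeping the pair as a tuple per family makes it easy to see at a glance that two machines may be counting different native events under the same generic name.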
See "Hardware cache events and perf" for how to map `perf` core events to native events.
The `dTLB-loads` and `dTLB-stores` event counts for the same program can differ between microarchitectures, not only because the microarchitectures themselves differ but also because the events mean different things on each. Therefore, even if the program behaved identically on two microarchitectures, the event counts could still differ. A brief description of the native events on all Intel microarchitectures can be found here and a more detailed description on some of the microarchitectures can be found here.
