According to Intel's Optimization Reference Manual, it depends on the processor. From section 7.4.3:
There are cases where a PREFETCH will not perform the data prefetch. These include:
- PREFETCH causes a DTLB (Data Translation Lookaside Buffer) miss. This applies to Pentium 4
processors with CPUID signature corresponding to family 15, model 0, 1, or 2. PREFETCH
resolves DTLB misses and fetches data on Pentium 4 processors with CPUID signature
corresponding to family 15, model 3.
- An access to the specified address that causes a fault/exception.
In other words, whether a software prefetch resolves a TLB miss depends on the processor. It will not fetch the data if the access would raise a fault or exception, which includes a page fault.
If you want to ensure you avoid TLB misses, you could do a dummy read to load the data instead of using a prefetch instruction. Unlike a prefetch, a dummy read can cause a page fault that swaps in a page, which could be either good or bad depending on your use case.
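As a rough sketch of the difference, here is what the two approaches might look like in C. It assumes GCC or Clang (for `__builtin_prefetch`). The helper names `prefetch_hint`, `touch_byte`, and `touch_pages` are made up for illustration, and the page size is something the caller has to supply:

```c
#include <assert.h>
#include <stddef.h>

/* Prefetch: only a hint. It never faults, and depending on the processor
 * it may be dropped if the address misses in the DTLB. */
static inline void prefetch_hint(const void *p)
{
    __builtin_prefetch(p, 0, 3); /* 0 = read, 3 = high temporal locality */
}

/* Dummy read: a real load. The volatile access keeps the compiler from
 * discarding it. It resolves a TLB miss and may take a page fault. */
static inline unsigned char touch_byte(const void *p)
{
    return *(const volatile unsigned char *)p;
}

/* Read one byte at every page_size stride of a buffer. page_size is an
 * assumption passed in by the caller (for example 4096). If buf is not
 * page-aligned, a trailing partial page may not be touched. */
static unsigned touch_pages(const void *buf, size_t len, size_t page_size)
{
    assert(page_size > 0);
    const unsigned char *base = buf;
    unsigned sum = 0;
    for (size_t off = 0; off < len; off += page_size)
        sum += touch_byte(base + off);
    return sum; /* returned so the loads have a visible use */
}
```

Calling `touch_pages(buf, len, 4096)` ahead of a latency-sensitive loop would force the translations (and the pages themselves) to be present, at the cost of stalling on each load. `prefetch_hint` does not stall, but gives no such guarantee.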