A previous question established that an AVX-512 masked load won't cause a page fault as long as the readmask bits are zero for every byte that falls in an unmapped page.
Does the same apply to store-forwarding failures? If a masked load comes immediately after a store, but the store only overlaps bytes of the load for which the readmask bits are zero, will we see a store-forwarding penalty?
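
To make the scenario concrete, here is a minimal sketch of the access pattern I mean. It assumes GCC or Clang on x86-64 (for the `target` attribute); the helper name `store_then_masked_load` and the 8-lane buffer layout are made up for illustration. It only reproduces the store/load pattern and does not measure anything by itself.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from any library): do a narrow 8-byte store to
 * lane 0 of buf, then immediately do a 64-byte zero-masked load of buf
 * using mask k. Returns the sum of the loaded lanes so the load cannot be
 * optimized away. buf must point to at least 8 int64_t elements. */
__attribute__((target("avx512f")))
int64_t store_then_masked_load(int64_t *buf, __mmask8 k)
{
    assert(buf != NULL);

    /* 8-byte store that overlaps only lane 0 of the following load.
     * volatile keeps the compiler from eliding or widening the store. */
    *(volatile int64_t *)buf = 1000;

    /* 64-byte masked load; lanes whose bit in k is 0 are zeroed. */
    __m512i v = _mm512_maskz_loadu_epi64(k, buf);

    return _mm512_reduce_add_epi64(v);
}
```

Calling it with `k = 0xFE` masks off the lane that was just stored to, which is the case I'm asking about; `k = 0xFF` is the ordinary case where the load really does consume the stored bytes, for comparison. The actual experiment would be timing (or counting performance events around) a loop over each variant.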
Under what circumstances does an AVX-512 masked load trigger a store-forwarding failure? Is it any different from a non-masked load of the same size?
I'm interested in Skylake-X in particular, but would appreciate any pointers!