I read that the prefetch queue of a CPU can affect the execution of a program and can lead to undesired deviations from expected behaviour (erroneous results).
Is there any method to avoid the above, apart from spacing out the instructions that affect each other and praying for the best? If not, how much spacing is required?
Are there any other features of the x86 family (such as caches, pipelines, or superscalar design) which can negatively affect a program? I am not referring to timing (as is the case with pipeline hazards) but to wrong results.
Edit: You all reply that CPU optimizations do not affect correctness, only speed. Now I am confused. For example, Wikipedia claims that this code will not execute as planned. In addition, there are anti-debugging tricks, as well as techniques for calculating the queue's length, which could be used to determine the CPU model.
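To make the behaviour I am asking about concrete, here is a toy model in Python. It is not real x86: the opcodes, the fixed 2-byte instruction size, the `run` function and the queue lengths are all made up for illustration. It assumes that a write goes to memory but never updates bytes already sitting in the prefetch queue, which is my understanding of how the effect arises on older CPUs.

```python
# Toy machine: every instruction is 2 bytes (opcode, operand).
# All opcodes and sizes here are hypothetical, not real x86 encodings.
NOP, INC, PATCH, HALT = 0, 1, 2, 3


def run(program, queue_len, mem_size=32):
    """Execute `program` with a prefetch queue of `queue_len` bytes.

    Returns the accumulator value when HALT is reached.
    """
    mem = list(program) + [NOP] * (mem_size - len(program))
    queue = []   # bytes already fetched but not yet executed
    fetch = 0    # next memory address to prefetch
    acc = 0

    while True:
        # Refill the queue from memory before each instruction.
        while len(queue) < queue_len and fetch < mem_size:
            queue.append(mem[fetch])
            fetch += 1
        if len(queue) < 2:
            return acc

        op, arg = queue.pop(0), queue.pop(0)
        if op == INC:
            acc += arg
        elif op == PATCH:
            # Assumption: the write hits memory only; a stale copy
            # in the queue is NOT updated.
            mem[arg] = NOP
        elif op == HALT:
            return acc


# Address 0: overwrite the opcode at address 6 with NOP.
# Address 6: INC 1, which the program intends to cancel.
PROGRAM = [PATCH, 6, NOP, 0, NOP, 0, INC, 1, HALT, 0]
```

In this model `run(PROGRAM, 2)` executes the patched NOP, while `run(PROGRAM, 8)` still executes the stale INC because address 6 was already in the queue when the patch was written. Varying `queue_len` until the result flips is, as far as I understand it, the idea behind the queue-length measuring tricks.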