There was a discussion at work related to hyperthreaded Xeon machines. My (superficial) understanding of how hyperthreading works is that the CPU physically multiplexes the instructions coming from two "threads". That is, the execution units are shared, but there are two separate architectural states (register sets, instruction queues, maybe even branch predictors, etc.), one for each thread. The execution units and their buffers / queues are always ready to receive new instructions / data, so from this angle there is no advantage in disabling one of the threads instead of keeping both.
My colleague was suggesting that by disabling hyperthreading we could achieve a speedup, because the CPU running the single thread would no longer have to "look" to see whether the other thread also has some work to do. My understanding is that all this circuitry is already hardwired to multiplex incoming data / instructions from both threads, and that disabling hyperthreading just shuts off one of the threads, preventing it from receiving any instructions / data, but that nothing else actually changes. Is this a good mental model of how hyperthreading works?
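For context, the mental model above is at least consistent with what the OS reports: on Linux, with hyperthreading on, each physical core shows up as two logical CPUs that list each other as siblings in sysfs. Here is a minimal sketch I use to check that (Linux-only paths; `parse_cpu_list` and `sibling_threads` are just my own helper names):

```python
# Sketch: find which logical CPUs share a physical core, via the standard
# Linux sysfs topology files. On non-Linux systems the file won't exist.
from pathlib import Path

def parse_cpu_list(s: str) -> list[int]:
    """Parse a sysfs CPU list like '0,4' or '0-3' into a list of ints."""
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        elif part:
            cpus.append(int(part))
    return cpus

def sibling_threads(cpu: int = 0) -> list[int]:
    """Logical CPUs sharing a core with `cpu`; empty list if unavailable."""
    p = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    return parse_cpu_list(p.read_text()) if p.exists() else []

if __name__ == "__main__":
    sibs = sibling_threads(0)
    if len(sibs) > 1:
        print(f"CPU 0 shares a core with: {sibs}")
    else:
        print("No sibling info (hyperthreading off, or not a Linux system)")
```

With hyperthreading disabled (e.g. in the BIOS, or via `/sys/devices/system/cpu/smt/control` on newer kernels), the sibling list collapses to just the CPU itself.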
I do understand that there are a multitude of factors at play, such as memory working sets, shared caches, etc., that may influence how well a 2-thread hyperthreaded CPU behaves vs. the same CPU with hyperthreading disabled. But my question is more about whether disabling hyperthreading somehow makes the flow of data / instructions through the pipeline itself faster. Can there be contention when trying to fill the buffers at the head of the backend, for instance?
My colleague's explanation also somehow involved hypervisors, but I fail to see the relation between the two; they seem to be orthogonal concepts.
Thanks!