
Does a hyperthreading CPU implement parallelism or just concurrency (context switching)?

My guess is no parallelism, only concurrency by context switching.

  • What's the difference? – Software Engineer Oct 27 '15 at 01:26
  • 1
    @EngineerDollery there is indeed a great difference – Swastik Padhi Oct 27 '15 at 01:57
  • Yeah, but what is it? Or, I should say, something like: actually they're exactly the same thing, but it would be interesting to know why you think they're different? – Software Engineer Oct 28 '15 at 11:07
  • @EngineerDollery Please read my answer below. You might find what you are looking for. :) – Swastik Padhi Oct 30 '15 at 03:52
  • @EngineerDollery Also, if you believe that **parallelism** and **concurrency** mean the same, do check out the links- http://stackoverflow.com/questions/1050222/concurrency-vs-parallelism-what-is-the-difference and https://wiki.haskell.org/Parallelism_vs._Concurrency – Swastik Padhi Oct 30 '15 at 03:55
  • @CrakC -- thanks for the references, but there are no authoritative references there apart from one to an old document by Sun, which I've read and don't find particularly compelling. The differences that have been discussed here are all personal opinions, not facts. The only fact we have that's generally accepted is the dictionary, which states that they are the same. There are plenty of other CS references that also state that they're the same. So, your argument is biased and opinionated, not factual. – Software Engineer Oct 30 '15 at 13:46
  • I mean that in the nicest possible way -- I'm not trying to start a fight, just an argument. – Software Engineer Oct 30 '15 at 13:47
  • @EngineerDollery yes an argument is always welcome. you see, I know about the dictionary thing and moreover, as you would have already found in the _haskell wiki_ link that not all programmers agree on the differences between parallelism and concurrency so it's pretty acceptable. :) – Swastik Padhi Oct 31 '15 at 10:20
  • @EngineerDollery . But in my first comment when I said _there is indeed a great difference_, what I really meant is that there exists a difference between **parallelism** and **concurrency by context switching** (terms taken from the OP) because even though the words **parallelism** and **concurrency** have the same dictionary meaning, concurrency by **context switching** is not the same as **parallelism**. Context switching appears to be concurrent but in reality, it's not. That's the point I am trying to drive home here. Hope you can understand it now. – Swastik Padhi Oct 31 '15 at 10:22

2 Answers


A single physical CPU core with hyper-threading appears as two logical CPUs to an operating system. The CPU is still a single CPU, so it's "cheating" a bit: while the operating system sees two CPUs for each core, the actual CPU hardware has only a single set of execution resources per core. The CPU pretends it has more cores than it does, and it uses its own logic to speed up program execution.

Hyper-threading allows the two logical CPU cores to share physical execution resources. This can speed things up somewhat: if one virtual CPU is stalled and waiting, the other virtual CPU can borrow its execution resources. Free resources can also be used to execute other tasks simultaneously. Hyper-threading can help speed your system up, but it's nowhere near as good as having additional cores.

Parallelism in its real sense (independent execution, as in a GPGPU architecture or on multiple physical cores) is not attainable on a single-core processor unless you are considering a superscalar architecture.
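The logical/physical split is visible from software. As a quick probe (standard library only; note that `os.cpu_count()` reports *logical* CPUs, so on a hyper-threaded machine it is typically twice the physical core count):

```python
import os

# os.cpu_count() counts logical CPUs -- what the OS scheduler sees.
# On a hyper-threaded machine this is typically 2x the number of
# physical cores: the OS schedules onto the extra logical CPUs even
# though each pair shares one core's execution resources.
logical = os.cpu_count()
print(f"logical CPUs visible to the OS: {logical}")
```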

From: https://en.wikipedia.org/wiki/Superscalar_processor

Superscalar processors differ from multi-core processors in that the several execution units are not entire processors. A single processor is composed of finer-grained execution units such as the ALU, integer multiplier, integer shifter, FPU, etc. There may be multiple versions of each execution unit to enable execution of many instructions in parallel. This differs from a multi-core processor that concurrently processes instructions from multiple threads, one thread per processing unit (called "core"). It also differs from a pipelined processor, where the multiple instructions can concurrently be in various stages of execution, assembly-line fashion.

From: http://www.cslab.ece.ntua.gr/courses/advcomparch/2007/material/readings/HYPERTHREADING%20TECHNOLOGY%20IN%20THE%20NETBURST%20MICROARCHITECTURE.pdf

[Figure 1: conceptual view of processors with Hyper-Threading Technology]

Hyper-Threading Technology makes a single physical processor appear to be multiple logical processors. There is one copy of the architectural state for each logical processor, and these processors share a single set of physical execution resources.

From a software or architecture perspective, this means operating systems and user programs can schedule processes or threads to logical processors as they would on conventional physical processors in a multiprocessor system. From a microarchitecture perspective, it means that instructions from logical processors will persist and execute simultaneously on shared execution resources. This can greatly improve processor resource utilization.

The Hyper-Threading Technology implementation on the NetBurst microarchitecture has two logical processors on each physical processor. Figure 1 shows a conceptual view of processors with Hyper-Threading Technology capability. Each logical processor maintains a complete set of the architectural state. The architectural state consists of registers, including general-purpose registers and those for control, the advanced programmable interrupt controller (APIC), and some for machine state. From a software perspective, duplication of the architectural state makes each physical processor appear to be two processors. Each logical processor has its own interrupt controller, or APIC, which handles just the interrupts sent to its specific logical processor.

Note: For simultaneous multithreading using a superscalar core (i.e., one that can issue more than one operation per cycle), the execution process is significantly different.

  • 5
    Simultaneous multithreading does not require one thread to stall for the other thread to be active (even fine-grained multithreading does not have this requirement). SMT requires a superscalar core (i.e., one that can issue more than one operation per cycle) since threads can execute operations *simultaneously*. FGMT allows concurrency within the pipeline (i.e., multiple threads can be active using different stages of the pipeline at the same time). (Incidentally, hyper-threading also refers to Itanium's switch-on-event-multithreading; for x86 it has only been used for SMT.) –  Oct 27 '15 at 02:37
  • @PaulA.Clayton Yes you are right but the point that I was trying to bring home is that on a single-core processor, the resources are always shared and the different threads executing on it are dependent on each other to a greater degree. – Swastik Padhi Oct 27 '15 at 02:48
  • @CrakC it's misleading to say that the core only has a "single set of execution resources". Many hardware units are in fact replicated to enable SMT, just not as many as would be necessary for multiple cores. – hayesti Oct 27 '15 at 13:51
  • @hayesti Do you have any follow-up links stating the same? – Swastik Padhi Oct 27 '15 at 13:56
  • 2
    @CrakC [Hyperthreading Technology in the Netburst Microarchitecture](http://www.cslab.ece.ntua.gr/courses/advcomparch/2007/material/readings/HYPERTHREADING%20TECHNOLOGY%20IN%20THE%20NETBURST%20MICROARCHITECTURE.pdf). Check Figure 1 and Figure 2. You need to have an architectural state replicated for each logical core in addition to several microarchitectural units. Many structures _can_ be shared between threads, but not everything (otherwise SMT would be the same as context switching). – hayesti Oct 27 '15 at 14:59
  • @hayesti But the **replication** here is **logical**, isn't it? – Swastik Padhi Oct 27 '15 at 15:04
  • @CrakC I don't understand what you mean when you say 'logical'. – hayesti Oct 27 '15 at 15:29
  • @hayesti I read the file that you shared. It says almost the same thing as I had previously posted. Please read my updated answer. Does it make sense now? I think from there you can get an idea of what I am referring to as **logical**. :) – Swastik Padhi Oct 30 '15 at 03:49
  • 1
    @CrakC I understand what you mean now. I suppose when Intel refers to "execution resources" they must actually be referring to issue queues and functional units. At times it can be a nightmare to decipher what they mean with their own terminology. – hayesti Nov 09 '15 at 08:34
  • *not attainable on a single-core processor unless you are considering a superscalar architecture.* - All SMT CPUs I'm aware of *are* superscalar. Intel certainly hasn't made any CPUs with hyperthreading that can only run one instruction per cycle. You're right that SMT doesn't increase the available raw throughput, which is why people don't use it for many number-crunching HPC workloads where more threads mean more competition for cache but well-tuned matmul can already keep the FP ALUs busy every cycle with one thread per physical core. But work does happen in parallel – Peter Cordes Feb 06 '23 at 21:35
  • Related: [Can a hyper-threaded processor core execute two threads at the exact same time?](https://stackoverflow.com/q/47446725) (yes) and [What is the difference between Hyperthreading and Multithreading? Does AMD Zen use either?](https://stackoverflow.com/q/74152562) (AMD's SMT is the same as Intel's Hyperthreading, just not using that trademark.) – Peter Cordes Feb 06 '23 at 21:41
  • Concurrency is a way of executing tasks. Its alternate is sequential execution.
  • Parallelism is a way of designing a task. Its alternate is serial.
  • Hyper-threading is a hardware-assisted execution mechanism wherein some parts of the processor (i.e., the hardware) are duplicated to allow faster execution¹. Its alternate could be any hardware from the '90s (hyper-threading first appeared in Feb '02²).

Without hyper-threading hardware, we can have concurrency provided there is indeed more than one task that can be executed concurrently. How? Take processes P1 and P2, which can safely be executed concurrently, and take a core (call it C). P1 runs on C for one time quantum, then P2 runs on C for another time quantum, then P1 runs on C for the next time quantum, and so on.

There was only one core C - There was no hyper-threading - and we had concurrent execution of P1 and P2.
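The P1/P2 interleaving above can be sketched with plain Python generators standing in for the processes and a round-robin loop standing in for the scheduler on the single core C (all names here are illustrative, and `yield` plays the role of the end of a time quantum):

```python
trace = []  # records which "process" ran in which quantum

def process(name, steps):
    """A toy process that yields the core back after each time quantum."""
    for i in range(steps):
        trace.append(f"{name}:{i}")  # do one quantum of work
        yield                         # preemption point

# Round-robin scheduler: one core, two runnable processes.
runnable = [process("P1", 3), process("P2", 3)]
while runnable:
    p = runnable.pop(0)
    try:
        next(p)             # run p for one time quantum on "core C"
        runnable.append(p)  # preempt p and put it back in the queue
    except StopIteration:
        pass                # p has finished; drop it

print(trace)  # P1 and P2 alternate, one quantum at a time
```

Only one "process" ever makes progress at any instant, yet both advance over time: concurrency without parallelism.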


Without hyper-threading hardware, we can have parallelism if there is a task that can be executed in parallel and we have more than one core to actually run it in parallel. Take the mapping part of MapReduce.

Let's say you have two text files to read from, you have started two mappers and you have two non-hyperthreaded physical cores. In this case you can (and probably will) run the mappers in parallel without any hyper-threading. Each mapper will read from its own text file, will run on its own core and will generate its own mapped output.

There were 2 cores - There was no hyper-threading - and we had parallel execution of a task.


Conclusion: Hyper-threading is a hardware improvement, and it can be cleanly separated from both parallelism and concurrency.



¹ By reducing the amount of data that must be copied in order to perform a context switch.

² It first appeared in February 2002 on Xeon server processors and in November 2002 on Pentium 4 desktop CPUs.

A good SO answer about parallelism and concurrency says that concurrency is like a juggler juggling many balls: regardless of how it looks, the juggler is only catching/throwing one ball at a time. Parallelism is multiple jugglers juggling balls simultaneously.
