
The output of lscpu gives (partial output included):

CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    1
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2

I just want to confirm that my understanding is correct:

(1) I have 12 CPU(s)/cores. This number is also the number of HARDWARE threads that I have.

(2) If (1) is true and, say, I run code that uses more than 12 SOFTWARE threads, this would lead to oversubscription. Say I use 13 software threads: would that guarantee that 1 of my software threads cannot run concurrently with the other 12?
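To make the oversubscription scenario concrete, here is a minimal Python sketch (the worker body is a placeholder, and the exact thread count is taken from the running machine rather than hard-coded to 12) that launches one more software thread than the machine has hardware threads, so the OS scheduler has to time-slice at least one core:

```python
import os
import threading

n_cpus = os.cpu_count()  # logical CPUs / hardware threads, e.g. 12 above
n_threads = n_cpus + 1   # 13: one more software thread than hardware threads

def worker(i):
    # CPU-bound work would go here; with n_cpus + 1 runnable threads on
    # n_cpus hardware threads, at least one thread must wait its turn.
    pass

threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{n_threads} software threads scheduled on {n_cpus} hardware threads")
```

All 13 threads still make progress over time; they just cannot all execute in the same instant.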

24n8
    Yes. Using 13 threads when you have 12 cores available would guarantee that not all of them can run concurrently. How could they? What would the extra thread run on concurrently if all the cores are being used? – Ken White Apr 06 '19 at 01:08
  • @KenWhite Yes, I am just trying to confirm my understanding. I've been struggling with some of the terminology such as processes, threads, cores, cpus, sockets, processors. – 24n8 Apr 06 '19 at 01:09
  • @KenWhite Also, in my example , if say, I had 2 threads per core, this would now give me 24 CPUs. In this case, I would be able to run up to 24 threads without oversubscribing, or am I limited to the number of cores (which is still 12)? – 24n8 Apr 06 '19 at 01:10
    No, 2 threads per core does not give you 24 CPUs. You still have 12 CPUs that are running 2 threads each. Running two threads on a CPU does not magically cause the number of CPUs to multiply. It gives you 24 threads running on 12 cores, where each core is context-switching between the threads it is executing. It's still one core. I have 12 cores on my current machine, and according to Windows Task Manager those 12 cores are executing 3156 threads, but I still only have 12 cores. – Ken White Apr 06 '19 at 01:19
  • @KenWhite Are we talking about 2 hardware threads per core? If so, then I am confused because in the second answer in https://unix.stackexchange.com/questions/218074/how-to-know-number-of-cores-of-a-system-in-linux, the answer states `CPUs = Threads per core X cores per socket X sockets` – 24n8 Apr 06 '19 at 01:27
    That post is referring to something specific. From the post: *CPUs are what you see **when you run htop (these do not equate to physical CPUs)**.* I have no idea what you mean when you write *hardware threads*; threads are typically implemented in software and executed on hardware (the CPU core). I have a Xeon processor in my laptop that has 6 physical cores x 2 logical processors per core, which means that I effectively have 12 cores (CPUs) on my machine. Windows says those 12 cores are executing more than 3000 threads, which clearly means they're not all running concurrently. – Ken White Apr 06 '19 at 01:34
  • @KenWhite I think I need to look into some of the fundamental definitions some more. Also, I got my definition of hardware thread from the second answer here https://stackoverflow.com/questions/5593328/software-threads-vs-hardware-threads. So in my OP's `lscpu` output, when it says `Thread(s) per core: 1`, I assume this is referring to the hardware thread. I don't quite understand what is the benefit of having more than 1 thread per core. – 24n8 Apr 06 '19 at 01:45
    You don't understand the benefit? How about the benefit seen when my OS (Windows 10 64-bit) is executing 3000+ threads (things being done) on 12 logical CPUs? What do you think would happen if it was restricted to one thread per CPU and could only do 12 things at a time? Scrolling, screen updates, multiple tabs loading at the same time in your web browser, file downloads, etc., would all be impossible. Think about running a game, and how that would function if it had to run on a single thread. Think about this site, with only one thread per CPU. How many users would it support at once? – Ken White Apr 06 '19 at 02:00
  • @KenWhite I'm really confused. I thought you had said earlier that if you had multiple hardware threads per core, the core would be context-switching between those hardware threads on that core. Doesn't this mean that those hardware threads can't be running concurrently? By hardware threads per core, I mean the `Thread(s) per core: ` output from `lscpu`. So say you have 2 hardware threads per core, `lscpu` would output `Thread(s) per core: 2` – 24n8 Apr 06 '19 at 02:19
    I never said a thing about *hardware threads*. You keep using that term. I've never heard it before. Threads are implemented in software, and execute on the CPU core(s). If you use more than one thread per core, there is context switching involved (the CPU has to save information for one thread, switch to the other (loading any context information), execute that thread, save the context, and switch back to the first thread. If you have 12 cores, they can all run concurrently (one per core). If you have 13, one core has to execute two threads as I just described. All I've said is there to see. – Ken White Apr 06 '19 at 02:22
  • @KenWhite Yes, I understand that part. I think maybe the confusion is what the output `Thread(s) per core:` means. Is this what you were referring to earlier as "logical processors per core?" This is what I was referring to as hardware threads (per core). – 24n8 Apr 06 '19 at 02:24
    Think of it this way: You have two bicycles. Each is a tandem (two-seater), so each bicycle can carry two riders at once, which means you can have 4 riders concurrently (2 per bike * 2 bikes). If a new rider shows up, one of the previous riders is going to have to take turns with them. Logical vs physical: Think of a single hard disk that can be either one drive (mount point) or split into multiple partitions (multiple logical disks on a single physical disk). With the bike example, you have four logical seats but two physical bikes. – Ken White Apr 06 '19 at 02:25
  • @KenWhite I think I got it. The 2 bicycles here are analogous to the physical cores, and the number of seats per bike is analogous to the `Thread(s) per core:` output? – 24n8 Apr 06 '19 at 02:30
  • I think you've got it. :-) – Ken White Apr 06 '19 at 02:31
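The formula cited in the comments (`CPUs = Threads per core X cores per socket X sockets`) can be checked against the lscpu output above; a small sketch with those numbers plugged in:

```python
# Values taken from the lscpu output in the question.
threads_per_core = 1
cores_per_socket = 6
sockets = 2

logical_cpus = threads_per_core * cores_per_socket * sockets
print(logical_cpus)  # 12, matching "CPU(s): 12"
```

With `Thread(s) per core: 2` on the same hardware, the same formula would report 24 logical CPUs on the same 12 physical cores.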

1 Answer


I think there is some terminology confusion. I presume you mean parallelism, because concurrency is not the same thing as parallelism: it is a broader concept that covers multitasking, where a system makes progress on multiple tasks over the same period of time. That can be achieved through true parallelism, preemptive multitasking, or cooperative multitasking, but all of these are forms of concurrency. Logically, a multi-threaded execution can be represented as a single-threaded sequence of fine-grained activities in arbitrary order, which still raises the usual concurrency problems regardless of whether we execute all of our tasks on a single core or even in a single thread. So all 13 (or more) threads on your system with 12 hardware threads will run concurrently, but at most 12 will run in parallel.

As a side note, hardware threads (logical CPUs) are not software threads; they are the number of execution flows a CPU core can sustain simultaneously by means of Hyper-Threading, so they also provide parallelism, though with some limitations. (In your case each core has a single logical CPU, so there is no hyper-threading.)
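As a quick way to see the logical-CPU count from code, Python's standard library reports it; a sketch (note that `os.sched_getaffinity` is Linux-only, hence the guard):

```python
import os

# os.cpu_count() reports logical CPUs (hardware threads), i.e.
# threads-per-core * cores-per-socket * sockets, not physical cores.
print("logical CPUs:", os.cpu_count())

# On Linux, the CPUs actually available to this process may be fewer,
# e.g. under taskset or cgroup restrictions.
if hasattr(os, "sched_getaffinity"):
    print("usable CPUs:", len(os.sched_getaffinity(0)))
```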

Dmytro Mukalov
  • I didn't know there was a difference between the 2. I think I was referring to parallelism. In the example that I gave, I guess that means 13 threads are running concurrently, and 12 running in parallel? – 24n8 Apr 07 '19 at 17:49
  • Also, do you know if a lot of people use concurrency to mean parallelism? I feel like I've seen a lot of posts using the word "concurrency," when the correct term may have been "parallelism," – 24n8 Apr 07 '19 at 17:55
  • Just saw this question on quora https://www.quora.com/What-is-the-difference-between-concurrency-and-parallelism. According to the first answer, it seems that the answer to my above comment is yes. – 24n8 Apr 07 '19 at 18:03
  • @lamanon, Parallelism is a form of concurrency, so that's why probably you've seen it mentioned in that way. As for your first comment, yes, that's what I highlighted in the answer above. – Dmytro Mukalov Apr 07 '19 at 18:07
  • Another vibe I got from reading some other posts is that concurrency is about multiple processes running concurrently. "Processes" imply distributed, and not in a shared-memory setting. So does this mean that "concurrency" is generally a term used in non-shared-memory settings? – 24n8 Apr 07 '19 at 18:12
    Whether or not the processes share memory has no influence on their concurrency. Each computation process has start and end points in time. If the processes' timelines intersect at any period of their execution, it's said that they run concurrently (even if this intersection is simulated "artificially" by time-slicing or any other technique). – Dmytro Mukalov Apr 07 '19 at 18:22
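To illustrate the last point, concurrency without parallelism, here is a sketch in which two tasks interleave on a single-threaded asyncio event loop: their execution intervals overlap in time, so they run concurrently even though nothing runs in parallel.

```python
import asyncio

async def task(name, log):
    for _ in range(3):
        log.append(name)
        await asyncio.sleep(0)  # yield, letting the other task interleave

async def main():
    log = []
    # Both tasks' timelines intersect: concurrency on one thread, one core.
    await asyncio.gather(task("A", log), task("B", log))
    return log

log = asyncio.run(main())
print(log)  # interleaved: ['A', 'B', 'A', 'B', 'A', 'B']
```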