What I'm wondering about (and the documentation I've found isn't much help in figuring it out) is what happens to a CPU core when the thread executing on it hands control to a hardware device (disk controller, network I/O, ...) to do work that the CPU core cannot help with. Does that core become available for executing other threads, or does it just stall and wait, even if other threads with CPU work to do are ready to be scheduled?
The oft-given advice of "as many threads as cores" seems to suggest the latter.
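
One way to probe this empirically is an experiment along the following lines. This is only a rough sketch under my own assumptions: the class name `BlockedThreadsDemo`, the factor of four blocked threads per core, the loop count and the 200 ms settle time are all arbitrary choices, and `ServerSocket.accept()` on a port nobody connects to merely stands in for "a thread waiting on a device".

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class BlockedThreadsDemo {

    // Written by the CPU workers so the busy loop is not optimized away.
    static volatile long sink;

    // Busy loop standing in for "CPU work" (hypothetical workload).
    static long burn(long iterations) {
        long acc = 0;
        for (long i = 0; i < iterations; i++) {
            acc += i ^ (acc << 1);
        }
        return acc;
    }

    /** Returns how many I/O-blocked threads were still blocked once the CPU work had finished. */
    static int run() throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        int blockedCount = cores * 4; // assumption: far more blocked threads than cores

        List<ServerSocket> sockets = new ArrayList<>();
        List<Thread> blocked = new ArrayList<>();
        CountDownLatch started = new CountDownLatch(blockedCount);

        for (int i = 0; i < blockedCount; i++) {
            // Ephemeral loopback port; nobody connects, so accept() blocks.
            ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
            sockets.add(server);
            Thread t = new Thread(() -> {
                started.countDown();
                try {
                    server.accept();
                } catch (IOException expected) {
                    // The socket is closed during cleanup below.
                }
            });
            t.setDaemon(true);
            t.start();
            blocked.add(t);
        }
        started.await();
        Thread.sleep(200); // crude: give every thread time to enter accept()

        // One CPU-bound worker per core, started while the others are blocked.
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < cores; i++) {
            Thread w = new Thread(() -> sink = burn(50_000_000L));
            w.start();
            workers.add(w);
        }
        for (Thread w : workers) {
            w.join();
        }

        int stillBlocked = 0;
        for (Thread t : blocked) {
            if (t.isAlive()) {
                stillBlocked++;
            }
        }
        for (ServerSocket s : sockets) {
            s.close(); // unblocks the accept() calls
        }
        return stillBlocked;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("still blocked after CPU work finished: " + run());
    }
}
```

If a thread waiting on I/O kept its core to itself, the CPU workers here could never be scheduled and the `join()` calls would hang. If `run()` returns while the waiting threads are still alive, the cores were evidently free to do other work in the meantime.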