I think a real-world example will clear up the confusion, so let’s see how things are done in Linux.
First of all, Linux doesn’t differentiate between a process and a thread. Any entity that can be scheduled is called a task and is represented by a task_struct. So whenever you execute a fork() system call, a new task_struct is created which holds the data (or pointers to the data) associated with the new task.
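A rough way to see this from user space: the sketch below forks once and checks that the child is a separate task with its own PID. It assumes a Unix-like system where os.fork() is available. fork_and_report is a hypothetical helper name, and the pipe is only there to carry the child's PID back to the parent. The task_struct itself lives in the kernel and is not visible here, only its ID is.

```python
import os


def fork_and_report():
    """Fork once and return (parent_pid, child_pid_from_fork, child_pid_reported_by_child)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: a brand-new task. Report our own PID and exit immediately.
        os.close(read_fd)
        os.write(write_fd, str(os.getpid()).encode())
        os._exit(0)
    # Parent: fork() returned the PID of the new task.
    os.close(write_fd)
    reported = int(os.read(read_fd, 64).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return os.getpid(), pid, reported


if __name__ == "__main__":
    parent, child, reported = fork_and_report()
    print("parent task:", parent, "child task:", child, "child says:", reported)
```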
So in the Linux world, a kernel thread means a task_struct object, because the scheduler only knows about these entities, and they are what gets assigned to different CPUs (logical or physical). In other words, if you want the Linux scheduler to schedule your process, you must create a task_struct.
A user thread is something that is supported and managed outside of the kernel by some execution environment (EE from now on), such as the JVM. These EEs provide you with functions to create new threads.
But why must a user thread always be mapped to a kernel thread?
Let’s say you created some threads using your EE. Eventually they must be executed by the CPU, and from the explanation above we know that a thread must have a task_struct in order to be assigned to a CPU. That is why the mapping must exist. It’s the duty of your EE to create the task_structs.
If your EE uses the many-to-one model, it will create only one task_struct for all the threads and schedule all of them onto that task_struct. Think of it as one CPU (the task_struct) and many processes (the threads created in the EE): your operating system (the EE) multiplexes these processes on that single CPU.
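To make the many-to-one idea concrete, here is a minimal sketch of a toy EE, assuming Python generators stand in for user threads. user_thread, run_many_to_one and demo_many_to_one are hypothetical names invented for this illustration. The round-robin scheduler runs entirely in user space, and threading.get_native_id() (Python 3.8+) is recorded at every step to show that only one kernel task ever does the work.

```python
import threading
from collections import deque


def user_thread(name, steps, log):
    """A toy user-level thread; each `yield` hands control back to the scheduler."""
    for i in range(steps):
        # Record which kernel task is actually executing this step.
        log.append((name, i, threading.get_native_id()))
        yield


def run_many_to_one(threads):
    """Round-robin scheduler in user space: the kernel never sees these threads."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the next yield
            ready.append(current)  # still has work: back of the queue
        except StopIteration:
            pass                   # finished: drop it


def demo_many_to_one():
    log = []
    run_many_to_one([user_thread("A", 2, log), user_thread("B", 2, log)])
    return log


if __name__ == "__main__":
    for name, step, task_id in demo_many_to_one():
        print(name, step, "ran on kernel task", task_id)
```

The obvious cost of this model is that if one user thread blocks in a system call, the single kernel task blocks, and every other user thread waits with it.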
If it uses the one-to-one model, then there will be one task_struct for every thread created in the EE. So when you create a new thread in your EE, a corresponding task_struct gets created in the kernel.
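The one-to-one case can be observed in a similar way. The sketch below assumes CPython on Linux, where each threading.Thread is backed by its own kernel task. run_one_to_one is a hypothetical helper name. Every thread reports the same PID but a different native ID from threading.get_native_id() (Python 3.8+), which on Linux is the ID of the kernel task.

```python
import os
import threading


def run_one_to_one(n):
    """Start n threads and return one (pid, kernel_task_id) pair per thread."""
    results = []
    lock = threading.Lock()
    barrier = threading.Barrier(n)

    def worker():
        # Wait until all n threads exist, so no task ID can be reused
        # by a thread that starts after another has already exited.
        barrier.wait()
        with lock:
            results.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, task_id in run_one_to_one(4):
        print("process", pid, "-> kernel task", task_id)
```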
Windows does things differently (processes and threads are distinct), but the general idea stays the same: a kernel thread is the entity that the CPU scheduler considers for assignment, hence user threads must be mapped to corresponding kernel threads (if you want the CPU to execute them).