51

What is the basic difference between pthreads and fork() on Linux, both in how they are implemented and in how they are scheduled (does scheduling differ at all)?

I ran strace on two similar programs, one using pthreads and the other using fork(). Both end up making a clone() syscall, just with different arguments, so I am guessing the two are essentially the same on a Linux system, with pthreads being easier to handle in code.

Can someone give a deep explanation?

EDIT : see also a related question

Community
srinathhs
  • I listed some useful differences here: http://stackoverflow.com/questions/3609469/what-are-the-thread-limitations-when-working-on-linux-compared-to-processes-for-n/3705919#3705919 – Matt Joiner Apr 01 '11 at 23:10

3 Answers

84

In C, there are some practical differences:

fork()

  • Purpose is to create a new process, which becomes the child process of the caller

  • Both processes will execute the next instruction following the fork() system call

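  • The child gets its own copy of the parent's address space (code, data, and stack), so parent and child each work on a separate copy. On Linux the pages are shared copy-on-write until one side modifies them.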

To think of it in terms of a person: forking creates a clone of your program (process), and the clone carries on running the code it copied.


pthread_create()

  • Purpose is to create a new thread that runs inside the calling process

  • Threads within the same process can communicate using shared memory. (Be careful!)

  • The second thread will share data, open files, signal handlers and signal dispositions, current working directory, and user and group IDs. The new thread will get its own stack, thread ID, and registers, though.

Continuing the analogy; your program (process) grows a second arm when it creates a new thread, connected to the same brain.


Performance Differences

A forked child does not share its memory with the parent: it gets its own copy-on-write copy of the address space and its own copy of the file descriptor table (the inherited descriptors still refer to the same open files). Threads within the same process share all of these directly. Sharing memory makes inter-thread communication cheaper and faster than inter-process communication, but it requires careful synchronization to prevent race conditions.

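Because threads within a process share one address space, they can be more memory efficient than forked processes, each of which has its own address space (although copy-on-write means a forked child only duplicates the pages that are actually modified).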

Gabriel Fair
15

On Linux, the clone system call creates a new task with a configurable level of sharing. fork() calls clone with the least sharing, and pthread_create() calls clone with the most sharing. Forking costs a little more than pthread_create() because it has to copy kernel tables and create copy-on-write (COW) mappings for memory.

Garrett Hyde
Davood Hanifi
7

You should look at the clone man page, clone(2).

In particular, it lists all the possible clone flags and how they affect the process/thread, the virtual memory space, etc.

You say threads are "easier to handle in code": that's very debatable. Writing bug-free, deadlock-free multithreaded code can be quite a challenge. Sometimes having two separate processes makes things much simpler.

Mat
  • So, in the end, Linux handles both pthreads and fork in the same way and schedules them in the same way? – srinathhs Apr 01 '11 at 14:45
  • Yes, in general. This does not mean that you can't have different scheduling policies, or that a specific scheduler couldn't apply different settings to thread groups vs. plain old processes. (`fork` is implemented via the `clone` syscall, by the way.) – Mat Apr 01 '11 at 14:53
  • That's why God created Erlang. – Farshid Ashouri May 08 '19 at 00:10