2

For a class I'm taking, I've been working directly with the clone() system call in Linux. I got curious about how it actually works and started digging. What confuses me is that it seems to rely on the same underpinnings as fork(): both call the same do_fork() function, just with different arguments. On one hand this makes sense to me, since a thread is really just a light-weight process, but I was always under the impression that there were significant differences between the way a thread is created and the way a process is created. I looked into the implementation of do_fork() and then copy_process() (which do_fork() calls), but I haven't been able to convince myself that I understand what's going on.

So, to the gurus out there: am I missing something, or is this actually how it works? Are there flags that tell the OS how much to copy, as well as which instruction the new task should begin executing at? (I'm thinking the answer has to be yes, but I'm not sure how they translate.)

Below is the code I'm looking at. Perhaps you could explain how the arguments that are passed in control whether a light-weight or heavy-weight process is created.

asmlinkage int sys_fork(struct pt_regs *regs)
{
#ifdef CONFIG_MMU
    return do_fork(SIGCHLD, regs->ARM_sp, regs, 0, NULL, NULL);
#else
    /* can not support in nommu mode */
    return -EINVAL;
#endif
}


asmlinkage int sys_clone(unsigned long clone_flags, unsigned long newsp,
             int __user *parent_tidptr, int tls_val,
             int __user *child_tidptr, struct pt_regs *regs)
{
    if (!newsp)
        newsp = regs->ARM_sp;

    return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
}

Thanks!

Chris Thompson
  • My earlier answer http://stackoverflow.com/questions/807506/threads-vs-processes-in-linux/809049#809049 may interest you. – ephemient Nov 12 '10 at 06:26

3 Answers

4

Actually, at the conceptual level, the Linux kernel doesn't know anything about processes or threads; it only knows about "tasks".

A Linux task can be a process, a thread or something in between. (Incidentally, this means that the strange children that vfork() creates fit perfectly well into the Linux "task" paradigm).

Now, tasks can share several things; see all the CLONE_* flags in the manpage for clone(2). (Not all of these flags can be described as sharing; some specify more complex behaviours.)

Alternatively, new tasks can choose to have their own copies of the respective resources. Since 2.6.16, they can also do so after having been started; see unshare(2).
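To make the share-or-copy choice concrete, here is a minimal sketch using the glibc clone() wrapper. The helper name fd_survives_child_close and the 64 KiB stack size are made up for illustration. The child closes a pipe descriptor; the parent then checks whether its own descriptor is still open. Without CLONE_FILES the child works on a copy of the descriptor table, so the parent's descriptor should survive; with CLONE_FILES the table is shared and the descriptor should be gone.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

/* Child entry point: close the descriptor whose number is passed in. */
static int close_fd_child(void *arg)
{
    close(*(int *)arg);
    return 0;
}

/* Hypothetical helper: returns 1 if the parent's descriptor is still open
 * after the child closed it, 0 if it is gone, and -1 on error. */
int fd_survives_child_close(int extra_flags)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    int result = -1;
    /* The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(close_fd_child, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &fds[0]);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid)
        result = (fcntl(fds[0], F_GETFD) != -1);

    if (result != 0)
        close(fds[0]);  /* still ours to close (or state unknown) */
    close(fds[1]);
    free(stack);
    return result;
}
```

Calling fd_survives_child_close(0) behaves like fork() as far as descriptors go, while fd_survives_child_close(CLONE_FILES) gives the thread-like behaviour.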

For instance, the only difference between a vfork() call and a fork() call is that vfork() has CLONE_VM and CLONE_VFORK set. CLONE_VM makes the child share its parent's memory (the same way threads share memory), while CLONE_VFORK makes the parent block until the child releases its memory mappings (by calling execve() or _exit()).
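As a rough illustration of those two flags, the sketch below asks the glibc clone() wrapper for a vfork-like child. The names marker, set_marker and vfork_like are hypothetical. Because of CLONE_VM the child's write lands in the parent's memory, and because of CLONE_VFORK the parent is not resumed until the child has exited, so the parent can read the value as soon as clone() returns, before it reaps the child.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

static volatile int marker;     /* written by the child, read by the parent */

/* Child entry point: store the requested value, then terminate. */
static int set_marker(void *arg)
{
    marker = *(int *)arg;
    return 0;  /* returning ends the child, which lets the parent continue */
}

/* Hypothetical helper: returns the value the parent sees right after
 * clone() returns, or -1 on error (so pass a value other than -1). */
int vfork_like(int value)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    marker = 0;
    /* The same flag combination that vfork() uses, per the text above. */
    pid_t pid = clone(set_marker, stack + STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int seen = marker;      /* read before reaping: the child already ran */
    waitpid(pid, NULL, 0);  /* collect the exit status */
    free(stack);
    return seen;
}
```

Dropping CLONE_VFORK from the flags would make the early read a race, and dropping CLONE_VM would leave marker at 0 in the parent.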

Note that Linux is not the only OS to generalize processes and threads in this manner. Plan 9 has rfork().

ninjalj
3

Nothing in the clone manpage suggests that it's "lightweight".

The critical difference is that fork always creates a new address space for the child, while clone optionally shares the address space between the parent and child, as well as file descriptors and so forth.

This shared address space enables lightweight IPC later on, but the process itself is not slimmer.
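A small experiment along these lines (a sketch only; the helper value_seen_by_parent and the global shared_value are invented for the example) shows the address-space difference through the glibc clone() wrapper. The child writes to a global variable. With no extra flags the child writes to its own copy, as after fork(); with CLONE_VM the write should be visible to the parent.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)    /* arbitrary size chosen for this sketch */

static volatile int shared_value; /* written by the child */

/* Child entry point: overwrite the global with the requested value. */
static int write_value(void *arg)
{
    shared_value = *(int *)arg;
    return 0;
}

/* Hypothetical helper: start a child with the given extra clone flags,
 * wait for it, and return shared_value as the parent sees it afterwards
 * (or -1 on error, so pass a value other than -1). */
int value_seen_by_parent(int extra_flags, int new_value)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_value = 0;
    /* The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(write_value, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &new_value);
    if (pid == -1 || waitpid(pid, NULL, 0) != pid) {
        free(stack);
        return -1;
    }

    free(stack);
    return shared_value;
}
```

Both variants go through the same clone() call; only the flag argument changes what the child ends up sharing.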

Ben Voigt
  • Hmm, interesting. I understand the notion of creating a new address space vs. sharing the address space, but I had always heard threads referred to as "light-weight processes" because they don't have their own address space; rather, they share that of the parent and of all other threads created in the same manner. Is that an incorrect distinction? Or is it one that isn't technically correct but is made to help people who are learning understand it? – Chris Thompson Nov 11 '10 at 19:19
  • @Chris: Linux has always had the goal of processes and task schedule switch being as light and as fast as possible. This means that a thread cannot do it any faster than a process. Some other OSes do less book-keeping for threads and so they are faster (or processes are slower, take your pick). – Zan Lynx Nov 11 '10 at 20:25
  • @Chris: NPTL gives Linux lightweight threads. `clone`-based threads aren't especially lightweight. I'm not sure whether there was a thread technology in between. – Ben Voigt Nov 15 '10 at 03:00
  • NPTL **is** `clone`-based threads. Before that was LinuxThreads, which was `fork`-based "threads". – ephemient Nov 26 '10 at 06:10
1

As I understand it, the difference between clone, fork and vfork lies in the flags, because all three end up calling do_fork() in the kernel:

fork()-->C_lib-->sys_fork()-->do_fork()

vfork()-->C_lib-->sys_vfork()-->do_fork()

clone()-->C_lib-->sys_clone()-->do_fork()

The difference between fork and vfork is that vfork suspends the parent until the child calls exec or _exit, so the child effectively runs first. vfork passes two extra flags, CLONE_VFORK and CLONE_VM. CLONE_VM asks the kernel not to duplicate the page tables. The reason is simple: the child will either exit or exec. If it exits, the copy would have been wasted, and if it execs, the page tables are replaced anyway. I hope the fork and vfork flags are clear now at the kernel level. Now let's look at the clone flags.

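The main use of clone is to implement threads, where the memory space is shared apart from the stack. In addition to the flags that fork and vfork pass, the clone() library wrapper also takes a function pointer, which the child starts executing as soon as it is created. (The raw system call has no such parameter, as sys_clone above shows; it only takes the new stack pointer, and the wrapper arranges the call to the function.)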
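Here is a minimal sketch of that function-pointer interface, assuming the glibc clone() wrapper. The names square and square_in_task are made up for the example. The child runs the function on its own stack while sharing memory, filesystem information and file descriptors with the parent, and the function's return value becomes the child's exit status (only the low 8 bits survive).

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

/* Function the new task starts in; its return value is the exit status. */
static int square(void *arg)
{
    int n = *(int *)arg;
    return n * n;
}

/* Hypothetical helper: run square(n) in a thread-like child and return
 * its exit status, or -1 on error. Results above 255 are truncated,
 * because an exit status only carries 8 bits. */
int square_in_task(int n)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Shared memory, filesystem info and descriptors, but a private stack. */
    pid_t pid = clone(square, stack + STACK_SIZE,
                      CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, &n);
    int status = 0;
    if (pid == -1 || waitpid(pid, &status, 0) != pid) {
        free(stack);
        return -1;
    }

    free(stack);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real thread library passes more flags than this (CLONE_THREAD and CLONE_SIGHAND, among others) and does not reap its threads with waitpid(); the sketch keeps SIGCHLD so that the parent can wait for the child in the ordinary way.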

Yusuf Khan