Why is forking slowing down my application?

Question

My application takes a checkpoint every few hundred milliseconds using the fork system call. However, it slows down significantly when checkpointing (forking) is enabled. I timed the fork call itself and it takes 1 to 2 ms, so why is fork slowing down my application so much? Note that I keep only one checkpoint (forked process) at a time and kill the previous checkpoint whenever I take a new one. Also, my computer has plenty of RAM.

Note that my forked process just sleeps after creation. It is only woken when a rollback needs to be done, so it should not be scheduled by the OS. One thing that comes to mind is that, since fork uses copy-on-write, page faults occur whenever my application modifies a page. But should that slow down the application significantly? Without checkpointing (forking), my application finishes in approximately 3.1 seconds; with it, it takes around 3.7 seconds. Any idea what is slowing down my application?
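
For readers following along: the asker's code is not shown, so the following is only a rough sketch of what a fork-based checkpoint scheme like the one described might look like. All names (`take_checkpoint`, `discard_checkpoint`, `rollback`, `current_checkpoint`) are hypothetical, and the choice of SIGSTOP/SIGCONT to park and wake the child is an assumption, not something stated in the question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the single checkpoint we keep, or -1 if there is none (hypothetical). */
static pid_t current_checkpoint = -1;

/* Kill and reap the current checkpoint, if any. */
void discard_checkpoint(void)
{
    if (current_checkpoint > 0) {
        kill(current_checkpoint, SIGKILL);
        waitpid(current_checkpoint, NULL, 0);
        current_checkpoint = -1;
    }
}

/*
 * Returns 0 in the live process once the checkpoint exists,
 * 1 in the restored process after a rollback, and -1 on error.
 */
int take_checkpoint(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        current_checkpoint = -1;  /* the snapshot owns no checkpoint itself */
        raise(SIGSTOP);           /* park here until someone sends SIGCONT */
        return 1;                 /* resumed: we are the rolled-back state */
    }

    /* Wait until the child has actually stopped, so that a later SIGCONT
     * cannot arrive before the child's SIGSTOP and be lost. */
    int status;
    if (waitpid(child, &status, WUNTRACED) < 0 || !WIFSTOPPED(status)) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }

    discard_checkpoint();         /* keep only one checkpoint at a time */
    current_checkpoint = child;
    return 0;
}

/* Wake the checkpoint and abandon the current state. Returns only if
 * there is no checkpoint to roll back to. */
void rollback(void)
{
    if (current_checkpoint > 0) {
        kill(current_checkpoint, SIGCONT);
        _exit(0);
    }
}
```

In this sketch the stopped child consumes no CPU, which matches the asker's expectation that it is never scheduled. One consequence of the design is that after a rollback the surviving process is a child of the one that exited, so whoever was waiting on the original PID will see it terminate.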

Forking is heavy weight and is having the OS create an entire new process. You should really use threads if you can. They are lightweight and are less intensive. — linuxuser27, Nov 29 '11 at 18:27
But my forked process just sleeps after creation. It is only awoken when rollback needs to be done. — MetallicPriest, Nov 29 '11 at 18:28
Check out this: http://stackoverflow.com/questions/3934992/fair-comparison-of-fork-vs-thread — linuxuser27, Nov 29 '11 at 18:30
After fork() every page you modify in either the child or parent will result in a page fault, which means a context switch and copying a page. So it'll directly depend on how much memory your application uses and how much of that you modify after a fork — nos, Nov 29 '11 at 18:33
@linuxuser27: threads are useless for a checkpoint mechanism without a ton of extra work, precisely because they *don't* copy-on-write the process address space (which is the thing that makes them lighter weight than `fork`). — zwol, Nov 29 '11 at 18:39
@Zack - I see. Thanks for the info. Then I guess the perf issue that MetallicPriest is experiencing is something that will just have to be accepted or is there another mechanism? — linuxuser27, Nov 29 '11 at 18:45
I am allowing my imagination to fill in a lot of details here, but what I *think* MetallicPriest is doing would take a whole lot of user space code to duplicate using any other kernel primitive I know about, and it might well end up being *slower*. — zwol, Nov 29 '11 at 18:51

score 8 · Accepted Answer · answered Nov 29 '11 at 18:37

You are probably observing the cost of the copy-on-write mechanism, as you hypothesize. That's actually quite expensive -- it is the reason vfork still exists. (The main cost is not the extra page faults themselves, but the memcpy of each page as it is touched, and the associated cache and TLB flushes.) It's not showing up as a cost of fork because the page faults don't happen inside the system call.
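
To see the effect in isolation, here is a minimal sketch (not the asker's code) that times one write per page over a buffer, first without a forked child and then again while a sleeping child shares the pages. The function names and the one-byte-per-page access pattern are illustrative assumptions; the second pass is the one in which each write should trigger a copy-on-write fault.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Write one byte to every page of buf and return the elapsed wall time. */
static double touch_pages(volatile char *buf, size_t len, size_t page)
{
    double start = now_seconds();
    for (size_t off = 0; off < len; off += page)
        buf[off]++;
    return now_seconds() - start;
}

/*
 * Time a pass over `pages` pages (pages > 0) before and after a fork().
 * Returns 0 on success, -1 on failure.
 */
int measure_cow_cost(size_t pages, double *before, double *after)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    size_t page = (size_t)page_size;
    size_t len = pages * page;
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 1, len);               /* make every page resident first */

    *before = touch_pages(buf, len, page);

    pid_t child = fork();
    if (child < 0) {
        free(buf);
        return -1;
    }
    if (child == 0) {
        for (;;)
            pause();                   /* the sleeping "checkpoint" */
    }

    /* The pages are now shared with the child, so each first write
     * in the parent has to fault and copy the page. */
    *after = touch_pages(buf, len, page);

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    free(buf);
    return 0;
}
```

Calling `measure_cow_cost` with a page count close to the application's working set and comparing the two timings gives a rough idea of how much each checkpoint interval pays for page copies. The absolute numbers depend on the machine and kernel, so treat them as indicative only.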

You can confirm the hypothesis by looking at the times reported by getrusage -- if this is correct, the extra time elapsed should be nearly all "system" time (CPU burnt inside the kernel). oprofile or perf will let you pin down the problem more specifically... if you can get them to work at all, which is nontrivial, alas.
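
A minimal sketch of that check, assuming a Linux-like system: the `cpu_usage` struct and `read_cpu_usage` helper are hypothetical names, while `getrusage`, `RUSAGE_SELF`, `ru_utime`, `ru_stime` and `ru_minflt` are the standard interface. The minor-fault counter is included because copy-on-write faults are normally counted there.

```c
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Snapshot of the calling process's CPU usage (hypothetical helper type). */
struct cpu_usage {
    double user_seconds;    /* CPU time spent in user mode */
    double system_seconds;  /* CPU time spent inside the kernel */
    long minor_faults;      /* page faults serviced without I/O */
};

static double timeval_to_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Fill *out for the calling process. Returns 0 on success, -1 on failure. */
int read_cpu_usage(struct cpu_usage *out)
{
    struct rusage ru;
    if (out == NULL || getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    out->user_seconds = timeval_to_seconds(ru.ru_utime);
    out->system_seconds = timeval_to_seconds(ru.ru_stime);
    out->minor_faults = ru.ru_minflt;
    return 0;
}
```

Take one snapshot at startup and one just before exit, then compare the differences for a run with checkpointing and a run without. If the hypothesis holds, most of the extra 0.6 seconds should appear in `system_seconds`, alongside a much larger `minor_faults` count.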

Unfortunately, copy-on-write is also the reason why your checkpoint mechanism works in the first place. Can you get away with taking checkpoints at longer intervals? That's the only quick fix I can think of.

score 3 · Answer 2 · Basile Starynkevitch · edited Nov 29 '11 at 20:06

I suggest using oprofile to find out.

oprofile is believed to be able to profile a system (and not only a single process).

You could also compare with what other checkpointing packages do, e.g. BLCR (Berkeley Lab Checkpoint/Restart).

answered Nov 29 '11 at 18:31


I don't think you should recommend profilers with a track record of being an incredible pain in the ass to persuade to do anything useful, without warning people of this. Unfortunately, that describes all the Linux kernel profilers. – zwol Nov 29 '11 at 18:40

score 2 · Answer 3 · answered Nov 29 '11 at 18:44

Forking is by nature very expensive, as you're creating a copy of the existing process as an entirely new process. If speed is important to you, you should use threads.

Additionally, you say that the forked process sleeps until a 'rollback' is needed. I'm not sure what you mean by rollback, but provided it's something that you can put in a function, you ought to just place it in a function and then create a thread that runs that function and exits when you detect the need for the rollback. As an added bonus, with that method you only create the thread if you need it.
