7

On an embedded platform (with no swap partition), I have an application whose main process occupies most of the available physical memory. The problem is that I want to launch an external shell script from my application, but using fork() requires that there be enough memory for 2x my original process before the child process (which will ultimately execl itself to something much smaller) can be created.

So is there any way to invoke a shell script from a C program without incurring the memory overhead of a fork()?

I've considered workarounds such as having a secondary smaller process which is responsible for creating shells, or having a "watcher" script which I signal by touching a file or somesuch, but I'd much rather have something simpler.

Drew Hall
kdt
  • See Greg Hewgill's answer below. This is somewhat dependent on the platform - could you please elaborate on what platform you are using (e.g. does it have an MMU?) – ConcernedOfTunbridgeWells Apr 29 '10 at 08:14
  • Isn't it a dupe of a question asked *yesterday*?? http://stackoverflow.com/questions/2731531/faster-forking-of-large-processes-on-linux – P Shved Apr 29 '10 at 08:20
  • Yes and no, @Pavel, this one's not Linux-specific and it has extra info re what's being exec'ed - a shell script. The other questioner may have had the option of rewriting their application to use threads (if the exec'ed program was an executable under their control rather than bash/ksh/other-shell) but probably not in this case. – paxdiablo Apr 29 '10 at 08:31
  • @paxdiablo, ok, so if it's not a dupe, then I can answer this question as well... but with the very same answer. How come? – P Shved Apr 29 '10 at 08:39
  • "Should I write my web server in COBOL?" and "Should I write my accounting package in assembler?" also have the same answer (a resounding "No!!") but they're very distinct questions, even though they're both of the _form_ "Should I choose an inappropriate language for developing my application?". I contend that "dupiousness" is a property of the question, not the answer. In any case, even though the answer _you_ may give is the same, that doesn't mean the set of possible answers is identical. Example: under Linux, vfork has no advantages, not so for many UNIXes. – paxdiablo Apr 29 '10 at 08:44
  • @paxdiablo, oh, that's UNIX... I'm too used to reading it as "Linux". – P Shved Apr 29 '10 at 08:55

5 Answers

8

Some UNIX implementations will give you a vfork (specified in older versions of the Single UNIX Specification), which behaves like fork except that the child shares the parent's address space instead of getting its own copy.

With vfork, there is very little you can safely do in the child before calling exec to replace the process image with another program. That's basically what vfork was built for: a minimal-copy version of fork for the fork/exec sequence.
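As a rough illustration of that vfork/exec sequence, here is a minimal sketch. The function name `run_with_vfork` is made up for this example, and it assumes your libc really provides vfork with the traditional semantics (the parent is suspended until the child calls exec or _exit). The child does nothing except exec or _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (name invented for this sketch): run argv[0] with
 * vfork()+execv() and return the child's exit code, or -1 on error. */
static int run_with_vfork(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* vfork itself failed */

    if (pid == 0) {
        /* Child: it borrows the parent's address space, so it only
         * calls execv() or _exit() and touches nothing else. */
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv failed */
    }

    /* Parent: resumes once the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

To launch a script you would pass something like `{"/bin/sh", "/path/to/script", NULL}` as argv, where the script path is whatever your application uses.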

paxdiablo
6

If your system has an MMU, then usually fork() is implemented using copy-on-write, which doesn't actually allocate more memory at the time fork() is called. Additional memory would only be allocated if you write to any of the pages shared with the parent process. An exec() would then discard those pages.

If you know you don't have an MMU, then perhaps fork() is indeed implemented using an actual copy. Another approach might be to have a helper process that is responsible for spawning subshells, which you communicate with using a pipe.

Greg Hewgill
  • I've tested this on a normal Linux 2.6 system which is configured with no swap partition. When my parent process is about 300 MB, I find that I need at least 200 MB of free memory as reported by top in order to use fork(). Any less and it returns an error. I think the COW only happens if you've got a swap partition that lets Linux believe that although it doesn't have to copy the pages until you write, it could if you did. – kdt Apr 29 '10 at 08:32
  • 3
    I think it depends on how overcommit is set up. Linux still does COW, but it counts the pages against the available VM, and will fail it if the result would exceed virtual memory (in case those pages are subsequently modified and need to be copied). Having swap just enables the kernel to consider the available VM to be bigger. – MarkR Aug 20 '10 at 10:39
0

I see you've already accepted an answer, but you may want to read about posix_spawn and use it if it's available on your target:

http://www.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html
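For reference, a minimal sketch of a posix_spawn call might look like this. The wrapper name `run_with_posix_spawn` is invented for the example. Whether posix_spawn actually avoids the cost of a full fork depends on the C library on your target (some implementations build it on vfork or a similar mechanism, others on plain fork), so treat this as something to measure rather than a guarantee.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* the current process environment */

/* Hypothetical wrapper (name invented for this sketch): spawn argv[0]
 * and return the child's exit code, or -1 on error. */
static int run_with_posix_spawn(char *const argv[])
{
    pid_t pid;
    int status;

    /* No file actions and no spawn attributes: the child inherits the
     * parent's open descriptors and signal settings. posix_spawn returns
     * an errno-style code rather than setting errno. */
    int err = posix_spawn(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

As with fork/exec, a shell script would be started by passing `{"/bin/sh", "/path/to/script", NULL}` (with your own script path) as argv.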

R.. GitHub STOP HELPING ICE
-1

It sounds as if the prudent move in this case is to port your shell script (if possible) to C and execute it within the process, so you don't have to fork at all.

Then again, I don't know what you are actually trying to do.

Williham Totland
-3

Instead of forking your process to spawn a shell, launch a shell within your process (in foreground) then fork it within the shell.

 system("/bin/ash /scripts/bgtask");

with /scripts/bgtask being:

 /bin/ash /scripts/propertask &

This way you double only the memory used by the shell, not by the main program. Your main program blocks only while the two shells are spawned (the original one that starts bgtask and the background clone it launches); after that, the memory allocated by the first shell is free again.

SF.
  • 1
    I think you may find that system is implemented as fork/exec/wait, which will give you the same problem as the question sought to solve. – paxdiablo Apr 29 '10 at 08:41
  • paxdiablo is correct, system() uses fork() internally. The man page even mentions it obliquely: "...e.g. fork() failed...": http://linux.die.net/man/3/system – kdt Apr 29 '10 at 12:20