In general, how expensive is calling an external program?

Question

I know external programs can be called, but I don't know how expensive it is compared to, say, calling a subroutine. By the cost of calling, I mean the overhead of starting the program, rather than the cost of executing the program's code itself. I know the cost probably varies greatly depending on the language and operating system used and other factors, but I would appreciate some ballpark estimates.

I am asking to gauge how plausible it is to emulate self-modifying code, in languages that don't allow it, by having processes modify other processes.

Perhaps the biggest influence on this is what OS you are using. I suggest you simply go ahead and try it, i.e. do some benchmarking. — stakx - no longer contributing, Jun 18 '15 at 06:31
An intermediate approach is to make a [remote procedure call](https://en.wikipedia.org/wiki/Remote_procedure_call) to some outside server (this gives you nearly the same isolation as starting an external program, but you don't restart the server for each RPC invocation). In some cases, you could easily write an ad-hoc RPC server yourself, for your specific needs. — Basile Starynkevitch, Jun 18 '15 at 06:44
I forgot to mention that RPC can also go to some server program on the *same* computer, and then the required [IPC](https://en.wikipedia.org/wiki/Inter-process_communication) can be really fast. Details are of course operating system specific. — Basile Starynkevitch, Jun 18 '15 at 07:00
@BasileStarynkevitch Very roughly speaking, how long do remote procedure calls take compared to normal subroutine calls? — Kelmikra, Jun 18 '15 at 10:02
It is OS specific, and often (but not always) depends on the size of the argument data. On Linux using pipes or FIFOs or `AF_UNIX` sockets for RPC, you can get several hundred megabytes per second of IPC bandwidth. So short RPC calls might take a few hundred microseconds, but you really need to benchmark. — Basile Starynkevitch, Jun 18 '15 at 10:06
Please explain **why you are asking** and what kind of "call" (i.e. processing by some external program) you actually have in mind, and on which operating system. So please **edit your question** to motivate and improve it. — Basile Starynkevitch, Jun 18 '15 at 10:08
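
For anyone who wants to follow the advice to benchmark local IPC, here is a minimal Python sketch that times a one-byte request/response over a local socket pair. It rests on assumptions that are not from the thread: the function names are made up for illustration, and the echo "server" is a thread in the same process rather than a separate server program, so the figure covers the socket path and a thread switch but not true cross-process cost. Treat the result as a rough lower bound, not a measurement of any particular RPC system.

```python
import socket
import threading
import time


def echo_server(conn):
    """Echo each byte back until the peer closes the connection."""
    with conn:
        while True:
            data = conn.recv(1)
            if not data:  # empty read means the client closed its end
                break
            conn.sendall(data)


def mean_round_trip(n=10_000):
    """Return the mean seconds per one-byte request/response round trip."""
    # socketpair() gives two connected local sockets (AF_UNIX on Linux).
    client, server = socket.socketpair()
    worker = threading.Thread(target=echo_server, args=(server,), daemon=True)
    worker.start()
    with client:
        start = time.perf_counter()
        for _ in range(n):
            client.sendall(b"x")
            reply = client.recv(1)
            assert reply == b"x"
        elapsed = time.perf_counter() - start
    worker.join()  # the server loop ends once the client socket is closed
    return elapsed / n


if __name__ == "__main__":
    print(f"mean round trip: {mean_round_trip() * 1e6:.1f} microseconds")
```

A single byte is used so that each `recv` returns a whole message; larger payloads over a stream socket would need explicit framing.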

score 3 · Accepted Answer · edited May 23 '17 at 11:51

Like I said in my comment above, perhaps it would be best if you simply tried it and did some benchmarking. I'd expect this to depend primarily on the OS you're using.

That being said, starting a new process generally is many orders of magnitude slower than calling a subroutine (I'm tempted to say something like "at least a million times slower", but I couldn't back up such a claim with any measurements).
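
If you want to put rough numbers on this for your own machine, a minimal benchmark sketch in Python might look like the following. The function names are hypothetical, and the child process here is the Python interpreter itself running an empty program, so the spawn figure includes interpreter startup and waiting for the child to exit; a small native executable would likely start faster. The numbers will vary with OS, hardware, and caching.

```python
import subprocess
import sys
import time


def noop():
    """Stand-in for an ordinary subroutine with no real work."""
    return None


def time_calls(n=1_000_000):
    """Return the mean seconds per in-process call of noop()."""
    start = time.perf_counter()
    for _ in range(n):
        noop()
    return (time.perf_counter() - start) / n


def time_spawns(n=20):
    """Return the mean seconds to start a child process and wait for it."""
    # Assumption: the current interpreter running "pass" is our external program.
    cmd = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run(cmd, check=True)
    return (time.perf_counter() - start) / n


if __name__ == "__main__":
    call = time_calls()
    spawn = time_spawns()
    print(f"subroutine call: {call * 1e9:.0f} ns")
    print(f"process spawn:   {spawn * 1e3:.2f} ms")
    print(f"ratio:           {spawn / call:,.0f}x")
```

The loop overhead is included in the per-call figure, so the subroutine cost is if anything overstated, which makes the ratio a conservative estimate.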

Possible reasons why starting a process is much slower:

- Disk I/O (the OS has to load the process image file into memory). This is going to be a big factor because I/O is many orders of magnitude slower than a simple CPU jump/call instruction.

  To give you a rough idea of the orders of magnitude involved, let me quote this 2011 blog article (which is about memory access vs HDD access, not CPU jump instruction vs HDD access):

  "Disk latency is around 13ms, but it depends on the quality and rotational speed of the hard drive. RAM latency is around 83 nanoseconds. How big is the difference? If RAM was an F-18 Hornet with a max speed of 1,190 mph (more than 1.5x the speed of sound), disk access speed is a banana slug with a top speed of 0.007 mph."

  You do the math.
- allocations of memory & other kernel data structures
- laying out the process image in memory & performing relocations
- creation of a new OS thread
- context switches

etc.
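
Doing the math on the quoted figures, taking them at face value (they are the blog's numbers for a 2011-era hard drive, not measurements made here):

$$
\frac{13\ \text{ms}}{83\ \text{ns}} = \frac{13 \times 10^{-3}\ \text{s}}{83 \times 10^{-9}\ \text{s}} \approx 1.6 \times 10^{5},
\qquad
\frac{1190\ \text{mph}}{0.007\ \text{mph}} = 1.7 \times 10^{5}
$$

So both the latency ratio and the speed analogy come out at roughly five orders of magnitude.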

Taken together, the points above mean that your OS is likely to perform lots of internal subroutine calls to start a new process, so doing just one subroutine call yourself instead of having the OS do hundreds of these is bound to be far cheaper by comparison.

Disk IO does not matter if the file already sits in the [page cache](https://en.wikipedia.org/wiki/Page_cache), and this often happens for frequently used files (or executables). — Basile Starynkevitch, Jun 18 '15 at 06:48
@BasileStarynkevitch: Of course, but perhaps you cannot rely on that. Given that portability between OSes is not an issue, and that the OS you're using gives you certain caching guarantees, then disk I/O indeed might not matter much (except on the first process load). But in all other cases I'd rather assume a worst-case scenario. — stakx - no longer contributing, Jun 18 '15 at 06:51
