What is the ideal way to emulate process replacement on Windows?

Question

So, in a feature request I filed against Node.js, I was looking for a way to replace the current Node process with another. In Linux and friends (really, any POSIX-compliant system), this is easy: use execve and friends and call it a day. But obviously, that won't work on Windows, since it only has CreateProcess (which the C runtime's versions of execve and friends delegate to, complete with async behavior). And it's not like people haven't wanted to do something similar, leading to numerous duplicate questions on this site. (This isn't a duplicate because it's explicitly seeking a workaround given certain constraints, not just asking for a direct replacement.)

Process replacement has several facets that have to be addressed:

All console I/O streams have to be forwarded to the new process.
All signals need to be transparently forwarded to the new process.
The data from the old process have to be destroyed, with as many resources reclaimed as possible.
All pre-existing threads and child processes should be destroyed.
All pre-existing handles should be destroyed apart from open file descriptors and named pipes/etc.
Optimally, the old process's memory should be kept to a minimum after the process is created.
For my particular use case, retaining the process ID is not important.

And for my particular case, there are a few constraints:

I can control the initial process's startup as well as the location of my "process replacement" function.
I could load arbitrary native code via add-ons at potentially any stack offset.
- Implication: I can't even dream of tracking malloc calls, handles, thread manipulation, or process manipulation to track and free them all, since DLL rewriting isn't exactly practical.
I have no control over when my "process replacement" is called. It could be called through an add-on, which could've been called through either interpreted code via FFI or even another add-on recursively. It could even be called during add-on initialization.
- Implication: I would have no ability to know what's in the stack, even if I perfectly instrumented my side. And rewriting all their calls and pushes is far from practical, and would just be all-around slow for obvious reasons.

So, here's the gist of what I was thinking: use something similar to a pseudo-trampoline.

Statically allocate the following:
1. A single pointer for the stack pointer.
2. MAX_PATH + 1 chars for the application path + '\0'.
3. MAX_PATH + 1 chars for the current working directory path + '\0'.
4. 32768 chars for the arguments + '\0'.
5. 32768 chars for the environment + '\0'.
On entry, set the global stack pointer reference to the stack pointer.
On "replacement":
1. Do relevant process cleanup and lock/release everything you can.
2. Set the stack pointer to the stored original global one.
3. Terminate each child thread.
4. Kill each child process.
5. Free each open handle.
6. If possible (i.e. not in a UWP program), for each heap, destroy it if it's neither the default heap nor the temporary heap (if one exists).
7. If possible, close each open handle.
8. If possible, walk the default heap and free each segment associated with it.
9. Create a new process with the statically allocated file/arguments/environment/etc. with no new window created.
10. Proxy all future received signals, exceptions, etc. without modification to this process somehow. The standard signals are easy, but not so much with the exceptions.
11. Wait for the process to end.
12. Return with the process's exit code.

The idea here is to use a process-based trampoline and drop the current process size to an absolute minimum while the newly created one is started.
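To make steps 9, 11, and 12 above concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the Node.js or Win32 implementation). The names `run_and_wait` and `replace_process` are hypothetical. It assumes the child should share the parent's console, and it makes no attempt at the thread, handle, or heap cleanup from steps 3 to 8.

```python
import os
import subprocess
import sys
from typing import Mapping, NoReturn, Optional, Sequence


def run_and_wait(argv: Sequence[str],
                 cwd: Optional[str] = None,
                 env: Optional[Mapping[str, str]] = None) -> int:
    """Start argv as a child sharing our stdio and block until it exits."""
    # Leaving stdin/stdout/stderr unset means the child inherits ours,
    # so console I/O reaches it without explicit forwarding.
    child = subprocess.Popen(list(argv), cwd=cwd, env=env)
    while True:
        try:
            return child.wait()
        except KeyboardInterrupt:
            # Assumption: the child shares our console or terminal and so
            # normally receives Ctrl+C itself; we just keep waiting.
            continue


def replace_process(argv: Sequence[str],
                    cwd: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> NoReturn:
    """Hypothetical helper: real replacement on POSIX, emulation elsewhere."""
    if os.name == "posix":
        if cwd is not None:
            os.chdir(cwd)
        # On POSIX the exec family replaces the process image in place.
        os.execvpe(argv[0], list(argv),
                   dict(env) if env is not None else os.environ)
    # Elsewhere: spawn, wait, then exit with the child's exit code
    # without running this process's own cleanup handlers.
    sys.stdout.flush()
    sys.stderr.flush()
    os._exit(run_and_wait(argv, cwd, env))
```

The old process stays alive as a thin waiter here, which is exactly the memory cost the numbered steps try to minimize. The process ID also changes, which the question states is acceptable.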

But since I'm not very familiar with Windows, I probably made quite a few mistakes here. Also, the above seems extremely inefficient, and to an extent it just feels horribly wrong for something where a kernel could just release a few memory pages, deallocate a bunch of memory handles, and move some memory around for the next process.

So, to summarize, what's the ideal way to emulate process replacement on Windows with the fewest limitations?

windows simply not support process "replacement". no such conception in windows. you can create new process. you can exit\terminate existing — RbMm, Jul 05 '18 at 10:03
This is why I said "emulate". I just need it *emulated* enough that it *mostly* appears to work that way on the surface, even though it's all just a complex, hacky shim. — Claudia, Jul 05 '18 at 10:08
i not understand sense of this - for what ? and any way - absolute unclear what you want todo. start new process ? no problem. exit current. no problem. again what is "replacement' ? for what this(what?) need ? — RbMm, Jul 05 '18 at 10:11
The "for what" is [this Node feature request](https://github.com/nodejs/node/issues/21664). And I don't need it precisely simulated - I just need it emulated well enough that people aren't having to fall back to the same workaround on Windows they've been doing on all platforms the whole time. As for "replacement", [this Wikipedia article section](https://en.wikipedia.org/wiki/Exec_(system_call)#Effects) and [this Linux manpage of `exec`](http://man7.org/linux/man-pages/man3/exec.3.html) should help. Keep in mind, on Linux/Unix, they mean "process replacement" literally. — Claudia, Jul 05 '18 at 10:17
On those platforms, `exec` and similar effectively kill the old process (as if via the unrecoverable `SIGKILL`), releasing all the memory allocated by that process, create a new process in its place, and start that process as if it were the original process itself. — Claudia, Jul 05 '18 at 10:23
windows have not "signals". if process exit/killed - all it resources automatically released/destroyed. new process "in place" old. unclear what is "in place". process not hold any "place". you need some code execute - you can do this in existing process. you can do this in new process. i am even can not understand - what you try do with existing process - from one side you want it almost destroyed, but not to the end. sense ? simply `ExitProcess` and all. — RbMm, Jul 05 '18 at 10:39
True - most "signals" on Windows are emulated, but they're so broadly emulated it's easy for me to forget. :-) — Claudia, Jul 05 '18 at 10:41
The C runtime implements the 6 signals required by standard C. It has `SIGINT` and non-standard `SIGBREAK` associated with console Ctrl+C, Ctrl+Break and close events. If the new process attaches to the same console there's nothing extra to do. If not, it's not clear to me what you're intending. `SIGFPE`, `SIGILL`, and `SIGSEGV` are implemented with a structured exception handler in response to within-process exceptions. `SIGABRT` and `SIGTERM` are implemented within the C library only, for use with `abort` and `raise`. — Eryk Sun, Jul 05 '18 at 10:59
Okay, I added exceptions to clarify I'm not purely talking about signals. — Claudia, Jul 05 '18 at 11:14
First, the only way I can see this kind of replacement working on windows is if you made a wrapper application (that does not ever get replaced) which is responsible for loading a nodejs instance compiled as a dll. — Chris Becke, Jul 05 '18 at 11:38
But really, to clearly state the "what the hell are you actually asking for" sentiment some are expressing: If node-process-a gets destroyed and replaced with node-process-b, what is actually going to notice? — Chris Becke, Jul 05 '18 at 11:41

score -7 · Answer 1 · answered Jul 05 '18 at 11:57

Given that I don't understand what is actually being requested and I certainly look at things like 'execve' with a "who the hell would ever call that anyway, nothing but madness can ever result" sentiment, I nonetheless look at this problem by asking myself:

if process-a was killed and replaced by a nearly identical process-b - who or what would notice?

Anything that held the process ID, or a handle to the process, would certainly notice. This can be handled by writing a wrapper app which loads the first node process, and when prodded, kills it and loads the next. External observers see the wrapping process's handles and ID unchanged.

Obviously this would cut off the stdin and stdout streams being fed into the node applications. But again, the wrapper process could get around this by passing the same set of inheritable handles to each node process launched by filling in the STARTUPINFO structure passed to CreateProcess properly.
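A rough sketch of such a wrapper, in Python for brevity rather than as a Win32 program, could look like the following. Everything about the "prodding" protocol is an assumption made for illustration: the `WRAPPER_REQUEST_FILE` environment variable, the JSON request file, and the `supervise` function are all hypothetical. Also, instead of the wrapper killing the child, this version has the child write its successor's command line and exit on its own.

```python
import json
import os
import subprocess
import tempfile
from typing import List, Sequence

# Hypothetical variable telling the child where to leave its request.
REQUEST_ENV = "WRAPPER_REQUEST_FILE"


def supervise(argv: Sequence[str]) -> int:
    """Run argv, then any successors it asks for; return the last exit code."""
    fd, request_path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    command: List[str] = list(argv)
    try:
        while True:
            # Clear any previous request before each launch.
            open(request_path, "w").close()
            env = dict(os.environ)
            env[REQUEST_ENV] = request_path
            # No stdio redirection: every child inherits the wrapper's
            # stdin/stdout/stderr, so outside observers see one stream.
            code = subprocess.call(command, env=env)
            with open(request_path, encoding="utf-8") as f:
                request = f.read().strip()
            if not request:
                # No successor was requested, so the chain ends here.
                return code
            next_command = json.loads(request)
            if not (isinstance(next_command, list)
                    and all(isinstance(a, str) for a in next_command)):
                raise ValueError("request must be a JSON list of strings")
            command = next_command
    finally:
        os.remove(request_path)
```

Because the wrapper itself never exits until the chain ends, its process ID and the stdio streams stay stable for anything watching from outside.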

Windows doesn't support signals, and the ones that the MS C runtime fakes all deal with internal errors except one, which deals with an interactive console window being interrupted via Ctrl+C, which the active Node.js app is sure to get anyway - or which can be passed on from the wrapper, as the node apps would not actually be running on the interactive console with this approach.

Other than that, everything else seems to be an internal detail of the Node.js application, so it shouldn't affect any 3rd party app communicating with what it thinks is a single node app via its stdin/stdout streams.
