I would like to know what is the suggested way (if any) to move a gen_server/gen_fsm from erlang node A to erlang node B preserving its internal state.
1 Answer
AFAIK there's no way to transfer a process between Erlang nodes, and I can think of many reasons to forbid it. Among others, you would be messing with node-internal memory: simply consider a process that holds data outside the internal 'State' loop parameter, for example in the process dictionary (on the process heap) or in binaries (which are garbage collected differently).
One workaround could be to give the gen_fsm/gen_server a function that spawns a new process and recreates the internal state of the server/state machine at the same time. I think it's harder to describe than to implement; you could simply use two start functions:
- one that initializes the behaviour (like I think you're doing right now)
- one that also takes a node, starts the server on that node through a remote call, and initializes its state (through the init/1 function, or explicitly by sending a message containing the state of the server)
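The two start functions could be sketched roughly as follows. This is only an illustration under assumptions: the module name `migratable`, the `start_on/2` function, the `{restore, State}` init argument and the map-based state are all made up for the example, and the module is assumed to be loaded on the target node.

```erlang
-module(migratable).
-behaviour(gen_server).

-export([start_link/0, start_on/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Ordinary start: the server builds a fresh state in init/1.
start_link() ->
    gen_server:start_link(?MODULE, fresh, []).

%% Start a copy on Node, seeded with a state captured elsewhere.
%% gen_server:start/3 is used instead of start_link/3 so the new server
%% is not linked to the short-lived process that serves the rpc call.
%% Returns {ok, Pid} on success or {badrpc, Reason} if the call fails.
start_on(Node, State) ->
    rpc:call(Node, gen_server, start, [?MODULE, {restore, State}, []]).

init(fresh) ->
    {ok, #{count => 0}};          % hypothetical initial state
init({restore, State}) ->
    {ok, State}.

handle_call(get_state, _From, State) ->
    {reply, State, State};
handle_call(bump, _From, #{count := N} = State) ->
    {reply, ok, State#{count := N + 1}}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Note that only the 'State' term travels this way; anything kept in the process dictionary or in the mailbox is not copied.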
But I must say that I see two main problems here:
- Synchronization: one needs to make sure that the sequence start server on remote node -> set remote server state -> kill current local server is atomic
- Coherence: other processes referring to local one must switch their reference to the remote one
The former could be solved in many ways (my two cents: explicit message passing between the local and remote servers, which adds overhead but is bulletproof given the Erlang runtime system). The latter could be solved with monitors/links and exit return values (the remote server pid), or more elegantly with a publish/subscribe model built on a gen_event process.
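One possible way to approach both problems, shown here only as a sketch, is to run the whole handoff inside a single `handle_call/3` clause and to let clients reach the server through a global name instead of a pid. The module name `handoff_server`, the `migrate/1` function and the global name are assumptions made for this example; it uses `global` for the reference switch, which is a different technique from the monitor or gen_event approaches mentioned above.

```erlang
-module(handoff_server).
-behaviour(gen_server).

-export([start/1, migrate/1]).
-export([init/1, handle_call/3, handle_cast/2]).

-define(NAME, handoff_server).    % hypothetical global name

%% Start the first instance under a global name, so clients never hold a pid.
start(State) ->
    gen_server:start({global, ?NAME}, ?MODULE, State, []).

%% Ask the current instance to move itself to Node.
migrate(Node) ->
    gen_server:call({global, ?NAME}, {migrate, Node}).

init(State) ->
    {ok, State}.

handle_call({migrate, Node}, _From, State) ->
    %% A gen_server handles one message at a time, so State cannot change
    %% while this clause runs: that is the "atomic" part of the handoff.
    case rpc:call(Node, gen_server, start, [?MODULE, State, []]) of
        {ok, NewPid} ->
            %% Point the global name at the new process, then stop normally.
            yes = global:re_register_name(?NAME, NewPid),
            {stop, normal, {ok, NewPid}, State};
        Other ->
            %% Remote start failed: keep serving locally.
            {reply, {error, Other}, State}
    end;
handle_call(get_state, _From, State) ->
    {reply, State, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Messages that are already queued in the old server's mailbox when it stops are not forwarded in this sketch, so callers should be prepared to retry against the global name.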
I hope you find this useful for resolving your issue. Feel free to ask if you need anything else!

-
Hi, thanks for your reply. In fact I just need to move a gen_fsm to a different node while maintaining its state. It should also delete itself from the supervisor on node A and add itself to the supervisor on node B. This gives me some trouble, since for a very short time it will not be supervised, I guess... – user601836 Mar 06 '12 at 16:47
-
What is your supervisor strategy? If it's one_for_one, you can delete and add processes in the tree without problems. As for the interval in which the process is not supervised, that is what I was referring to above as an 'atomic' operation. You could have a procedure that listens to both processes and kills the local one if and only if the migration has completed successfully, but you should take into account possible state changes during the migration (which you absolutely must guard against!). – Vincenzo Maggio Mar 06 '12 at 16:57
-
1) Yes, one_for_one. 2) The simplest solution I can think of is one in which the newly spawned gen_fsm sends a call to the original one after registering with the supervisor. The call will trigger the first instance to exit normally. – user601836 Mar 06 '12 at 17:05
-
Only if you do it synchronously, i.e. with a series of send/receive steps in the local server to ensure that all the operations (start remote server, align state, register remote server/align global name reference, exit local server) are atomic. If not, you risk a race condition, i.e. a local server state change between the start and the end of the operations. – Vincenzo Maggio Mar 06 '12 at 22:26