2

Code swap: Achieving code swapping in Erlang's gen_server

Module redefine:

iex(node2@127.0.0.1)6> Code.load_file("mesngr.ex", "./lib")
[{Mesngr,
  <<70, 79, 82, 49, 0, 0, 12, 72, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 255, 131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115, 95, 118, 49, 108, 0, 0, 0, 4, 104, 2, 100, 0, ...>>}]
iex(node2@127.0.0.1)8> Code.load_file("mesngr.ex", "./lib")
lib/mesngr.ex:1: warning: redefining module Mesngr
[{Mesngr,
  <<70, 79, 82, 49, 0, 0, 12, 72, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 255, 131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115, 95, 118, 49, 108, 0, 0, 0, 4, 104, 2, 100, 0, ...>>}]
iex(node2@127.0.0.1)9>

I have already noticed some differences: for example, GenServer's code_change callback is not called when a module is redefined (I assume because a redefine is just an overwrite, not a transition of the current version to old and the new version to current). I also notice that redefining a module like this does change the underlying code (which makes sense in an FP language).

I guess my question boils down to the following:

  1. How good/bad/ugly would it be to simply redefine modules on the fly in production, the way one does during development?
  2. What are the benefits of doing proper code swapping, outside of version management and rollbacks? Are there good tutorials/manuals/articles on the subject?
  3. How does hot code swapping work under the hood, versus module redefinition?
Marc Trudel

2 Answers

4

In Erlang, at any given point in time, at most two versions of a given module can be active.

Example:

  • load a module
  • start a gen_server using that code
  • change something and load module' - now the gen_server is still running the old code
  • start a second gen_server - it will run the new code from module'
  • change something again and load module'' - because there can be only two versions of the same code at any point in time, the first version is purged; all processes still running it are killed, so your first gen_server is killed
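
The steps above can be sketched in Elixir roughly as follows. This is a minimal illustration, not a recommended workflow: the module name `Swap.TwoVersions` and the helpers `source` and `ask` are hypothetical, a plain process stands in for the gen_server, and the purge is triggered explicitly with `:code.purge/1` instead of by loading a third version.

```elixir
# Hypothetical module; each Code.compile_string/1 call loads a new version.
source = fn vsn ->
  """
  defmodule Swap.TwoVersions do
    def version, do: #{vsn}

    # Local recursion: the process never leaves the version it started in.
    def loop do
      receive do
        {:version, from} ->
          send(from, {:version, version()})
          loop()
      end
    end
  end
  """
end

# Hypothetical helper: ask a process which version of the code it is running.
ask = fn pid ->
  send(pid, {:version, self()})

  receive do
    {:version, v} -> v
  after
    1_000 -> :timeout
  end
end

Code.compile_string(source.(1))
old_pid = spawn(Swap.TwoVersions, :loop, [])

# Loading version 2 turns version 1 into "old" code; old_pid keeps running it.
Code.compile_string(source.(2))
new_pid = spawn(Swap.TwoVersions, :loop, [])

old_reply = ask.(old_pid)
new_reply = ask.(new_pid)
in_old_code? = :erlang.check_process_code(old_pid, Swap.TwoVersions)

# Purging the old version kills any process that is still executing it.
:code.purge(Swap.TwoVersions)
old_alive? = Process.alive?(old_pid)
new_alive? = Process.alive?(new_pid)
```

Here the first process should keep answering with version 1 until the purge removes it, while the second process runs version 2 and survives.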

You can transition from running old code to running new code inside a process by calling a fully qualified function from that module. So instead of calling function(Args) you would call module:function(Args). I am not sure if the same mechanism works with Elixir though.

To do hot code upgrades manually, you must have this fully qualified function call in the first version of your module.
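
As a rough sketch of that mechanism in Elixir (the module name `Swap.Upgradable` and the helpers `source` and `ask` are made up for illustration), a receive loop that recurses through `__MODULE__.loop()` makes a fully qualified call, so it moves to the newest loaded version on its next iteration:

```elixir
# Hypothetical module; each Code.compile_string/1 call loads a new version.
source = fn vsn ->
  """
  defmodule Swap.Upgradable do
    def version, do: #{vsn}

    def loop do
      receive do
        {:version, from} ->
          send(from, {:version, version()})
          # Fully qualified call: jumps to the newest loaded version.
          __MODULE__.loop()
      end
    end
  end
  """
end

# Hypothetical helper: ask the process which version it is running.
ask = fn pid ->
  send(pid, {:version, self()})

  receive do
    {:version, v} -> v
  after
    1_000 -> :timeout
  end
end

Code.compile_string(source.(1))
pid = spawn(Swap.Upgradable, :loop, [])
Code.compile_string(source.(2))

# The process is still waiting inside version 1, so the first reply is old.
first_reply = ask.(pid)
# After replying it called __MODULE__.loop(), which entered version 2.
second_reply = ask.(pid)
still_in_old_code? = :erlang.check_process_code(pid, Swap.Upgradable)
```

The first message is still handled by the old code because the process was already waiting in it; only the qualified call afterwards switches versions.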

If you changed the state representation in the gen_server, this would probably crash the server anyway.

So to answer your questions:

  1. It would be pretty bad. Doing this manually means the upgrade only ever gets manual testing. It can lead to random processes dying, losing state and, in extreme cases, losing data.

  2. Hot code swapping with appups and relups is fully automated. You don't need "hooks" such as the fully qualified function call in your first code version, so you only have to start worrying about it once you actually need a hot code upgrade. code_change will be called properly, giving you a chance to transform the old state into the new state. If you modify more than one process, you can define a group of modules that should be frozen during the upgrade. I think a good tutorial is here: http://learnyousomeerlang.com/relups

  3. Hot code swapping just means that a process which was using the previous module definition starts using the new one. A proper appup and relup deals with code loading, swapping, changing state and purging old code, which makes the upgrade a little more testable.
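
The state transformation that code_change performs during a proper upgrade can be sketched like this. It is only an approximation under stated assumptions: `Swap.Counter` is a hypothetical module, the bare integer passed to `start_link` stands in for state created by an older version, the old version string `"1"` is arbitrary, and the `:sys` calls are issued by hand here rather than by the release handler.

```elixir
defmodule Swap.Counter do
  use GenServer

  # The argument becomes the initial state; the bare integer passed below
  # stands in for a process that was started under the old version.
  @impl true
  def init(state), do: {:ok, state}

  # The new version expects the state to be a map.
  @impl true
  def handle_call(:get, _from, %{count: count} = state) do
    {:reply, count, state}
  end

  # Convert the old state shape (integer) into the new one (map).
  @impl true
  def code_change(_old_vsn, count, _extra) when is_integer(count) do
    {:ok, %{count: count}}
  end

  def code_change(_old_vsn, state, _extra), do: {:ok, state}
end

{:ok, pid} = GenServer.start_link(Swap.Counter, 5)

# Suspend the process, run code_change/3 on its state, then resume it.
:ok = :sys.suspend(pid)
:ok = :sys.change_code(pid, Swap.Counter, "1", [])
:ok = :sys.resume(pid)

count = GenServer.call(pid, :get)
state = :sys.get_state(pid)
```

Without the `code_change` step, the `:get` call would not match the integer state and the server would crash, which is the failure mode mentioned above for changed state representations.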

Hot code swapping is not easy, and in production, when you have a distributed system, it might be better to start a completely new instance with the new code. A load balancer can push new connections to the new instance, and you just wait for the old connections to die before killing the old instance.

However, in embedded systems, where you have one instance and need to be online 24/7, hot code swapping might be the only option.

tkowal
1

A caveat to "So instead of calling function(Args) you would do Module.function(Args)." that @tkowal mentioned: in Elixir, calls to imported functions are still fully qualified calls, even if they don't look like it in the source code as written.

defmodule Foo do
  import Bar, only: [baz: 0]

  def foo do
    # compiled as the fully qualified call Bar.baz(), so it picks up a reloaded Bar
    baz()
  end
end
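
One way to check this behaviour, as a small sketch with hypothetical module names `Swap.Bar` and `Swap.Foo` (the `bar` helper is also made up): reload only the imported module and call the importing function again.

```elixir
# Hypothetical source generator for the imported module.
bar = fn value ->
  """
  defmodule Swap.Bar do
    def baz, do: #{value}
  end
  """
end

Code.compile_string(bar.(1))

Code.compile_string("""
defmodule Swap.Foo do
  import Swap.Bar, only: [baz: 0]

  # Looks local, but is compiled as the remote call Swap.Bar.baz().
  def foo, do: baz()
end
""")

before_reload = Swap.Foo.foo()

# Only Swap.Bar is redefined; Swap.Foo is left untouched.
Code.compile_string(bar.(2))
after_reload = Swap.Foo.foo()
```

`Swap.Foo` is never recompiled here, yet its result should follow the reloaded `Swap.Bar`.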
Luke Imhoff