
I doubt anyone can help with this question because of the following in Erlang's compile documentation:

Note that the format of assembler files is not documented, and may change between releases - this option is primarily for internal debugging use.

... but just in case, here goes the story's stack trace:

  • compile:file/2 with ['S'] to generate the assembly code
  • Read .S file and create a key-value data structure with key being the 'function' tuples in the .S file, and value being the function's body, i.e. the assembly instructions which implement the function.
  • Modify data structure by adding the assembly to make an external function call in certain functions.
  • crash...
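A minimal sketch of the second step above, assuming the `.S` file can be read with `file:consult/1` as a sequence of period-terminated Erlang terms (as in the listings in this question). The module name `asm_split` and its function names are hypothetical, and since the `.S` format is undocumented, the assumed layout (header terms first, then `{function,Name,Arity,Entry}` tuples each followed by their instructions) may not hold for every release:

```erlang
-module(asm_split).
-export([file/1, split/1]).

%% Read a .S file as a list of Erlang terms and group them by function.
file(Path) ->
    {ok, Terms} = file:consult(Path),
    split(Terms).

%% Returns {Header, [{FunctionTuple, Body}]}, where Header holds every
%% term that precedes the first {function,...} tuple and Body holds the
%% labels and instructions that follow each function tuple.
split(Terms) ->
    {Header, Rest} = lists:splitwith(fun is_not_function/1, Terms),
    {Header, group(Rest)}.

is_not_function({function, _, _, _}) -> false;
is_not_function(_) -> true.

group([]) ->
    [];
group([{function, _, _, _} = Key | Rest]) ->
    {Body, Tail} = lists:splitwith(fun is_not_function/1, Rest),
    [{Key, Body} | group(Tail)].
```

The result is kept as an ordered list of pairs rather than a map, so the functions can be written back out in their original order after modification.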

Unfortunately I just quickly skimmed the .S files generated when compiling a module containing the following function, with and without the first expression in the function commented out:

spawn_worker(Which) ->
    %syner:sync_pt(),
    case Which of
        ?NAIVE -> spawn(err1, naive_worker_loop, [])
    end.

When I did that, I thought that the only thing that changed was the tuple:

{call_ext,0,{extfunc,syner,sync_pt,0}}.

... so I assumed that the only thing necessary to inject a function call in the assembly was to add that tuple... but now that I got to actually injecting the tuple... I'm seeing that the assembly generated has some extra instructions:

Without syner:sync_pt():

{function, spawn_worker, 1, 4}.
{label,3}.
    {func_info,{atom,err1},{atom,spawn_worker},1}.
{label,4}.
    {test,is_eq_exact,{f,5},[{x,0},{atom,naive}]}.
    {move,{atom,naive_worker_loop},{x,1}}.
    {move,nil,{x,2}}.
    {move,{atom,err1},{x,0}}.
    {call_ext_only,3,{extfunc,erlang,spawn,3}}.
{label,5}.
    {case_end,{x,0}}.

With syner:sync_pt():

{function, spawn_worker, 1, 4}.
{label,3}.
    {func_info,{atom,err1},{atom,spawn_worker},1}.
{label,4}.
    {allocate,1,1}.
    {move,{x,0},{y,0}}.
    {call_ext,0,{extfunc,syner,sync_pt,0}}.
    {test,is_eq_exact,{f,5},[{y,0},{atom,naive}]}.
    {move,{atom,naive_worker_loop},{x,1}}.
    {move,nil,{x,2}}.
    {move,{atom,err1},{x,0}}.
    {call_ext_last,3,{extfunc,erlang,spawn,3},1}.
{label,5}.
    {case_end,{y,0}}.

I can't just conclude that adding something like:

   {allocate,1,1}.
   {move,{x,0},{y,0}}.
   {call_ext,0,{extfunc,syner,sync_pt,0}}.

to every function I want to inject an external function call into, will do the trick.

  1. because I'm not sure if that assembly code applies to all functions I want to inject into (e.g. is {allocate,1,1} always ok)
  2. because if you take a closer look at the rest of the assembly, it changes slightly as well (e.g. {call_ext_only,3,{extfunc,erlang,spawn,3}}. changes to {call_ext_last,3,{extfunc,erlang,spawn,3},1}., and the case now tests {y,0} instead of {x,0}).

So now the question is, is there any resource out there I can use to understand and manipulate the assembly generated by Erlang's compile:file/2?

I'm asking this question just in case. I doubt there is a resource for this since the documentation clearly states that there isn't, but I have nothing to lose I guess. Even if there was, it seems like manipulating the assembly code is going to be trickier than I'd like it to be. Using parse_transform/2 is definitely easier, and I did manage to get something similar to work with it... just trying out different alternatives.

Thanks for your time.

Kijewski
justin
  • (Telling from testing, not from reading sources.) `{allocate,1,1}` allocates one place on the y stack. The x stack is the "normal" stack, the registers. The y stack is an auxiliary stack. `{move,{x,0},{y,0}}` moves the topmost x to the bottommost y (at least I think they grow in opposite directions). My test file: http://pastebin.com/R21ZJ29Q. Its result: http://pastebin.com/jULjMCV0. – Kijewski Oct 29 '11 at 06:21
  • Thanks for the help kay. It looks like figuring the assembly out will take me too much time though, so I'm not planning on decrypting it. Cheers. – justin Oct 29 '11 at 14:25

3 Answers


I'm not sure what you are trying to achieve with this, but Core Erlang might be a better level for code manipulation. It is documented (well, one version of it anyhow, but that is still better than nothing) here (and there is more; just google for core erlang):

To compile to and from core erlang use the 'to_core' and 'from_core' (unfortunately undocumented) options:

c(my_module, to_core).   %% this generates my_module.core
c(my_module, from_core). %% this loads my_module.core
TamasNagy
  • I think that core erlang is what is used by lfe (https://github.com/rvirding/lfe), you could find some useful info there – Lukas Oct 29 '11 at 12:57
  • hmm ye I quickly read that paper but didn't try using core Erlang yet. I guess I'll look into it a bit more. Thanks for the pointer TamasNagy. – justin Oct 29 '11 at 14:38
  • lfe uses core erlang huh. Cool, all the more reason to check it out. Thanks Lukas. – justin Oct 29 '11 at 14:39

The only tool I know of that uses the BEAM asm is HiPE. There are a lot of code examples and such in https://github.com/erlang/otp/tree/master/lib/hipe/icode, though I would not recommend doing much with this format, as it is changing all the time.

Lukas

It seemed to me that what you're trying to do could easily be addressed by manipulating the abstract code. Just write a parse_transform module within which you could insert the function call into the functions in question.

I wrote a ?funclog based on the Erlang decorators; see https://github.com/nicoster/erl-decorator-pt#funclog
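For illustration, a minimal sketch of such a parse transform, not taken from the linked project. It prepends a call to `syner:sync_pt()` (the function from the question) to every clause of selected functions. The module name `sync_pt_transform` and the `TARGETS` list are hypothetical and would need adapting:

```erlang
-module(sync_pt_transform).
-export([parse_transform/2]).

%% Hypothetical list of {Name, Arity} pairs to instrument.
-define(TARGETS, [{spawn_worker, 1}]).

%% Called by the compiler with the module's abstract forms.
parse_transform(Forms, _Options) ->
    [inject(Form) || Form <- Forms].

%% Rewrite only the targeted function definitions; pass everything else through.
inject({function, Anno, Name, Arity, Clauses} = Form) ->
    case lists:member({Name, Arity}, ?TARGETS) of
        true ->
            {function, Anno, Name, Arity,
             [inject_clause(C) || C <- Clauses]};
        false ->
            Form
    end;
inject(Form) ->
    Form.

%% Prepend the abstract form of syner:sync_pt() to the clause body.
inject_clause({clause, Anno, Args, Guards, Body}) ->
    Call = {call, Anno,
            {remote, Anno, {atom, Anno, syner}, {atom, Anno, sync_pt}},
            []},
    {clause, Anno, Args, Guards, [Call | Body]}.
```

To apply it, the transform has to be compiled first and then enabled in the target module with `-compile({parse_transform, sync_pt_transform}).`. Because the call is inserted before code generation, the compiler takes care of the stack frame and tail-call instructions that make hand-editing the assembly awkward.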

Nick X