Say, I have two adjacent functions subfunc() and main() in the Mach-O executable and want to disassemble all instructions from subfunc() to main()+0x10.

I know I can cast functions to addresses using `(void(*)())subfunc` - isn't there an easier way?

My attempt is as follows, but I get the error message below:

dis -s `(void(*)())subfunc` -e `(void(*)())main+0x10`

error: error: arithmetic on a pointer to the function type 'void ()'

How can I fix this?

Shuzheng
  • If you have debug information, so lldb knows that these are functions, there is a special case where you don't use backticks and it does the right thing -- `dis -s subfunc -e main+16` -- but I just checked and it doesn't look like that works when lldb doesn't know that these are function symbols right now. Obviously there aren't any valid C expressions for adding an offset to a function name, so there's some extra trickery in lldb to allow for this (commonly useful) address expression. I can't remember why it works with an argument that takes an address expression and not a backtick eval r.n. – Jason Molenda Oct 23 '19 at 22:21
  • @JasonMolenda - Thanks. Aren't an address expression and a backtick eval the same thing? Can you point me to the documentation for these things? – Shuzheng Oct 24 '19 at 06:42
  • main+5 is not a valid C expression, yet it is a very convenient way to specify addresses. Instead of modifying the expression parser to accept this form (we want to keep the expression parser accurate to the source language), we added "address expressions". They try first to evaluate their argument as an expression and, if that fails, they will parse simple expressions of the form "symbolname+-offset". "help address-expression" is where this should be documented, but that help string is a bit terse... – Jim Ingham Oct 24 '19 at 17:29
  • @JimIngham, is there no elaborate documentation on address expressions or backquote expressions anywhere to be found? Why doesn't `main+5` work, if `+-` works? Isn't `main` a symbol? – Shuzheng Oct 24 '19 at 17:57
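
The fallback described in the comments (parse "symbolname+-offset" when the argument is not a valid expression) can be pictured with a small C sketch. This is not lldb's implementation: the symbol table, its addresses and the name `resolve_symbol_offset` are all hypothetical, and only the fallback step is shown.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol table; the addresses are made up for illustration. */
struct symbol {
    const char *name;
    uint64_t address;
};

static const struct symbol symbols[] = {
    {"subfunc", 0x100000f00ULL},
    {"main", 0x100000f40ULL},
};

/*
 * Resolve text of the form "name", "name+offset" or "name-offset".
 * The offset may be decimal or 0x-prefixed hex.
 * Returns 1 and stores the address in *out on success, 0 otherwise.
 */
int resolve_symbol_offset(const char *text, uint64_t *out) {
    size_t name_len = strcspn(text, "+-"); /* length of the symbol name */
    if (name_len == 0) {
        return 0;
    }
    for (size_t i = 0; i < sizeof symbols / sizeof symbols[0]; i++) {
        if (strlen(symbols[i].name) != name_len ||
            strncmp(symbols[i].name, text, name_len) != 0) {
            continue;
        }
        const char *rest = text + name_len; /* "", "+offset" or "-offset" */
        if (*rest == '\0') {
            *out = symbols[i].address;
            return 1;
        }
        char *end;
        uint64_t offset = strtoull(rest + 1, &end, 0);
        if (end == rest + 1 || *end != '\0') {
            return 0; /* missing or malformed offset */
        }
        *out = (*rest == '-') ? symbols[i].address - offset
                              : symbols[i].address + offset;
        return 1;
    }
    return 0; /* unknown symbol */
}
```

With this made-up table, "main+16" and "main+0x10" both resolve to 0x100000f50, while an unknown name is rejected, which is roughly the situation the comments describe when lldb does not know the symbol is a function.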

1 Answer

This appears to be the correct syntax:

dis --start-address `(void(*)())main` --end-address `(void(*)())main`+0x10

The very small difference between this syntax and the variant you tried is that the +0x10 offset goes outside the backtick characters, i.e. the offset goes after the closing backtick.

FWIW this variant also appears to work correctly:

dis --start-address `(void(*)())main` --end-address 0x10+`(void(*)())main`

Discovery process:

  • I was unfamiliar with the "backtick" + function cast that you described in your original question so that was a very helpful starting point.

    In my case I was trying to set a breakpoint at a function offset inside a shared library and got about as far as this before my search landed me on your question:

    breakpoint set --shlib libexample.dylib --address `((void*)some_function)+81`
    
    error: error: function 'some_function' with unknown type must be given a function type
    error: 1 errors parsing expression
    
  • The use of your function cast hint met the "function type" requirement stated in the error message so I was next able to get to:

    print (void(*)())some_function
    
    (void (*)()) $38 = 0x00000001230094d0 (libexample.dylib`some_function)
    
  • I then tried the backtick variant which appeared to work but I wanted the value to be displayed in hexadecimal:

    print `(void(*)())some_function`
    (long) $2 = 4882207952
    
  • But when I tried to use the -f hex format option with print I got an error:

    print -f hex `(void(*)())some_function`
    
    error: use of undeclared identifier 'f'
    error: 1 errors parsing expression
    
  • Eventually I noticed the note that 'print' is an abbreviation for 'expression --' at the bottom of the help print output, and realised that this means it's (apparently?) not possible to pass a display format to print: print -f hex ... gets converted into expression -- -f hex ..., where everything after the -- is treated as the expression, so -f hex is parsed as C code rather than as an option.

  • Eventually I figured out the required placement & combination of command name, display format and "--" to make it display as desired:

    expression -f hex -- `(void(*)())some_function`
    
    (long) $7 = 0x00000001230094d0
    
  • For no particular reason (that I can remember) it was at this point I tried placing the offset outside the backticks and it worked!

    expression -f hex -- `(void(*)())some_function`+81
    
    (long) $12 = 0x0000000123009521
    
  • And it still worked when I tried it with a breakpoint:

    breakpoint set --shlib libexample.dylib --address `(void(*)())some_function`+81
    
    Breakpoint 6: where = libexample.dylib`some_function + 81, address = 0x0000000123009521
    
  • Then I verified that it also worked with the dis command from your original question:

    dis --start-address `(void(*)())some_function` --end-address `(void(*)())some_function`+81
    
  • And confirmed that the bare function name was not sufficient:

    dis --start-address some_function --end-address `(void(*)())some_function`+81
    
    error: address expression "some_function" evaluation failed
    
  • I also re-confirmed that the offset being between the backticks did not work:

    dis --start-address `(void(*)())some_function` --end-address `(void(*)())some_function+1`
    
    error: error: arithmetic on a pointer to the function type 'void ()'
    error: 1 errors parsing expression
    
  • It was at this point that I realised I was able to parse the error message (as it was presumably intended):

    [arithmetic on a pointer]  [to the function type]  ['void ()']
    

    The underlying issue being "arithmetic on a pointer"...

  • Which further research shows is both "undefined on pointers to function types" and available as a gcc extension.

  • Which brings us back to the comments by @JasonMolenda & @JimIngham and how the function pointer arithmetic parsing is special-cased.

    To my mind the "error: arithmetic on a pointer to the function type..." message you received is at best poor UX & at worst a bug--given that lldb itself essentially displays address references in that manner:

    0x1230094f9:  jle    0x123009cc2               ; some_function + 2034
    

    I feel similarly about libexample.dylib`some_function + 81 being displayed but AFAICT not being parsed.

  • In conclusion, this form works:

    `(void(*)())some_function`+0x10
    

    Now I just need to figure out why some_function isn't doing what I think it should... :)
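
To make the "arithmetic on a pointer" point concrete, here is a minimal C sketch of the same distinction outside lldb. It assumes (as the `(long)` results above suggest) that the backtick part is reduced to a plain number before the offset is added; `some_function` here is just a stand-in and `address_with_offset` is a hypothetical helper, not anything lldb provides.

```c
#include <stdint.h>

/* Stand-in for the function being inspected. */
void some_function(void) {}

/*
 * Convert the function pointer to an integer first, then add the offset.
 * This corresponds to writing the offset outside the backticks: the cast
 * yields a number, and the addition is ordinary integer arithmetic.
 */
uintptr_t address_with_offset(void (*fn)(void), uintptr_t offset) {
    return (uintptr_t)fn + offset;
}

/*
 * The form that corresponds to putting the offset inside the backticks
 * would be:
 *
 *     void (*bad)(void) = some_function + 81;
 *
 * Standard C does not define arithmetic on function pointers. GCC accepts
 * it as an extension (treating the size of a function as 1), which is why
 * it is left commented out here.
 */
```

Calling `address_with_offset(some_function, 0x10)` gives the address 0x10 bytes past the start of the function, which is the same value the working lldb form computes.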

follower