1

I am trying to figure out how to only get the source code of the body of the function.

Let's say I have:

def simple_function(b = 5):
    a = 5
    print("here")
    return a + b

I would want to get (up to indentation):

""" 
    a = 5
    print("here")
    return a + b
"""

While it's easy in the case above, I want it to be agnostic of decorators/function headers, etc. However, still include inline comments. So for example:

@decorator1
@decorator2
def simple_function(b: int = 5):
    """ Very sophisticated docs
    """
    a = 5
    # Comment on top
    print("here")  # And in line
    return a + b

Would result in:

""" 
    a = 5
    # Comment on top
    print("here")  # And in line
    return a + b
"""

I was not able to find any utility and have been trying to play with inspect.getsourcelines for few hours now, but with no luck.

Any help appreciated!

Why is it different from How can I get the source code of a Python function?

This question asks for a whole function source code, which includes both decorators, docs, def, and body itself. I'm interested in only the body of the function.

esqew
  • 42,425
  • 27
  • 92
  • 132
balbok
  • 394
  • 3
  • 14
  • 3
    Decorators make this basically untenable. When a decorator is applied to a function, it completely *replaces* the function in an arbitrary way. There's no general-purpose mechanism to get the original function back and, indeed, it's possible to write a decorator that completely discards and replaces the original function, without storing it anywhere at all. – Silvio Mayolo Jun 27 '21 at 01:29
  • 6
    For my own curiosity, can you elaborate on what gives rise to this requirement? What do you hope to achieve with this string? – esqew Jun 27 '21 at 01:31
  • @enzo No, since getsource still contains def, decorators and any doc comments. – balbok Jun 27 '21 at 01:33
  • 1
    @balbok Just do some string manipulation to remove all the lines until the first `def` appears. – enzo Jun 27 '21 at 01:35
  • @esqew I am trying to build a utility that given problem solution can generate a problem statement. However, to do so, I need to substitute the solution code with the problem statement. I know there are easier ways of going about them, but all require keeping at least 2 copies of a header in sync. – balbok Jun 27 '21 at 01:35
  • @enzo I think I will need to :/ It's the solution I was trying to avoid. – balbok Jun 27 '21 at 01:36
  • I think you could do something like this without keeping two copies of something, but to give a better answer you may need to rewrite this question (or open a new one) with a more comprehensive statement of the original task, since it sounds like you might have an XY Problem. – Iguananaut Jun 27 '21 at 01:43
  • @balbok: Can you explain why you need to do this at all, and also why without access to the source code files? It's very unusual to want inline comments, and it seems pretty obvious you need access to the actual source code file. (what do you think the interpreter does with comments other than discrad them?) – smci Jun 27 '21 at 03:50
  • The idea is as follows: to have a repo containing solutions for a class. Then, when a homework is due to be posted, we are using invoke tasks to copy the files, but in assigned functions remove the code. This way we have one codebase for both solutions and problem sets. Additionally, we can provide students with unit tests for validation, but also have hidden tests that compare their code with solution. While it may seem complicated, everything expect for stripping the code from function bodies is already setup and working. That's the last roadblock remaining. – balbok Jun 27 '21 at 04:15

1 Answers1

1

I wrote a simple regex that does the trick. I tried this script with classes and without. It seemed to work fine either way. It just opens whatever file you designate in the Main call, at the bottom, rewrites the entire document with all function/method bodies doc-stringed and then save it as whatever you designated as the second argument in the Main call.

It's not beautiful, and it could probably have more efficient regex statements. It works though. The regex finds everything from a decorator (if one) to the end of a function/method, grouping tabs and the function/method body. It then uses those groups in finditer to construct a docstring and place it before the entire chunk it found.

import re

FUNC_BODY   = re.compile(r'^((([ \t]+)?@.+\n)+)?(?P<tabs>[\t ]+)?def([^\n]+)\n(?P<body>(^([\t ]+)?([^\n]+)\n)+)', re.M)
BLANK_LINES = re.compile(r'^[ \t]+$', re.M)

class Main(object):
    def __init__(self, file_in:str, file_out:str) -> None:
        #prime in/out strings
        in_txt  = ''
        out_txt = ''
        
        #open resuested file
        with open(file_in, 'r') as f:
            in_txt = f.read()
        
        #remove all lines that just have space characters on them
        #this stops FUNC_BODY from finding the entire file in one shot
        in_txt = BLANK_LINES.sub('', in_txt)

        last   = 0 #to keep track of where we are in the file

        #process all matches
        for m in FUNC_BODY.finditer(in_txt):
            s, e = m.span()
            #make sure we catch anything that was between our last match and this one
            out_txt = f"{out_txt}{in_txt[last:s]}"
            last    = e
            tabs    = m.group('tabs') if not m.group('tabs') is None else ''
            #construct the docstring and inject it before the found function/method
            out_txt = f"{out_txt}{tabs}'''\n{m.group('body')}{tabs}'''\n{m.group()}"
            
        #save as requested file name    
        with open(file_out, 'w') as f:
            f.write(out_txt)
            
            
if __name__ == '__main__':
    Main('test.py', 'test_docd.py')

EDIT:

Apparently, I "missed the entire point" so I wrote it again a different way. Now you can get the body while the code is running and decorators don't matter, at all. I left my other answer here because it is also a solution, just not a "real time" one.

import re, inspect

FUNC_BODY   = re.compile('^(?P<tabs>[\t ]+)?def (?P<name>[a-zA-Z0-9_]+)([^\n]+)\n(?P<body>(^([\t ]+)?([^\n]+)\n)+)', re.M)

class Source(object):
    @staticmethod
    def investigate(focus:object, strfocus:str) -> str:
        with open(inspect.getsourcefile(focus), 'r') as f:
            for m in FUNC_BODY.finditer(f.read()):
                if m.group('name') == strfocus:
                    tabs = m.group('tabs') if not m.group('tabs') is None else ''
                    return f"{tabs}'''\n{m.group('body')}{tabs}'''"
            

def decorator(func):                                                                                            
    def inner():                                                                                            
        print("I'm decorated")                                                                                     
        func()                                                                                              
    return inner  

@decorator 
def test():
    a = 5
    b = 6
    return a+b

print(Source.investigate(test, 'test'))   
OneMadGypsy
  • 4,640
  • 3
  • 10
  • 26
  • 1
    This answer completely misses the point, of course. He wanted to get the source code of a function while it is executing, without reading the source code. I'm not going to down vote the answer because it does literally solve the problem, but it's not the solution the asker wanted. – Tim Roberts Jun 27 '21 at 03:33
  • @TimRoberts ~ My answer could still be used for that, with very little manipulation. Just feed it the data from `getsourcelines` and comment out the file open part. It's still the right answer, just formatted incorrectly because I misunderstood the depth of the question. – OneMadGypsy Jun 27 '21 at 03:44
  • @TimRoberts ~ I've almost got it. Maybe 10 more minutes and I'll change my answer to the solution. – OneMadGypsy Jun 27 '21 at 04:00
  • Thank you so much! It seems to have issues with functions with docstring, but this gave me an idea for a new approach. – balbok Jun 27 '21 at 04:21
  • 1
    @balbok I rewrote it so you could get the info while thee code is running, and decorators are not an issue. Docstrings shouldn't be an issue either. – OneMadGypsy Jun 27 '21 at 04:32