Collect Python source code comments along the execution path

Question

Say I have the following Python function:

def func(x):
    """Function docstring."""

    result = x + 1
    if result > 0:
        # comment 2
        return result
    else:
        # comment 3
        return -1 * result

I want a function that prints every docstring and comment encountered along the execution path, e.g.

> trace(func(2))
Function docstring.
comment 2
3

What I am actually trying to achieve is to report, alongside the result, comments explaining how it was calculated.

What could I use for this? As far as I understand, the AST does not keep comments in the tree.

I agree that you can't find comments in the syntax tree. But if you're looking for triple-quoted strings on their own line, you're in luck: those aren't comments, so they should be accessible. — Kevin, Jan 18 '18 at 16:38
This sounds a bit like an [XY Problem](http://xyproblem.info). Is there a reason you can't use `print` statements, or the `logging` module, rather than comments? — match, Jan 18 '18 at 17:12
Playing around with this a little more, I've found that you can find every triple-quoted string in a function by `ast.walk`ing through its syntax tree; and you can use `sys.settrace` to observe every line that executes when you call a function (but string literal expression statements aren't observed in this way because they aren't "executed"). These seem like two halves of a potential solution, but I don't see an easy way to combine them. — Kevin, Jan 18 '18 at 19:12
@match Comments improve the readability of code more than logging statements do (from my point of view). And yes, the problem described could be solved by logging. But logging is not meant for documenting the caveats of a calculation, whereas comments are. Is logging a good way to add comments to code? — Alexei, Jan 19 '18 at 07:29
"Comments" like this do not really belong in code except when you actually want to trace what's going on — ThiefMaster, Jan 19 '18 at 08:17
Comments aren't that different from logs - the main difference being that they are never visible to the 'user' once the code is compiled/run. If you ever want to follow the program logic while it is being run, then logging is the correct tool for this. Since log lines are generally written in human-readable format, like comments, they can serve both purposes. — match, Jan 19 '18 at 08:49
@match You are almost right. I already have a custom logger that collects all the data, with explanations of the calculation steps. It also respects function calls, so in the final structure of collected data you can go a level deeper (and see what happens inside a called function). — Alexei, Jan 19 '18 at 15:19
@match But from my point of view, moving these explanations from code into comments would provide more benefits: more explanation of the math and logic for those who read the code, while the explanations are still collected during execution. Logging as a means of finding errors and maintaining a solution running in production is not required in my case. I hope it is clearer now. — Alexei, Jan 19 '18 at 15:19
Comments will not be collected during execution unless you output them in some way, in which case you are logging (albeit under a different name/codepath). If you only want to log when debugging and developing, have a look at using the logging module with different log levels, or, if you are worried about efficiency, create your own custom logging module that is a complete no-op in production. — match, Jan 19 '18 at 15:26
@match I know about logging levels, it is not a proper way out. Comments and logging are different things that serve different purposes. I want to collect comments along execution path. Moving comments into logs is only a workaround. — Alexei, Jan 19 '18 at 15:34
OK - I'll not argue over semantics. But please either fix your example to actually contain comments, rather than triple-quoted strings, or stop using the word 'comment' since it's obviously causing confusion here. — match, Jan 19 '18 at 15:38

score 5 · Answer 1 · answered Jan 23 '18 at 14:06

I thought this was an interesting challenge, so I decided to give it a try. Here is what I came up with:

import ast
import inspect
import re
import sys
import __future__

if sys.version_info >= (3,5):
    ast_Call = ast.Call
else:
    def ast_Call(func, args, keywords):
        """Compatibility wrapper for ast.Call on Python 3.4 and below.
        Used to have two additional fields (starargs, kwargs)."""
        return ast.Call(func, args, keywords, None, None)

COMMENT_RE = re.compile(r'^(\s*)#\s?(.*)$')

def convert_comment_to_print(line):
    """If `line` contains a comment, it is changed into a print
    statement, otherwise nothing happens. Only acts on full-line comments,
    not on trailing comments. Returns the (possibly modified) line."""
    match = COMMENT_RE.match(line)
    if match:
        return '{}print({!r})\n'.format(*match.groups())
    else:
        return line

def convert_docstrings_to_prints(syntax_tree):
    """Walks an AST and changes every docstring (i.e. every expression
    statement consisting only of a string) to a print statement.
    The AST is modified in-place."""
    ast_print = ast.Name('print', ast.Load())
    nodes = list(ast.walk(syntax_tree))
    for node in nodes:
        for bodylike_field in ('body', 'orelse', 'finalbody'):
            if hasattr(node, bodylike_field):
                for statement in getattr(node, bodylike_field):
                    if (isinstance(statement, ast.Expr) and
                            isinstance(statement.value, ast.Str)):
                        arg = statement.value
                        statement.value = ast_Call(ast_print, [arg], [])

def get_future_flags(module_or_func):
    """Get the compile flags corresponding to the features imported from
    __future__ by the specified module, or by the module containing the
    specified function. Returns a single integer containing the bitwise OR
    of all the flags that were found."""
    # A function object does not expose its module's names as attributes,
    # so look them up in its globals; for a module, use its namespace.
    namespace = getattr(module_or_func, '__globals__', None)
    if namespace is None:
        namespace = vars(module_or_func)
    result = 0
    for feature_name in __future__.all_feature_names:
        feature = getattr(__future__, feature_name)
        if (namespace.get(feature_name) is feature and
                hasattr(feature, 'compiler_flag')):
            result |= feature.compiler_flag
    return result

def eval_function(syntax_tree, func_globals, filename, lineno, compile_flags,
        *args, **kwargs):
    """Helper function for `trace`. Execute the function defined by
    the given syntax tree, and return its return value."""
    func = syntax_tree.body[0]
    func.decorator_list.insert(0, ast.Name('_trace_exec_decorator', ast.Load()))
    ast.increment_lineno(syntax_tree, lineno-1)
    ast.fix_missing_locations(syntax_tree)
    code = compile(syntax_tree, filename, 'exec', compile_flags, True)
    result = [None]
    def _trace_exec_decorator(compiled_func):
        result[0] = compiled_func(*args, **kwargs)
    func_locals = {'_trace_exec_decorator': _trace_exec_decorator}
    exec(code, func_globals, func_locals)
    return result[0]

def trace(func, *args, **kwargs):
    """Run the given function with the given arguments and keyword arguments,
    and whenever a docstring or (whole-line) comment is encountered,
    print it to stdout."""
    filename = inspect.getsourcefile(func)
    lines, lineno = inspect.getsourcelines(func)
    lines = map(convert_comment_to_print, lines)
    modified_source = ''.join(lines)
    compile_flags = get_future_flags(func)
    syntax_tree = compile(modified_source, filename, 'exec',
            ast.PyCF_ONLY_AST | compile_flags, True)
    convert_docstrings_to_prints(syntax_tree)
    return eval_function(syntax_tree, func.__globals__,
            filename, lineno, compile_flags, *args, **kwargs)

It is a bit long because I tried to cover the most important cases, and the code might not be the most readable, but I hope it is clear enough to follow.

How it works:

1. Read the function's source code using inspect.getsourcelines. (Warning: inspect does not work for functions that were defined interactively. If you need that, maybe you can use dill instead, see this answer.)
2. Search for lines that look like comments, and replace them with print statements. (Right now only whole-line comments are replaced, but it shouldn't be difficult to extend that to trailing comments if desired.)
3. Parse the source code into an AST.
4. Walk the AST and replace all docstrings with print statements.
5. Compile the AST.
6. Execute the compiled code. This and the previous step contain some trickery to reconstruct the context that the function was originally defined in (e.g. globals, __future__ imports, line numbers for exception tracebacks). Also, since just executing the source would only re-define the function and not call it, a simple decorator is added that calls the function and stores its return value.
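
The comment-detection step above uses a regular expression, which only sees whole-line comments and would also match a line starting with # inside a multi-line string. If trailing comments are wanted, one option is the standard tokenize module, which reports real comment tokens only. The following is a minimal sketch of that idea, not part of the code above; it assumes Python 3, and find_comments is a hypothetical helper name:

```python
import io
import tokenize

def find_comments(source):
    """Return (line_number, text, is_trailing) for every comment in source."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            # tok.line is the whole physical line; the comment is trailing
            # if anything other than whitespace precedes it on that line.
            is_trailing = tok.line[:tok.start[1]].strip() != ''
            text = tok.string.lstrip('#').strip()
            found.append((tok.start[0], text, is_trailing))
    return found

SOURCE = '''\
def func(x):
    result = x + 1  # add one
    # comment 2
    return result
'''

for lineno, text, is_trailing in find_comments(SOURCE):
    print(lineno, text, 'trailing' if is_trailing else 'whole-line')
```

A whole-line comment can be swapped for a print call in place, as the code above does. A trailing comment would need its print call inserted on a separate line before the statement, at the same indentation, which is left out of this sketch.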

It works in Python 2 and 3 (at least with the tests below, which I ran in 2.7 and 3.6).

To use it, simply do:

result = trace(func, 2)   # result = func(2)
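
Since the goal in the question is to collect the explanations rather than just display them, the printed text can be captured around the call. This is a minimal sketch assuming Python 3 (contextlib.redirect_stdout does not exist in Python 2). collect is a hypothetical helper, and func_rewritten is a hand-written stand-in for what the traced function effectively executes, so that the example runs on its own:

```python
import contextlib
import io

def func_rewritten(x):
    # Hypothetical stand-in: func after its docstring and comments
    # have been turned into print calls.
    print('Function docstring.')
    result = x + 1
    if result > 0:
        print('comment 2')
        return result
    else:
        print('comment 3')
        return -1 * result

def collect(func, *args, **kwargs):
    """Call func and return (result, list of the lines it printed)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue().splitlines()

result, notes = collect(func_rewritten, 2)
print(result)  # 3
print(notes)   # ['Function docstring.', 'comment 2']
```

With the module above, the equivalent call would be collect(trace, func, 2). Note that this also captures anything else the function prints while it runs.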

Here is a slightly more elaborate test that I used while writing the code. It assumes the code above is saved as trace_comments.py and that dateutil is installed:

#!/usr/bin/env python

from trace_comments import trace
from dateutil.easter import easter, EASTER_ORTHODOX

def func(x):
    """Function docstring."""

    result = x + 1
    if result > 0:
        # comment 2
        return result
    else:
        # comment 3
        return -1 * result

if __name__ == '__main__':
    result1 = trace(func, 2)
    print("result1 = {}".format(result1))

    result2 = trace(func, -10)
    print("result2 = {}".format(result2))

    # Test that trace() does not permanently replace the function
    result3 = func(42)
    print("result3 = {}".format(result3))

    print("-----")
    print(trace(easter, 2018))

    print("-----")
    print(trace(easter, 2018, EASTER_ORTHODOX))
