
Python's -O switch strips asserts from the code it compiles.

Python's -OO switch does this and also strips docstrings.

Is there any way to make Python strip docstrings but not asserts?

In particular, is this possible from the command-line or from using the built-in compile function?

xorsyst
  • Possible duplicate of [Remove doc strings but not asserts from compiled CPython](https://stackoverflow.com/questions/11496532/remove-doc-strings-but-not-asserts-from-compiled-cpython) – scharette Jul 09 '18 at 15:08
  • Out of curiosity, may I ask what your use case is for stripping docstrings while preserving asserts? – Steven Rumbalski Jul 09 '18 at 15:16
  • Well, that's 2 questions really. 1. I'd rather hit an assert than have the code merrily go on under false assumptions and fail in a cryptic way. 2. Docstrings are effectively comments and I don't want them in compiled code that I distribute to 3rd parties. – xorsyst Jul 09 '18 at 15:40

1 Answer

It's a bit hacky, but you could generate Abstract Syntax Trees (asts) for your code, remove anything that looks like a docstring, and then pass the modified asts to compile.

Given this module:

$  cat func.py 
"""
This is  module-level docstring.
"""

def f(x):
    """
    This is a doc string
    """
    # This is a comment
    return 2 * x

First, generate the ast from the module source code.

>>> import ast
>>> with open('func.py') as f:
...     src = f.read()
... 
>>> tree = ast.parse(src)

Dumping the ast shows that the docstrings are present (comments are not included in asts):

>>> ast.dump(tree)
"Module(body=[Expr(value=Str(s='\\nThis is  module-level docstring.\\n')), FunctionDef(name='f', args=arguments(args=[arg(arg='x', annotation=None)], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Expr(value=Str(s='\\n    This is a doc string\\n    ')), Return(value=BinOp(left=Num(n=2), op=Mult(), right=Name(id='x', ctx=Load())))], decorator_list=[], returns=None)])"
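As an alternative to reading the dump, the standard library's ast.get_docstring can report whether a node carries a docstring. The sketch below inlines the contents of func.py as a string (an assumption made so that it runs on its own) and looks up the module and function docstrings.

```python
import ast

# Inline copy of func.py so the example is self-contained.
src = '''"""
This is  module-level docstring.
"""

def f(x):
    """
    This is a doc string
    """
    # This is a comment
    return 2 * x
'''

tree = ast.parse(src)

# body[0] is the module docstring expression, body[1] is the function f.
module_doc = ast.get_docstring(tree)
func_doc = ast.get_docstring(tree.body[1])

print(repr(module_doc))
print(repr(func_doc))
```

By default ast.get_docstring cleans up indentation and surrounding blank lines; pass clean=False to get the raw string.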

Now the hacky part: define a visitor that will visit each node in the ast, removing docstrings. A naive implementation could just remove any expression statement that consists only of a string (that is, a string that is not part of an assignment).

>>> class Transformer(ast.NodeTransformer):
...
...     def visit_Expr(self, node):
...         if isinstance(node.value, ast.Str):
...             return None
...         return node

This might be problematic if the code contains multi-line strings (though I haven't tested this).
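One concrete way the naive visitor can go wrong is sketched below. It is an illustration rather than an exhaustive test: NaiveTransformer is a hypothetical rewrite of the visitor above using ast.Constant (which replaces ast.Str from Python 3.8 onwards), and the input is a block whose only statement is a bare string. Removing that statement leaves an empty body, which compile rejects.

```python
import ast


class NaiveTransformer(ast.NodeTransformer):
    # Same idea as the naive visitor above, written against ast.Constant
    # (assumes Python 3.8+).
    def visit_Expr(self, node):
        if isinstance(node.value, ast.Constant) and isinstance(node.value.value, str):
            return None  # drop every bare string expression
        return node


# The string here is not a docstring, just the only statement in the block.
src = 'if True:\n    "a bare string, not a docstring"\n'

tree = NaiveTransformer().visit(ast.parse(src))
ast.fix_missing_locations(tree)

try:
    compile(tree, "<string>", "exec")
    error = None
except ValueError as exc:
    # The 'if' block now has an empty body, which is not a valid ast.
    error = exc

print(error)
```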

A safer implementation might remove only the first node in the body of a module, function or class definition, and only if that node is an expression node whose value is a string node (if the string were being bound to a name, the node would be an assignment node, not an expression).

class Transformer(ast.NodeTransformer):

    def visit_Module(self, node):
        self.generic_visit(node)
        return self._visit_docstring_parent(node)

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        return self._visit_docstring_parent(node)

    def visit_ClassDef(self, node):
        self.generic_visit(node)
        return self._visit_docstring_parent(node)

    def _visit_docstring_parent(self, node):
        # Common docstring removal code.
        # Assumes docstrings will always be first node in
        # module/class/function body.
        new_body = []
        for i, child_node in enumerate(node.body):
            if i == 0 and isinstance(child_node, ast.Expr) and isinstance(child_node.value, ast.Str):
                pass
            else:
                new_body.append(child_node)
        node.body = new_body
        return node

>>> # The transformer modifies the tree in place and returns the root node.
>>> tree = Transformer().visit(tree)

Observe that the docstrings are no longer present in the new ast:

>>> ast.dump(tree)
"Module(body=[FunctionDef(name='f', args=arguments(args=[arg(arg='x', annotation=None)], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Return(value=BinOp(left=Num(n=2), op=Mult(), right=Name(id='x', ctx=Load())))], decorator_list=[], returns=None)])"
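The Transformer class above relies on ast.Str, which has been deprecated since Python 3.8, where string literals are parsed as ast.Constant. A possible variant for newer interpreters is sketched below. The name DocstringStripper is hypothetical, and two behaviours are additions that the original class does not have: it also handles async functions, and it inserts a pass statement when a docstring was the only statement in a function or class body, since an empty body does not compile.

```python
import ast


class DocstringStripper(ast.NodeTransformer):
    # Hypothetical variant of Transformer, assuming Python 3.8+.

    def _strip(self, node):
        # Visit children first so nested functions and classes are handled.
        self.generic_visit(node)
        body = node.body
        if (body
                and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            node.body = body[1:]
            # A function or class body must not be empty; a module body may be.
            if not node.body and not isinstance(node, ast.Module):
                node.body = [ast.Pass()]
        return node

    visit_Module = _strip
    visit_FunctionDef = _strip
    visit_AsyncFunctionDef = _strip
    visit_ClassDef = _strip


src = (
    'def f(x):\n'
    '    """Only a docstring."""\n'
    '\n'
    'class C:\n'
    '    """Class docstring."""\n'
    '    def m(self):\n'
    '        """Method docstring."""\n'
    '        return 1\n'
)

tree = DocstringStripper().visit(ast.parse(src))
ast.fix_missing_locations(tree)  # the inserted Pass node has no line number yet

namespace = {}
exec(compile(tree, "<string>", "exec"), namespace)
print(namespace["f"].__doc__, namespace["C"].__doc__)
```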

The new ast can be compiled to a code object and executed:

>>> tree = ast.fix_missing_locations(tree)
>>> code_obj = compile(tree, '<string>', mode='exec')

>>> exec(code_obj, globals(), locals())
>>> globals()['f']
<function f at 0x7face8bc2158>
>>> globals()['f'](5)
10
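To check that asserts survive while docstrings are dropped, the whole pipeline can be wrapped in one function. This is a minimal sketch, assuming Python 3.8+; compile_without_docstrings is a hypothetical helper name, and it walks the tree with ast.walk and ast.get_docstring instead of a NodeTransformer. Passing optimize=0 to compile asks for asserts to be kept even if the interpreter itself was started with -O.

```python
import ast

# Node types that can carry a docstring.
_DOCSTRING_OWNERS = (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def compile_without_docstrings(source, filename="<string>"):
    # Hypothetical helper: parse, drop docstrings, compile with asserts kept.
    tree = ast.parse(source, filename)
    for node in ast.walk(tree):
        if not isinstance(node, _DOCSTRING_OWNERS):
            continue
        if ast.get_docstring(node, clean=False) is not None:
            # Drop the docstring; fall back to 'pass' if nothing else is left.
            node.body = node.body[1:] or [ast.Pass()]
    ast.fix_missing_locations(tree)
    # optimize=0: no optimization, so assert statements are compiled in.
    return compile(tree, filename, "exec", optimize=0)


src = (
    '"""Module docstring."""\n'
    'def f(x):\n'
    '    """Double a positive number."""\n'
    '    assert x > 0, "x must be positive"\n'
    '    return 2 * x\n'
)

namespace = {}
exec(compile_without_docstrings(src), namespace)

try:
    namespace["f"](-1)
    assert_fired = False
except AssertionError:
    assert_fired = True

print(namespace["f"].__doc__, namespace["f"](5), assert_fired)
```

Executing into a fresh dictionary rather than globals() keeps the compiled module's names separate from the caller's.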
snakecharmerb