
I'd like to detect calls to print() and logging (e.g. logging.info()) that are top-level reachable, i.e. that execute on module load, and fail the build if found.

I maintain a service that other teams frequently commit to, so I want this as a lint check of sorts in CI. How can I do this?

I don't care about non-top-level calls (e.g. calls inside a function). I'd like to keep allowing other teams to make such calls if they really want to, for when they execute their own code.

I've tried several approaches without success so far, mostly dynamically importing (via import_module) all the Python files I care about and then checking for output:

# foo.py
print("hello")

# test file (separate from foo.py)
from importlib import import_module

def test_does_not_print(capfd):
    import_module('foo')
    out, err = capfd.readouterr()

    assert out == ""  # surprise: this will pass
Kache
  • `capfd` will capture the output in your example, just as `capsys` will as well, so it's not an issue with `pytest`. I rather assume that you have already imported `foo` before actually executing the test, so `import_module` will just take the cached module from `sys.modules`. If you clean up `sys.modules` before calling `import_module` (e.g. `sys.modules.pop('foo', None)`), or use `importlib.reload(foo)`, the import mechanism will be triggered anew and the output will be actually printed and captured. – hoefling Feb 06 '22 at 13:25
  • Can you elaborate? What I pasted/linked really is a complete and minimal repro, i.e. my code doesn't import `foo` elsewhere. A `sys.modules['foo']` before `import_module` even `KeyError`s. However, adding `sys.modules.pop('foo', None)` does indeed make it fail for `""`, as expected. This unexpected behavior smells of a `pytest` bug to me. – Kache Feb 07 '22 at 06:02
  • The repro works for me: https://replit.com/@hoefling/CrimsonThunderousElectricity?v=1 – hoefling Feb 07 '22 at 07:14
  • Thanks for checking. I just tried in a different project/venv I have, and I'm also getting the expected behavior there. Must be some strange interaction bug in this project, a pytest plugin or something else. Unfortunately I'm not sure I'll have the time to dive even deeper into this. – Kache Feb 07 '22 at 07:39
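
The caching behavior hoefling describes in the comments can be reproduced outside of pytest. The following is only a minimal sketch: the module name `demo_toplevel_mod` and the helper `import_and_capture` are made up for illustration, and the module is written to a temporary directory as a stand-in for foo.py.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile


def import_and_capture(module_name):
    """Import module_name and return what it wrote to stdout while loading."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        importlib.import_module(module_name)
    return buffer.getvalue()


# Hypothetical stand-in for foo.py, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_toplevel_mod.py"), "w") as f:
    f.write('print("hello")\n')
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# First import: the module body runs, so its print() is captured.
first = import_and_capture("demo_toplevel_mod")

# Second import: the module comes from sys.modules, so the body does not run.
cached = import_and_capture("demo_toplevel_mod")

# After dropping the cached entry, the body runs again on import.
sys.modules.pop("demo_toplevel_mod", None)
fresh = import_and_capture("demo_toplevel_mod")
```

If an import-based check unexpectedly sees no output, it is worth checking whether something (a conftest, a plugin, another test) imported the module earlier in the same process.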

1 Answer


NOTE: The below is a workaround. capsys/capfd should be able to solve this problem, but they don't work in my particular project for an unknown reason.

I've been able to accomplish this by monkeypatching print and the module-level logging functions (logging.info etc.) at runtime, in a standalone script that I can run during CI, e.g.:

import builtins
from contextlib import contextmanager
import functools as ft
from importlib import import_module
import logging
import os
import sys

orig_print = builtins.print
orig_info, orig_warning, orig_error, orig_critical = logging.info, logging.warning, logging.error, logging.critical
NO_ARG = object()
sys.path.insert(0, 'src')


def main():
    orig_print("Checking files for print() & logging on import...")
    for path in files_under_watch():
        orig_print("  " + path)
        output = detect_toplevel_output(path)
        if output:
            raise SyntaxWarning(f"Top-level output (print & logging) detected in {path}: {output}")


def files_under_watch():
    for root, _, files in os.walk('src'):
        for file in files:
            if should_watch_file(file):  # your impl here
                yield os.path.join(root, file)


def detect_toplevel_output(python_file_path):
    with capture_print() as printed, capture_logging() as logged:
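        # Strip the ".py" suffix and use the platform's path separator, not a hard-coded '/'
        module_name = os.path.splitext(python_file_path)[0].replace(os.sep, '.')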
        import_module(module_name)

    output = {'print': printed, 'logging': logged}
    return {k: v for k, v in output.items() if v}


@contextmanager
def capture_print():
    calls = []

    @ft.wraps(orig_print)
    def captured_print(*args, **kwargs):
        calls.append((args, kwargs))
        return orig_print(*args, **kwargs)

    builtins.print = captured_print
    try:
        yield calls
    finally:
        # Restore the original even if the import raises
        builtins.print = orig_print


@contextmanager
def capture_logging():
    calls = []

    @ft.wraps(orig_info)
    def captured_info(*args, **kwargs):
        calls.append(('info', args, kwargs))
        return orig_info(*args, **kwargs)

    @ft.wraps(orig_warning)
    def captured_warning(*args, **kwargs):
        calls.append(('warning', args, kwargs))
        return orig_warning(*args, **kwargs)

    @ft.wraps(orig_error)
    def captured_error(*args, **kwargs):
        calls.append(('error', args, kwargs))
        return orig_error(*args, **kwargs)

    @ft.wraps(orig_critical)
    def captured_critical(*args, **kwargs):
        calls.append(('critical', args, kwargs))
        return orig_critical(*args, **kwargs)

    logging.info, logging.warning = captured_info, captured_warning
    logging.error, logging.critical = captured_error, captured_critical
    try:
        yield calls
    finally:
        # Restore the originals even if the import raises
        logging.info, logging.warning = orig_info, orig_warning
        logging.error, logging.critical = orig_error, orig_critical


if __name__ == '__main__':
    main()
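
One limitation of the patching above: it only sees calls made through the module-level functions (logging.info(...) etc.). Calls on logger objects, such as logging.getLogger(__name__).info(...), go through Logger methods and bypass the patched attributes. A possible alternative, sketched here under assumptions and not part of the script above, is to temporarily attach a handler to the root logger. `ListHandler` and `capture_log_records` are hypothetical names for illustration.

```python
import logging
from contextlib import contextmanager


class ListHandler(logging.Handler):
    """Hypothetical helper: stores every record it receives in a list."""

    def __init__(self):
        super().__init__(level=logging.NOTSET)
        self.records = []

    def emit(self, record):
        self.records.append(record)


@contextmanager
def capture_log_records():
    """Collect records that reach the root logger while the block runs."""
    root = logging.getLogger()
    handler = ListHandler()
    old_level = root.level
    root.addHandler(handler)
    # The root logger defaults to WARNING; lower it so info() calls are seen too.
    root.setLevel(logging.DEBUG)
    try:
        yield handler.records
    finally:
        root.setLevel(old_level)
        root.removeHandler(handler)


with capture_log_records() as records:
    logging.getLogger("some.module").info("via a logger object")
    logging.warning("via the module-level function")

messages = [record.getMessage() for record in records]
```

In the script above, the import_module call would go inside the with block instead of the two logging calls. This sketch would still miss loggers that set propagate = False or raise their own level, so it is a complement to the monkeypatching, not a guaranteed replacement.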
Kache