On this page, there is a log_step function that records what each step in a pandas pipeline is doing. The exact function is:

from functools import wraps
import datetime as dt

def log_step(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        tic = dt.datetime.now()
        result = func(*args, **kwargs)
        time_taken = str(dt.datetime.now() - tic)
        print(f"just ran step {func.__name__} shape={result.shape} took {time_taken}s")
        return result
    return wrapper

and it is used in the following fashion:

import pandas as pd

df = pd.read_csv('https://calmcode.io/datasets/bigmac.csv')

@log_step
def start_pipeline(dataf):
    return dataf.copy()

@log_step
def set_dtypes(dataf):
    return (dataf
            .assign(date=lambda d: pd.to_datetime(d['date']))
            .sort_values(['currency_code', 'date']))

My question is: how do I keep @log_step on my functions, so that I can switch the logging on whenever I want, while making it silent by default so that nothing is printed when I run my Jupyter notebook? I suspect the answer comes down to something more general about decorators, but I don't really know what to look for. Thanks!

1 Answer

You can indeed remove the print statement or, if you do not want to alter the decorator itself, you can instead redirect sys.stdout so that the printed output is not shown, as explained here.
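Both options can be sketched roughly as follows. This is only an illustration, not the original log_step: the LOG_STEPS flag and the FakeFrame class are hypothetical names introduced here (FakeFrame stands in for a DataFrame so the snippet runs without pandas), and the "remove the print" option is softened into a switch that is off by default. The redirect uses contextlib.redirect_stdout from the standard library, which temporarily swaps sys.stdout.

```python
import contextlib
import datetime as dt
import io
from functools import wraps

# Hypothetical module-level switch (not in the original): logging is off by default.
LOG_STEPS = False


def log_step(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        tic = dt.datetime.now()
        result = func(*args, **kwargs)
        # The flag is read at call time, so changing it later also
        # affects functions that were decorated earlier.
        if LOG_STEPS:
            time_taken = dt.datetime.now() - tic
            print(f"just ran step {func.__name__} shape={result.shape} took {time_taken}")
        return result
    return wrapper


class FakeFrame:
    """Hypothetical stand-in for a DataFrame: only offers .shape and .copy()."""

    def __init__(self, rows):
        self.rows = [tuple(row) for row in rows]

    @property
    def shape(self):
        n_cols = len(self.rows[0]) if self.rows else 0
        return (len(self.rows), n_cols)

    def copy(self):
        return FakeFrame(self.rows)


@log_step
def start_pipeline(dataf):
    return dataf.copy()


df = FakeFrame([(1,), (2,), (3,)])

# Option 1: with the switch off, the decorated step prints nothing.
quiet_buffer = io.StringIO()
with contextlib.redirect_stdout(quiet_buffer):
    quiet = start_pipeline(df)
quiet_text = quiet_buffer.getvalue()

# Option 2: leave logging on, but redirect stdout so the message
# lands in a buffer instead of the notebook output.
LOG_STEPS = True
captured_buffer = io.StringIO()
with contextlib.redirect_stdout(captured_buffer):
    captured = start_pipeline(df)
captured_text = captured_buffer.getvalue()
```

In a notebook, setting LOG_STEPS = True in a cell would turn the messages back on for all decorated steps without touching the functions themselves. The redirect approach leaves the decorator untouched, but it hides everything printed inside the with block, not just the log lines.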

Limtorak