I admit this is a niggle, but I want to implement or find a pipe operator in Python.
I work in data science, and a lot of the codebase looks like this:
# data_raw is a pandas DataFrame
data_clean = (data_raw
    .pipe(fill_nulls, **kwargs1)
    .pipe(add_feature_1, **kwargs2)
    .pipe(add_feature_2, **kwargs3)
    .pipe(add_feature_3, **kwargs4)
    .pipe(normalize_data, **kwargs5)
)
However, several tools don't implement a .pipe method in their objects. That means some parts of my code look as follows:
very_clean_data = func_4(
    func_3(
        func_2(
            func_1(data_clean, **kwargs1),
            **kwargs2),
        **kwargs3),
    **kwargs4)
Or they use the "hacky" functools.reduce approach:
import functools

# The first list element seeds the accumulator; each later element is applied to it in turn.
very_clean_data = functools.reduce(lambda x, f: f(x), [
    data_clean,
    functools.partial(func_1, **kwargs1),
    functools.partial(func_2, **kwargs2),
    functools.partial(func_3, **kwargs3),
    functools.partial(func_4, **kwargs4),
])
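The reduce version can be wrapped once in a small helper so that call sites stay flat. This is only a minimal sketch: pipe_through is a hypothetical name (not part of functools or any library I know of), and passing each step as a (func, kwargs) tuple is my own assumption about a convenient calling convention. The add and scale functions are toy stand-ins for func_1 .. func_4.

```python
import functools


def pipe_through(value, *steps):
    """Apply each step to the running value, left to right.

    Each step is either a plain callable or a (callable, kwargs) tuple.
    Hypothetical helper for illustration, not a standard-library function.
    """
    def apply(acc, step):
        if callable(step):
            return step(acc)
        func, kwargs = step
        return func(acc, **kwargs)

    # value is the initial accumulator; steps are folded over it in order.
    return functools.reduce(apply, steps, value)


# Toy stand-ins for the real transformation functions.
def add(x, amount=0):
    return x + amount


def scale(x, factor=1):
    return x * factor


result = pipe_through(1, (add, {"amount": 2}), (scale, {"factor": 10}), str)
```

This reads top to bottom like .pipe does, but it is still a function call rather than an operator.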
I'd love it if Python had an operator, e.g. |> or :> or something similar, that worked like a .pipe method. I think it'd let me write cleaner code.
Something like:
very_clean_data = (data_clean
    |> (func_1, **kwargs1)
    |> (func_2, **kwargs2)
    |> (func_3, **kwargs3)
    |> (func_4, **kwargs4)
)
I saw the pipes library (see this Medium Article), but I don't think it solves my problem. I also worry because | is a heavily used operator in my codebase (for dictionary unions and logical ORs).
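For what it's worth, one way I can imagine sidestepping the clash with | is to overload a different operator on a thin wrapper object. The sketch below is an assumption on my part, not an existing library: the Pipe class, its value attribute, and the choice of >> are all hypothetical, and steps are again passed as (func, kwargs) tuples because Python cannot parse (func, **kwargs) as an expression.

```python
class Pipe:
    """Hypothetical wrapper that threads a value through functions with >>."""

    def __init__(self, value):
        self.value = value

    def __rshift__(self, step):
        # A step is either a plain callable or a (callable, kwargs) tuple.
        if callable(step):
            return Pipe(step(self.value))
        func, kwargs = step
        return Pipe(func(self.value, **kwargs))


# Toy stand-ins for the real transformation functions.
def add(x, amount=0):
    return x + amount


def scale(x, factor=1):
    return x * factor


out = (Pipe(1)
    >> (add, {"amount": 2})
    >> (scale, {"factor": 10})
).value
```

Because only the wrapper defines >>, existing uses of | on dicts and boolean masks are untouched, at the cost of wrapping the input and unwrapping with .value at the end.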