I admit this is a niggle, but I want to find or implement a pipe operator in Python.

I work in data science, and a lot of the codebase looks like this:

# data_raw is a pandas DataFrame
data_clean = (data_raw
                 .pipe(fill_nulls, **kwargs1)
                 .pipe(add_feature_1, **kwargs2)
                 .pipe(add_feature_2, **kwargs3)
                 .pipe(add_feature_3, **kwargs4)
                 .pipe(normalize_data, **kwargs5)
)

However, several tools don't implement a .pipe method in their objects. That means some parts of my code look as follows:


very_clean_data = func_4(
                    func_3(
                      func_2(
                        func_1(data_clean, **kwargs1), 
                        **kwargs2), 
                      **kwargs3), 
                    **kwargs4)
                    

Or they use the "hacky" functools.reduce approach:

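import functools

# Start from data_clean and apply each partially-applied function in order
very_clean_data = functools.reduce(
    lambda x, f: f(x),
    [
        functools.partial(func_1, **kwargs1),
        functools.partial(func_2, **kwargs2),
        functools.partial(func_3, **kwargs3),
        functools.partial(func_4, **kwargs4),
    ],
    data_clean,  # initial value fed to func_1
)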
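
For illustration, the reduce pattern can be wrapped once in a small helper so that call sites stay flat. This is only a sketch under assumptions: pipe_through, the (function, kwargs) tuple convention, and the toy add/scale functions are hypothetical names made up here, not part of functools, pandas, or any library.

```python
import functools


def pipe_through(value, *steps):
    """Apply each step to the running value, left to right.

    A step is either a one-argument callable or a
    (callable, kwargs_dict) tuple. Hypothetical helper, not a library API.
    """
    def apply(acc, step):
        if callable(step):
            return step(acc)
        func, kwargs = step
        return func(acc, **kwargs)

    return functools.reduce(apply, steps, value)


# Toy stand-ins for func_1 and func_2 (hypothetical)
def add(x, amount=0):
    return x + amount


def scale(x, factor=1):
    return x * factor


result = pipe_through(1, (add, {"amount": 2}), (scale, {"factor": 10}), str)
```

The steps read top to bottom like a chain of .pipe calls, at the cost of passing keyword arguments as dictionaries.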

I'd love it if Python had an operator, e.g. |> or :>, that worked like a .pipe method. I think it would let me write cleaner code.

Something like:

very_clean_data = (data_clean
                      |> (func_1, **kwargs1)
                      |> (func_2, **kwargs2)
                      |> (func_3, **kwargs3)
                      |> (func_4, **kwargs4)
                  )

I saw the pipes library (see this Medium Article), but I don't think it solves my problem. I also worry because | is already a heavily used operator in my codebase (for dictionary unions and logical ORs).
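
To make the desired shape concrete with existing syntax, here is a rough sketch that overloads >> on a wrapper object, leaving | untouched. Piped and the toy add/scale functions are hypothetical names invented for this example; this is not how the pipes library works, just one possible approach.

```python
import functools


class Piped:
    """Wrap a value so that `>>` feeds it into the next callable.

    Hypothetical wrapper for illustration only.
    """

    def __init__(self, value):
        self.value = value

    def __rshift__(self, step):
        # `step` is any one-argument callable,
        # e.g. functools.partial(func, **kwargs)
        return Piped(step(self.value))


# Toy stand-ins for func_1 and func_2 (hypothetical)
def add(x, amount=0):
    return x + amount


def scale(x, factor=1):
    return x * factor


result = (Piped(1)
          >> functools.partial(add, amount=2)
          >> functools.partial(scale, factor=10)
          ).value
```

Because the wrapper sits on the left of every >>, the wrapped object's own operators are never invoked; the trade-off is the explicit Piped(...) at the start and .value at the end.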

MYK