I'm cleaning some smeared data and want to automate the process a bit. That is, I want a script that holds some predefined cleaning functions, defined in the order in which the data should be cleaned, and I designed a decorator that retrieves these functions from the script using this solution:

from inspect import getmembers, isfunction
import cd # cleaning module
functions_list = [o[0] for o in getmembers(cd) if isfunction(o[1])]

This works extremely well. However, it retrieves the functions sorted by name rather than in the order in which they are defined.

For reproducibility purposes, consider the following cleaning module as cd:

def clean_1():
    pass


def clean_2():
    pass


def clean_4():
    pass


def clean_3():
    pass

The solution outputs:

['clean_1', 'clean_2', 'clean_3', 'clean_4']

whereas it needs to be:

['clean_1', 'clean_2', 'clean_4', 'clean_3']

Other solutions to the main problem are acceptable (performance is considered though).

Chris Larson
ndrwnaguib

2 Answers

You're halfway there. You only need to sort the list by the first line number of each function's code object, __code__.co_firstlineno ([Python 3]: inspect - Inspect live objects).

Note that I've only tried this on the (trivial) example from the question, and I didn't do any performance tests.

code.py:

#!/usr/bin/env python3

import sys
from inspect import getmembers, isfunction
import cd  # The module from the question that contains the 4 clean_* functions


def main():
    member_functions = (item for item in getmembers(cd) if isfunction(item[1]))
    function_names = (item[0] for item in sorted(member_functions, key=lambda x: x[1].__code__.co_firstlineno))
    print(list(function_names))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()

Output:

e:\Work\Dev\StackOverflow\q054521087>"e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32

['clean_1', 'clean_2', 'clean_4', 'clean_3']
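
One caveat with this approach: getmembers also returns functions that the module merely imported, and their line numbers refer to a different file. A minimal sketch of a variant that keeps only the functions defined in the module itself is shown below. The helper names build_module and functions_in_definition_order are made up for illustration, and the cd module is rebuilt from a string only so the snippet runs on its own; with a real cd.py you would just import it.

```python
import types
from inspect import getmembers, isfunction

# Stand-in for cd.py from the question, plus one imported function
# that should NOT be picked up as a cleaner.
CD_SOURCE = '''
from os.path import join

def clean_1():
    pass

def clean_2():
    pass

def clean_4():
    pass

def clean_3():
    pass
'''


def build_module(name, source):
    """Create an in-memory module from source text (illustration only)."""
    module = types.ModuleType(name)
    exec(compile(source, name + ".py", "exec"), module.__dict__)
    return module


def functions_in_definition_order(module):
    """Return names of functions defined in `module`, ordered by source line."""
    own = [
        (name, func)
        for name, func in getmembers(module, isfunction)
        # Skip functions imported from elsewhere: their co_firstlineno
        # belongs to another file and is meaningless here.
        if func.__module__ == module.__name__
    ]
    own.sort(key=lambda pair: pair[1].__code__.co_firstlineno)
    return [name for name, _ in own]


cd = build_module("cd", CD_SOURCE)
ordered_names = functions_in_definition_order(cd)
print(ordered_names)  # ['clean_1', 'clean_2', 'clean_4', 'clean_3']
```

The __module__ check is an assumption about what should count as a cleaner; drop it if re-exported functions are meant to be included.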
CristiFati

From the question: "Other solutions to the main problem are acceptable (performance is considered though)."

In the interest of being able to define and import helper functions without having them included automatically, how about an explicit list:

def clean_1():
    pass


def clean_2():
    pass


def clean_4():
    pass


def clean_3():
    pass


cleaners = [
    clean_1,
    clean_2,
    clean_4,
    clean_3,
]

or an explicit decorator:

cleaners = []
cleaner = cleaners.append


@cleaner
def clean_1():
    pass


@cleaner
def clean_2():
    pass


@cleaner
def clean_4():
    pass


@cleaner
def clean_3():
    pass
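
One thing to be aware of with the cleaners.append shortcut: list.append returns None, so each decorated name (clean_1 and so on) ends up bound to None in the module, even though the function itself is stored in the list. If the functions should also stay callable by name, a decorator that returns the function is a small change. The sketch below assumes that, and uses made-up cleaning steps and a hypothetical run_cleaners helper to show the registry being applied in definition order.

```python
cleaners = []


def cleaner(func):
    """Register func as a cleaning step and return it unchanged."""
    cleaners.append(func)
    return func


# The three steps below are hypothetical examples, not from the question.
@cleaner
def strip_whitespace(rows):
    return [row.strip() for row in rows]


@cleaner
def drop_empty(rows):
    return [row for row in rows if row]


@cleaner
def lowercase(rows):
    return [row.lower() for row in rows]


def run_cleaners(rows):
    """Apply every registered step in registration (definition) order."""
    for step in cleaners:
        rows = step(rows)
    return rows


step_names = [func.__name__ for func in cleaners]
cleaned = run_cleaners(["  Foo ", "", " BAR"])
print(step_names)  # ['strip_whitespace', 'drop_empty', 'lowercase']
print(cleaned)     # ['foo', 'bar']
```

Because registration happens when each def statement executes, the list order follows the source order without any introspection.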

As for getting the attributes of a regular module in definition order, though, you should be able to use __dict__ in Python 3.7+, where dicts are guaranteed to preserve insertion order:

functions_list = [k for k, v in cd.__dict__.items() if isfunction(v)]
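
To make that runnable on its own, here is a rough sketch with the imports it needs. The helper name functions_by_insertion_order is made up, the __module__ filter is an optional assumption for skipping imported functions, and the in-memory cd module stands in for the real cd.py.

```python
import types
from inspect import isfunction


def functions_by_insertion_order(module):
    """Return names of functions defined in `module`, in namespace order.

    Relies on dict insertion order (guaranteed from Python 3.7).
    """
    return [
        name
        for name, value in vars(module).items()  # vars(m) is m.__dict__
        if isfunction(value) and value.__module__ == module.__name__
    ]


# In-memory stand-in for cd.py, only so the snippet is self-contained.
cd = types.ModuleType("cd")
exec(
    "def clean_1(): pass\n"
    "def clean_2(): pass\n"
    "def clean_4(): pass\n"
    "def clean_3(): pass\n",
    cd.__dict__,
)

functions_list = functions_by_insertion_order(cd)
print(functions_list)  # ['clean_1', 'clean_2', 'clean_4', 'clean_3']
```

Note that the order reflects when each name was first bound in the module: a name that is assigned early and redefined later keeps its original position.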
Ry-