
Let's say you have a particular method (function) from a particular module (and, optionally, a particular class). Is it possible, via introspection of the library source code, to print all the lines where that method is called (used)? It can be called internally (with self.method_name()) or externally (object1.method_name() in source file 1, object2.method_name() in source file 2, ... and objectN.method_name() in source file N).

An example could be shown on the re module and its function re.findall.

I tried to print the lines with grep, but this is a problem for methods that share a name (e.g. I tried it with a method named connect(), but 24 classes have a method named connect). I'd like to filter this for a particular class (and/or module).

xralf
  • Sounds like an XY problem. Why do you need it? Maybe there's a better way to solve the real issue. – MSeifert May 01 '17 at 22:16
  • Maybe [Better way to log method calls in Python?](http://stackoverflow.com/questions/5103735/better-way-to-log-method-calls-in-python) – wwii May 01 '17 at 22:24
  • @wwii To understand it right. It's not a process property, but a source code property. It is not needed for the library to run. I'm doing code review. – xralf May 01 '17 at 22:27
  • @MSeifert code review. – xralf May 01 '17 at 22:27
  • Code review isn't an issue in itself. What exactly is the reason that you think you'll need to find the places where a method is called? – MSeifert May 01 '17 at 22:38
  • @MSeifert I haven't said it's an issue, it's rather a hobby. – xralf May 01 '17 at 22:42
  • If you have 24 classes with method `connect`, then you have to write something more sophisticated than grep to consider `self.connect` a hit only within the class of interest. Similarly, if you import `classx.connect` with different names in different files, you will have to write something to determine its name within each file. If you seriously try writing code and it fails, then you can post and ask a specific question. Otherwise this is too broad a question. – Terry Jan Reedy May 01 '17 at 22:50
  • @MSeifert (I experiment with readability and similar things)... [Here](https://blog.suhas.org/parsing-python-abstract-syntax-trees-4ed0566dc2e2) is the file function_calls_ast.py. I'd like the output to be a little different (a list of tuples (method_call, "line number in the source file", "source file name", class and/or module the called method belongs to)) – xralf May 01 '17 at 22:52
  • @TerryJanReedy Maybe you're right. I'll go to bed and ask a more specific question tomorrow or in a few days. – xralf May 01 '17 at 22:56

4 Answers


I grep for function usage fairly regularly. Fortunately, I have never been interested in something that would be duplicated so heavily.

Here is what I might do, rather than write one-time code, if the false hits for Class.method were too common to filter manually. First grep for class Class to find the module with the class definition and note the range of lines. Then grep that module for self.method and delete or ignore hits outside that range. Then grep all modules of interest for import module and from module to find modules that might use the class and method. Then grep groups of modules depending on the specific form of import.
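The first two steps (find the class's line range, then keep only the self.method hits inside it) could be automated roughly as follows. This is only a minimal sketch: it uses ast to find the class boundaries instead of reading them off by eye, and the helper name `self_calls_in_class` and the `SAMPLE` source are made up for illustration.

```python
import ast
import re

def self_calls_in_class(source, class_name, method_name):
    """Return (line_number, text) pairs for `self.<method_name>(` hits
    that fall inside the definition of `class_name`."""
    tree = ast.parse(source)
    pattern = re.compile(r"\bself\." + re.escape(method_name) + r"\s*\(")
    lines = source.splitlines()
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            # lineno and end_lineno are 1-based and inclusive
            # (end_lineno is available from Python 3.8 on).
            for number in range(node.lineno, node.end_lineno + 1):
                if pattern.search(lines[number - 1]):
                    hits.append((number, lines[number - 1].strip()))
    return hits

# Hypothetical sample with two classes that both define connect().
SAMPLE = '''\
class Client:
    def connect(self):
        return "client"

    def start(self):
        return self.connect()

class Server:
    def connect(self):
        return "server"

    def start(self):
        return self.connect()
'''

print(self_calls_in_class(SAMPLE, "Client", "connect"))
# prints [(6, 'return self.connect()')]
```

The later steps (following imports across modules) are not covered here; they depend on how each module imports the class.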

As others have pointed out, even this will miss calls that use aliases for the method name. But only you can know if this is an issue for your scenario. It is not, that I know of, for what I have done.

An entirely different approach, not depending on names, is to instrument the function with code that logs its calls, after using dynamic introspection to determine the caller. (I believe there are SO Q&As about this.)
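A minimal sketch of that instrumentation idea, assuming you can decorate the method of interest. The decorator name `log_callers` and the `Database` class are hypothetical; the caller is read from the frame one level above the wrapper.

```python
import functools
import inspect

def log_callers(func):
    """Wrap func so that each call records where it was called from."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # currentframe() is the wrapper's own frame; f_back is whoever called it.
        caller = inspect.currentframe().f_back
        wrapper.calls.append(
            (caller.f_code.co_filename, caller.f_lineno, caller.f_code.co_name)
        )
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

class Database:  # hypothetical class whose connect() we want to watch
    @log_callers
    def connect(self):
        return "connected"

def start_app():
    return Database().connect()

start_app()
for filename, lineno, function in Database.connect.calls:
    print(f"{filename}:{lineno} in {function}()")
```

This only reports calls that actually execute, so how much it finds depends on how much of the code you exercise.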

Terry Jan Reedy

I am adding this as another answer because the code is too big to cram into the first one.

This is a very simplistic example of finding out which function called which using an abstract syntax tree.

To apply this to objects, you have to push each object onto a stack when you enter it, jump to its class, and, upon encountering a call to a function, record that it was called from that particular object.

You can see how complicated this becomes when modules get involved. Each module and its submodules must be entered and all their functions mapped so that you can track calls to them, and so on.



import ast

def walk(node):
    """ast.walk() does not preserve source order, so tracing is not possible with it."""
    end = [node]
    for n in ast.iter_child_nodes(node):
        # Consider a call a leaf: calls nested in its arguments are not visited.
        if isinstance(n, ast.Call):
            end.append(n)
            continue
        end += walk(n)
    return end

def calls(tree):
    """Print where the calls are and which functions are called."""
    nodes = walk(tree)  # Arrange the tree into a flat, ordered list
    # First get all functions defined in our code.
    # (Lambdas are left out: they have no name to report.)
    functions = {}
    for node in nodes:
        if isinstance(node, ast.FunctionDef):
            functions[node.name] = node
    # Then find where the functions are called:
    stack = []
    for node in nodes:
        if isinstance(node, ast.FunctionDef):
            # Entering a function
            stack.append(node)
        elif stack and hasattr(node, "col_offset"):
            if node.col_offset <= stack[-1].col_offset:
                # Exiting the function
                stack.pop()
        if isinstance(node, ast.Call):
            # ast.unparse() (Python 3.9+) also copes with chained names like a.b.c
            name = ast.unparse(node.func)
            try:
                ln = "at line %i" % functions[name].lineno
            except KeyError:
                ln = ""
            print("Line", node.lineno, "--> Call to", name + "()", ln)
            if stack:
                print("from within", stack[-1].name + "()", "that starts on line", stack[-1].lineno)
            else:
                print("directly from root")

# The sample avoids print(f1()): walk() treats a call as a leaf,
# so a call nested in another call's arguments would be missed.
code = """
import os

def f1 ():
    # I am function 1
    return "This is for function 2"

def f2 ():
    result = f1()
    def f3 ():
        return "I am a function inside a function!"
    f3()
f2()
pid = os.getpid()
"""

tree = ast.parse(code)

calls(tree)

The output is:

Line 9 --> Call to f1() at line 4
from within f2() that starts on line 8
Line 12 --> Call to f3() at line 10
from within f2() that starts on line 8
Line 13 --> Call to f2() at line 8
directly from root
Line 14 --> Call to os.getpid()
directly from root
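The same bookkeeping can also be done with the standard library's ast.NodeVisitor, which keeps track of the traversal for you and descends into nested calls. What follows is just a sketch under a few assumptions: the class name `CallCollector` and the sample source are invented here, and ast.unparse() needs Python 3.9 or later. It collects tuples of (called name, line number, enclosing function) rather than printing as it goes.

```python
import ast

class CallCollector(ast.NodeVisitor):
    """Collect (call_text, line_number, enclosing_function) for every call."""

    def __init__(self):
        self.scope = []   # names of the functions we are currently inside
        self.found = []

    def visit_FunctionDef(self, node):
        self.scope.append(node.name)
        self.generic_visit(node)  # visit the body while the name is on the stack
        self.scope.pop()

    def visit_Call(self, node):
        where = self.scope[-1] if self.scope else "<module>"
        self.found.append((ast.unparse(node.func), node.lineno, where))
        self.generic_visit(node)  # also descend into nested calls, e.g. print(f1())

# Hypothetical sample source to analyse.
source = """\
import os

def f1():
    return "text"

def f2():
    print(f1())

f2()
print(os.getpid())
"""

collector = CallCollector()
collector.visit(ast.parse(source))
for record in collector.found:
    print(record)
```

Like the code above, this only knows the name that appears in the source, not which class the called object really belongs to.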

Dalen
  • I think you should add that you can use a Visitor to pull the location and method calls inheriting from ast.NodeVisitor: ```python class FuncVisitor(ast.NodeVisitor): def visit_Call(self, node): # pylint: disable=C0103 print(f"Line {node.func.lineno} --> Call to {node.func.value.id}.{node.func.attr}()") ``` – thoroc Jan 27 '22 at 14:53

You probably know, but I just can't risk that you don't know: Python is not a statically typed language.

As such, something like objectn.connect() doesn't care what objectn is (it could be a module, a class, a function that acquired an attribute, ...). It also doesn't care if connect is a method or if it's a class that happens to be callable or a factory for functions. It would happily accept any objectn that somehow gives back a callable when you try to get the attribute connect.

Not only that, there are lots of ways to call methods, just assume something like:

class Fun(object):
    def connect(self):
        return 100

objectn = Fun()

(lambda x: x())(getattr(objectn, '{0}t'.format('co' + {0:'nnec'}[0])))

There's no way you could reliably search for objectn.connect() and also get a match for (lambda x: x())(getattr(objectn, '{0}t'.format('co' + {0:'nnec'}[0]))), but both do call the method connect of objectn.

So I'm very sorry to say that even with abstract syntax trees, (optional) annotations and static code analysis it will be (nearly?) impossible to find all places where a specific method of a specific class is called.
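To make the point concrete, here is a small sketch of a static search. The helper name `attribute_calls` is hypothetical; it reports every call written literally as something.connect(...). It finds the plain call but has no way to see the getattr-based one.

```python
import ast

def attribute_calls(source, method_name):
    """Return line numbers of calls written literally as <something>.method_name(...)."""
    return [
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == method_name
    ]

source = """\
class Fun(object):
    def connect(self):
        return 100

objectn = Fun()
objectn.connect()
(lambda x: x())(getattr(objectn, '{0}t'.format('co' + {0: 'nnec'}[0])))
"""

print(attribute_calls(source, "connect"))
# prints [6]: only the literal call on line 6 is found, line 7 is missed
```

Note that the hit on line 6 is matched by name only; nothing in the tree says that objectn is a Fun.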

MSeifert
  • OK. And what about doing it only for a subset, not all? The places where we can't determine it could have a record (function_name, line_number, source_file, "undetermined") – xralf May 01 '17 at 23:22
  • A subset of what? As I said it's not statically typed so each place is "undetermined" by design. – MSeifert May 01 '17 at 23:27
  • You are right. I hadn't realized this at the beginning. So it seems I can only do it manually (maybe with some machine learning, or via some combination of code coverage that will call everything in the source code and the tools @Dalen suggested?). But maybe this is not worth the time (it's attractive, but I have to do something easier first). – xralf May 02 '17 at 13:21
  • @xralf If you're going to call everything in the source code, then just decorate the corresponding `connect` method and log the calls. :) That would be much easier and you don't have to go through the hassle of doing it with static analysis. – MSeifert May 02 '17 at 13:24
  • But if [ctags](http://stackoverflow.com/questions/1054701/get-ctags-in-vim-to-go-to-definition-not-declaration) knows WHERE a called method is defined, it knows the source file, module name and class name. So, if it could read the whole source code and test for every function call, all the calls that lead to the same line number with function definition, are the lines I'm looking for. – xralf May 02 '17 at 14:45
  • Oh, I'm sorry, it's a manual process, as I see from reading the [link](http://stackoverflow.com/questions/1054701/get-ctags-in-vim-to-go-to-definition-not-declaration) I pasted. – xralf May 02 '17 at 14:47
  • @xralf Not sure I understand that. Does it work for python? – MSeifert May 02 '17 at 17:07
  • It works in a way, you have to choose from a list to which method in which file and line you want to jump to (if its definition is in more files and/or classes.) – xralf May 02 '17 at 20:50

You can use the ast module (or, on Python 2, the compiler module, which was removed in Python 3) to dig through the parsed code and find the places where functions are explicitly called.

You can also just compile the code with compile(), passing the ast.PyCF_ONLY_AST flag, to get it parsed as an abstract syntax tree. Then you can see what is called where within it.
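A short sketch of the compile() route, using a made-up two-line source that calls re.findall. With the ast.PyCF_ONLY_AST flag, compile() hands back the tree instead of a code object, which is the same result ast.parse() gives.

```python
import ast

# Hypothetical source to inspect.
source = "import re\nmatches = re.findall('a', 'banana')\n"

# With PyCF_ONLY_AST, compile() returns the syntax tree instead of a code object.
tree = compile(source, "<example>", "exec", ast.PyCF_ONLY_AST)

for node in ast.walk(tree):
    if isinstance(node, ast.Call):
        # Show the line of the call and the expression being called.
        print(node.lineno, ast.dump(node.func))
```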

But you can also track down everything that happens during the code's execution using some tricks from the sys, inspect and traceback modules.

For example, you can set your own trace function that will snatch each interpreter frame before letting it be executed:

import dis
import sys

def tracefunc(frame, evt, arg):
    print(frame.f_code.co_filename, frame.f_lineno, evt)
    print(frame.f_code.co_name, frame.f_code.co_firstlineno)
    # dis.dis(frame.f_code)
    return tracefunc  # keep tracing line by line inside this frame

sys.settrace(tracefunc)

With this trace function installed, every step is printed along with the file that contains the code, the line of the step, and the line where the code object begins. If you uncomment the dis.dis() call, it also disassembles the code object so you can see everything that is being done, or will be done, in the background.

If you want to match executed bytecode with Python code, you can use the tokenize module. You make a cache of tokenized files as they appear in the trace and snatch the Python code out of the corresponding lines whenever needed.

Using all the mentioned stuff you can do wonders, including writing a bytecode decompiler, jumping all over your code like with goto in C, forcefully interrupting threads (not recommended if you don't know exactly what you are up to), tracking which function called your function (nice for streaming servers to recognize clients catching up on their parts of a stream), and all sorts of crazy stuff.

Advanced crazy stuff, I have to say. DON'T GO MESSING with code flow in SUCH A MANNER UNLESS IT IS ABSOLUTELY NECESSARY and you know EXACTLY what you are doing.

I'll get down voted just because I mentioned such things are even possible.

An example of dynamically detecting which instance of Client tries to get the content:

from threading import get_ident
import sys

class Distributer:
    def read(self):
        # Who called me:
        cf = sys._current_frames()
        tid = get_ident()  # Make it thread safe
        frame = cf[tid]
        # Now, I was called in one frame back so
        # go back and find the 'self' variable of a method that called me
        # and self, of course, contains the instance from which I was called
        client = frame.f_back.f_locals["self"]
        print("I was called by", client)

class Client:
    def __init__(self, name):
        self.name = name

    def snatch(self):
        # Now client gets his content:
        content.read()

    def __str__(self):
        return self.name

content = Distributer()
clients = [Client("First"), Client("Second"), Client("Third"), Client("Fourth"), Client("Etc...")]
for client in clients:
    client.snatch()

Now you write this inside the tracing function instead of a fixed method, but cleverly, relying not on variable names but on addresses and the like, and you can track what happens when and where. A big job, but possible.
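A rough sketch of that combination, assuming hypothetical `Database` and `Cache` classes that both define connect(). The trace function compares code objects by identity rather than by name, so only Database.connect is reported, and it walks one frame back to find the caller.

```python
import sys

class Database:  # hypothetical class we are interested in
    def connect(self):
        return "connected"

class Cache:  # hypothetical class with a method of the same name
    def connect(self):
        return "cache ready"

def start_app():
    Database().connect()
    Cache().connect()

callers = []

def tracer(frame, event, arg):
    # A "call" event fires whenever a new Python-level frame is entered.
    if event == "call" and frame.f_code is Database.connect.__code__:
        caller = frame.f_back
        callers.append((caller.f_code.co_name, caller.f_lineno))
    return None  # no line-by-line tracing needed inside the frame

sys.settrace(tracer)
try:
    start_app()
finally:
    sys.settrace(None)  # always switch tracing off again

print(callers)
```

Each recorded entry holds the calling function's name and the line it was executing when it called Database.connect; the call to Cache.connect is ignored.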

MSeifert
Dalen
  • But this doesn't answer the question of **How** you can find the places where a specific method of a specific object is called. That's just evaluating the source code. It can probably help finding the places where a specific `xxx.connect()` is happening but you could achieve the same with `grep` for `".connect()"`, right? And that was already stated as "not sufficient" in the original post. – MSeifert May 02 '17 at 13:28
  • If you are tracing code execution you can get the object that called whatever from wherever. The current frame has an f_back attribute, so you can go back through frames to the place where the call is made. It's just a question of the hacks you have to design to find the proper places. If you are just analyzing using ast, you are achieving much more than with grep. Because of the tree structure, you are following the code flow, so you know which object is called where. – Dalen May 02 '17 at 16:42
  • If you emulate the Python virtual machine while traversing the ast and keep separate stacks of traces for each object you enter you can do it. It is hard, perhaps you'll need to examine bytecode instructions themselves for some confirmations, but it's not impossible, just tricky and needs combining stuff to do it. – Dalen May 02 '17 at 16:52
  • Maybe you could add an example how you would do it with easy cases? Like "create object, call method in the same scope" or "create object pass it to a function where the method is called on the object"? – MSeifert May 02 '17 at 17:01
  • I did it. I Sincerely hope it helps. – Dalen May 02 '17 at 17:45
  • Now I'm confused. The question asks about static analysis and I thought you proposed to use AST and tokenize but the example just shows this as part of "real code execution". – MSeifert May 02 '17 at 17:56
  • I propose one of the two. To do the tracing within ast you emulate the PVM. Traversing the tree, find out name, code etc. i.e. construct something frame-like, so that you can go back to the caller. It's just that doing it on the fly while executing would be easier. The example shows that the called function can say who called it. So if you catch the frame when connect() is called, you go back and see from where it is called. So you find the class and instance and anything really. – Dalen May 02 '17 at 18:14
  • You just set up the tracing function and catch its frame. Real execution is easier and clearer as an example. Let's say we want only connect. Then you can wait (in the tracing function) for a frame that calls connect(), then go back and find its caller. The AST may be similarly examined or some other method of search can be used. You are correct, of course, it is nearly impossible to cover all cases. But nearly makes the day brighter. The PVM does it, of course, so a good emulation will work. – Dalen May 02 '17 at 18:28
  • @MSeifert I added simplistic example with AST as another answer. – Dalen May 02 '17 at 21:57