My goal is to fetch the list of classes defined in a given Python file.

Following this link, I have implemented the following:

File b.py:

import imp
import inspect

module = imp.load_source("__inspected__", './a.py')
class_members = inspect.getmembers(module, inspect.isclass)
for cls in class_members:
    class_name, class_obj = cls
    member = cls[1]
    print(class_name)

File a.py:

from c import CClass


class MyClass:
    name = 'Edgar'

    def foo(self, x):
        print(x)

File c.py:

c_var = 2

class CClass:
   name = 'Anna'

I have two issues with this implementation. First, as mentioned in the post, classes from imported modules are printed as well, and I can't work out how to exclude them. Second, it looks like the imp module is deprecated in favour of importlib, but the docs seem sketchy and I can't figure out how to refactor my solution. Any hints?

Edgar Navasardyan

1 Answer

To use importlib the way you're using imp, see "Python 3.4: How to import a module given the full path?". You get something like the following:

import importlib.machinery
import inspect

module = importlib.machinery.SourceFileLoader("a", './a.py').load_module()
class_members = inspect.getmembers(module, inspect.isclass)
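
Since `load_module()` is itself marked deprecated in the importlib docs, here is a minimal sketch of the same idea built on `importlib.util` instead. The helper name `load_module_from_path` and the throwaway demo file are assumptions made for illustration; in your case you would pass `'./a.py'`.

```python
import importlib.util
import inspect
import os
import sys
import tempfile


def load_module_from_path(name, path):
    """Load a module from a file path without the deprecated load_module()."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # Register the module first so it can be found by name while it executes.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Demo on a throwaway file standing in for a.py (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_a.py")
    with open(demo_path, "w") as fh:
        fh.write("class MyClass:\n    name = 'Edgar'\n")
    module = load_module_from_path("demo_a", demo_path)

class_members = inspect.getmembers(module, inspect.isclass)
print([name for name, _ in class_members])  # prints ['MyClass']
```

The rest of your loop over `class_members` stays the same; only the loading step changes.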

Solution #1: Look up class statements in the Abstract Syntax Tree (AST).

Basically you can parse the file so that you can get the class declaration statements.

import ast

def get_classes(path):
    with open(path) as fh:
        root = ast.parse(fh.read(), path)
    classes = []
    # Only top-level statements are inspected, so nested classes are not listed
    for node in ast.iter_child_nodes(root):
        if isinstance(node, ast.ClassDef):
            classes.append(node.name)
    return classes

for c in get_classes('a.py'):
    print(c)

Solution #2: Look at the imports and ignore names brought in by import from statements.

This is more in line with your current approach, but a little jankier. You can collect the names that the file brings in through import from statements (see "Python easy way to read all import statements from py module") and then make sure none of them show up in your output:

import ast
from collections import namedtuple

Import = namedtuple("Import", ["module", "name", "alias"])

def get_imports(path):
    with open(path) as fh:
        root = ast.parse(fh.read(), path)

    for node in ast.iter_child_nodes(root):
        if not isinstance(node, ast.ImportFrom):
            # Skip everything except `from ... import ...`;
            # direct imports bind modules, not classes
            continue
        # node.module is None for relative imports such as `from . import x`
        module = node.module.split('.') if node.module else []
        for n in node.names:
            yield Import(module, n.name.split('.'), n.asname)

imported = set()
for item in get_imports('a.py'):
    # The name bound in a.py is the alias, if one was given
    imported.add(item.alias or item.name[0])

Then you can just filter out the imported things you saw.

for class_name, class_obj in class_members:
    if class_name not in imported:
        print(class_name)

Note that `imported` holds every name brought in by an import from statement, whether it is a class, a function, or a variable. That is harmless here, because `class_members` only contains classes to begin with.
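
If you are loading the module anyway, another option is to skip the import bookkeeping entirely: every class records the module it was defined in as `__module__`, so you can compare that against the loaded module's `__name__`. This is only a sketch. The helper name `local_classes` and the demo files (`demo_a.py`, `demo_c.py`, mirroring your a.py and c.py) are assumptions made for illustration.

```python
import importlib.util
import inspect
import os
import sys
import tempfile


def local_classes(module):
    """Return names of classes defined in `module` itself, not imported into it."""
    return [
        name
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if obj.__module__ == module.__name__
    ]


# Demo with throwaway files mirroring a.py and c.py (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_c.py"), "w") as fh:
        fh.write("class CClass:\n    name = 'Anna'\n")
    a_path = os.path.join(tmp, "demo_a.py")
    with open(a_path, "w") as fh:
        fh.write("from demo_c import CClass\n\n\nclass MyClass:\n    name = 'Edgar'\n")

    sys.path.insert(0, tmp)  # so that `from demo_c import ...` resolves
    try:
        spec = importlib.util.spec_from_file_location("demo_a", a_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    finally:
        sys.path.remove(tmp)

print(local_classes(module))  # prints ['MyClass']
```

The comparison relies on the name you pass when loading the file, since that is what ends up in both `module.__name__` and each local class's `__module__`.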

Andrew Wei
  • Andrew, I know this goes beyond the original question, but with the first (AST) approach, how can I fetch the class itself by its name without having to load the module? – Edgar Navasardyan Oct 18 '21 at 13:32
  • Python has a module that processes Python's abstract syntax grammar (so like the structure of the file as it relates to the Python programming language). You just give it a file and it processes the file. This means that it'll read all the things like class definitions, variable instantiations, imports, and whatnot. We care about the `ClassDef`s when they show up because those are the classes that are defined in the module. You can read more about the documentation here: https://docs.python.org/3/library/ast.html – Andrew Wei Oct 18 '21 at 15:28
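
As a rough sketch of what the comment above describes: without loading the module you cannot get a usable class object, but you can get the `ClassDef` node, which describes the class (its name, its methods, its source text). The helper name `find_class` and the inline `SOURCE` string (a copy of a.py) are assumptions made for illustration, and `ast.get_source_segment` requires Python 3.8 or newer.

```python
import ast

# Stand-in for the contents of a.py; in practice, read this from the file.
SOURCE = '''\
from c import CClass


class MyClass:
    name = 'Edgar'

    def foo(self, x):
        print(x)
'''


def find_class(source, class_name):
    """Return the top-level ast.ClassDef node named `class_name`, or None."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return node
    return None


node = find_class(SOURCE, "MyClass")

# Method names come from the function definitions in the class body.
methods = [n.name for n in node.body if isinstance(n, ast.FunctionDef)]
print(methods)  # prints ['foo']

# Recover the class's source text from the node (Python 3.8+).
print(ast.get_source_segment(SOURCE, node))
```

Note that `find_class(SOURCE, "CClass")` returns `None`, because `CClass` is only imported into the file, not defined in it.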