Following on from an earlier question of mine, my goal now is to parse a Python file and:

  1. Extract all classes
  2. For each class, extract its list of attributes and its list of base classes

All of this without loading (i.e. running) the file.

Currently, I have this working code:

parser.py

import ast

def get_classes(path):
    with open(path) as fh:
        root = ast.parse(fh.read(), path)
    classes = []
    # Only top-level statements are inspected; nested classes are not reported
    for node in ast.iter_child_nodes(root):
        if isinstance(node, ast.ClassDef):
            classes.append(node.name)
    return classes

for c in get_classes('a.py'):
    print(c)

File to be parsed:

from c import CClass
    
class MyClass(UndefinedClass):
    name = 'Edgar'

    def foo(self, x):
        print(x)


def func():
    print('Hello')

The good part of this solution is that I get the list of class names even though a.py refers to names that are never defined, such as UndefinedClass (ast.parse only needs the syntax to be valid). It looks like I have to dig deeper into the ast module. Is there any way I can extract each class's attributes and its base classes?

martineau
Edgar Navasardyan

2 Answers

You can use recursion to traverse the AST produced by ast.parse. The solution below performs this search not only in the main input file but also in any modules it imports:

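import ast
import importlib.machinery

class Parse:
    def __init__(self):
        self.c = {}        # class name -> {'bases': [...], 'attrs': [...]}
        self.seen = set()  # module names already visited (guards against import cycles)

    def walk(self, tree, cls=None):
        if isinstance(tree, ast.ClassDef):
            # ast.unparse (Python 3.9+) also copes with dotted bases such as pkg.Base
            self.c[tree.name] = {'bases': [ast.unparse(b) for b in tree.bases], 'attrs': []}
            for node in tree.body:
                self.walk(node, tree.name)
        elif isinstance(tree, (ast.Import, ast.ImportFrom)):
            # Import carries a list of aliases; ImportFrom a single module name
            # (None for relative imports like "from . import x", which are skipped)
            names = [tree.module] if isinstance(tree, ast.ImportFrom) else [a.name for a in tree.names]
            for name in names:
                if name and name not in self.seen:
                    self.seen.add(name)
                    self.walk_module(name)
        elif isinstance(tree, ast.Assign) and cls is not None:
            # Keep simple names only; targets such as a.b = ... or x, y = ... are skipped
            self.c[cls]['attrs'] += [t.id for t in tree.targets if isinstance(t, ast.Name)]
        else:
            for child in ast.iter_child_nodes(tree):
                self.walk(child)

    def walk_module(self, name):
        # Locate the source file on sys.path without importing it (top-level modules only).
        # PathFinder.find_module was removed in Python 3.12; find_spec replaces it.
        spec = importlib.machinery.PathFinder.find_spec(name)
        if spec and spec.origin and spec.origin.endswith('.py'):
            with open(spec.origin) as fh:
                self.walk(ast.parse(fh.read(), spec.origin))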

Putting it all together with your two original files:

File c.py:

c_var = 2

class CClass:
   name = 'Anna'

File a.py:

from c import CClass
    
class MyClass(UndefinedClass):
    name = 'Edgar'

    def foo(self, x):
        print(x)


def func():
    print('Hello')

Usage:

p = Parse()
with open('a.py') as f:
    p.walk(ast.parse(f.read()))

print(p.c)

Output:

{'CClass': {'bases': [], 'attrs': ['name']}, 'MyClass': {'bases': ['UndefinedClass'], 'attrs': ['name']}}
Ajax1234

A ClassDef node has a bases attribute which is a list of nodes representing the base classes for the class. It also has a body attribute which is a list of nodes representing the body of the class definition. I suppose you want the Assign nodes in the body, but maybe you mean something slightly different by class attributes.

https://docs.python.org/3/library/ast.html#ast.ClassDef
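
As a rough illustration of the description above, here is a minimal sketch that loops over every ClassDef node and reads its bases and body. The names class_info and SAMPLE, and the extra pkg.Mixin base, are made up for this example; it assumes Python 3.9+ for ast.unparse and treats only plain and annotated assignments in the class body as "attributes".

```python
import ast

def class_info(source):
    """Map each class name in `source` to its base classes and class-level attributes."""
    info = {}
    # ast.walk visits every node, so nested classes are found as well
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        attrs = []
        for stmt in node.body:
            if isinstance(stmt, ast.Assign):
                # plain assignment, e.g. name = 'Edgar'
                attrs += [t.id for t in stmt.targets if isinstance(t, ast.Name)]
            elif isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                # annotated assignment, e.g. age: int = 3
                attrs.append(stmt.target.id)
        info[node.name] = {
            # unparse turns each base expression back into source text (Python 3.9+)
            "bases": [ast.unparse(b) for b in node.bases],
            "attrs": attrs,
        }
    return info

# Hypothetical input: the names it refers to never need to exist
SAMPLE = '''
from c import CClass

class MyClass(UndefinedClass, pkg.Mixin):
    name = 'Edgar'
    age: int = 3

    def foo(self, x):
        print(x)
'''

print(class_info(SAMPLE))
```

Nothing is imported or executed here, so undefined bases are reported as written. Attributes set on self inside methods are not class-body assignments and would need separate handling.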

Kyle Parsons
  • Yes, I guess this is ClassDef. Can you provide a working example of how I can loop through all classes and get the list of bases? The docs are extremely sketchy on this... – Edgar Navasardyan Oct 18 '21 at 14:10