Given any Python object, is there a way to print out its structure so you can understand how to replace it with an object of equivalent structure?

Context: I've been trying to customise this TensorFlow example code so it works with my own data. I seem to be falling down on line 163, at the point where batch[0] and batch[1] are passed to the accuracy.eval() function. I think that batch will be a tuple of two different data types, but I'm really not sure what the data types of these two items are, nor how they relate to basic Python types (e.g. I think they should both be numpy ndarray types of different sizes, but does that make them equivalent to lists of lists...?)

I imagine I could sprinkle several print(type(batch[0])) statements to find the answer... across as many edits as my data types are deep. Is there code that will reveal this structure in a single hit, regardless of the data types involved?

Banana
omatai

1 Answer

As far as I know, Python has no built-in way to print the nested type structure of an arbitrary object, but somebody correct me if I'm wrong. I'm assuming you only care about how the types nest, not about array lengths and the like, although those should be easy to add to the output.
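As a rough illustration of how lengths and array shapes could be added to such output, here is a minimal sketch for a single level. The helper name describe and the stand-in batch are made up for this example; shape and dtype are read by duck typing, on the assumption that array-like objects such as NumPy arrays expose those attributes.

```python
def describe(obj):
    """Return a one-line summary: type name plus len/shape/dtype when available."""
    parts = [type(obj).__name__]
    # len() only works on sized objects; skip it quietly otherwise
    try:
        parts.append(f"len={len(obj)}")
    except TypeError:
        pass
    # Array-like objects (e.g. NumPy arrays) expose shape and dtype attributes
    for attr in ("shape", "dtype"):
        value = getattr(obj, attr, None)
        if value is not None:
            parts.append(f"{attr}={value}")
    return " ".join(parts)


# Hypothetical stand-in for an (inputs, labels) pair; plain lists keep the sketch dependency-free
batch = ([[0.0, 0.0, 0.0]] * 2, [[1, 0]] * 2)
for i, item in enumerate(batch):
    print(i, describe(item))
```

Calling describe(batch[0]) and describe(batch[1]) on the real batch should show at a glance whether the two items are arrays and what their shapes are.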

Recursive solution

You could write a recursive function that stops at a given recursion depth. I picked up something similar a while ago and adapted it a little. It gets ugly quickly, but in principle it works fine. The catch is that you have to add a branch yourself for every container type you want to recurse into.

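import numpy as np


def type_recurse(obj, depth=0):
    """Return a string describing the nested type structure of obj."""
    if depth > 20:  # stop on very deep or self-referencing structures
        return "..."
    if isinstance(obj, tuple):
        # tuples: describe every element
        return "<class 'tuple': <" + ", ".join(type_recurse(inner, depth + 1) for inner in obj) + ">>"
    elif isinstance(obj, list):
        # lists: only the first element is inspected (assumes a homogeneous list)
        return "<class 'list': " + (type_recurse(obj[0], depth + 1) if obj else "(empty)") + ">"
    elif isinstance(obj, dict):
        # dicts: describe every key:value pair
        return "<class 'dict': " + ", ".join(type_recurse(key, depth + 1) + ":" + type_recurse(val, depth + 1) for key, val in obj.items()) + ">"
    elif isinstance(obj, np.ndarray) and obj.ndim > 0:
        # arrays: describe every element along the first axis (0-d arrays cannot be iterated)
        return "<class 'np.ndarray': " + ", ".join(type_recurse(inner, depth + 1) for inner in obj) + ">"
    else:
        return str(type(obj))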

if __name__ == "__main__":
    a = (1, 1)
    b = (1, 1)
    c = (a, b)
    f = {"oh": c}
    g = np.array([1, "ah", f], dtype=object)  # mixed element types, so request an object array explicitly
    print(type_recurse(g))

Note that list and np.ndarray are indeed reported as different types, so an array is not simply a list of lists.
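To make the list versus array distinction concrete, here is a small sketch, assuming NumPy is installed. The variable names are made up for the example.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]   # plain Python list of lists
arr = np.array(nested)            # 2-D NumPy array built from it

print(type(nested), type(arr))    # two distinct types
print(isinstance(arr, list))      # an ndarray is not a list
print(arr.shape, arr.dtype)       # arrays carry a shape and a single dtype
print(arr.tolist() == nested)     # tolist() converts back to nested lists
```

So the two hold the same values, but code that expects an ndarray (for instance, code that reads .shape) will generally not accept a list of lists without a conversion via np.array().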

Iterative solution

I came up with another way: iterate over the object and collect the types for as long as the current element is itself iterable. I formatted the output so that it looks similar to the above, but you can change that to suit your needs. If you also need lengths (since you asked about replacing the object with one of the exact same structure), it should not be hard to append them whenever a new iterable layer opens. len() works for arrays, tuples and lists, so adding + str(len(value)) to the line that opens an iterable layer should work.

def iter_type(iterable):
    """Return the internal type structure of any iterable object as a string."""
    iterator, sentinel, stack = iter(iterable), object(), []
    fulltype = str(type(iterable))[0:-1] + ": "  # open highest layer
    while True:
        value = next(iterator, sentinel)
        if value is sentinel:
            # current layer is exhausted: drop the trailing separator and close it
            fulltype = fulltype.rstrip(" ,:") + ">"
            if not stack:
                break  # highest layer closed, done
            fulltype += " , "
            iterator = stack.pop()
        else:
            try:
                if isinstance(value, (str, bytes)):
                    # a one-character string iterates to itself and would never
                    # terminate, so str/bytes are treated as leaves
                    raise TypeError
                new_iterator = iter(value)
                fulltype += str(type(value))[0:-1] + ": "  # open iterable type layer
            except TypeError:  # non-iterable values (and strings)
                fulltype += str(type(value)) + " , "
            else:
                stack.append(iterator)
                iterator = new_iterator
    return fulltype

I can't guarantee full generality, so you'll need to test it. But the following example:

if __name__ == "__main__":
    print(iter_type([1,2,np.array([2,3]), [1,2],2]))

prints:

<class 'list': <class 'int'> , <class 'int'> , <class 'numpy.ndarray': <class 'numpy.int64'> , <class 'numpy.int64'>> , <class 'list': <class 'int'> , <class 'int'>> , <class 'int'>>

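As expected. (The NumPy element type depends on the platform's default integer, so it may show as numpy.int32 instead.) Nesting tuples also worked fine for me.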

Further remark: if your iterables are expected to contain many elements, it might be smarter to print each element type together with a count instead of listing the type of every element. That change is probably not hard to implement, but I haven't tried it, as I never needed it.
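One possible way to sketch that idea, for a single level only, is to count element types with collections.Counter. The function name summarize_types and the output format are made up for this example; extending it to nested layers is left open.

```python
from collections import Counter


def summarize_types(iterable):
    """Return the container type followed by a count of each element type."""
    # Counter keeps the order in which each type is first seen
    counts = Counter(type(item).__name__ for item in iterable)
    inner = ", ".join(f"{name} x {n}" for name, n in counts.items())
    return f"{type(iterable).__name__}: {inner}"


print(summarize_types([1, 2, 3, "a", 2.5]))  # list: int x 3, str x 1, float x 1
```

For a long homogeneous list this collapses thousands of repeated entries into one, at the cost of no longer showing the order of the elements.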

Banana