As far as I know, Python has no built-in way to show the inner structure of an object, but somebody correct me if I'm wrong. I'm assuming you only care about how the types are nested, not about array lengths or the like, although those would be easy to add to the output.
Recursive solution
You might want to write a recursive function that terminates after a given recursion depth. I picked up something similar a while ago somewhere and changed it a little.
This gets ugly really fast, but in principle works fine. However, you have to add a branch yourself for every container type you want to recurse into.
import numpy as np

def type_recurse(obj, depth=0):
    # stop descending once the nesting gets too deep
    if depth > 20:
        return "..."
    if isinstance(obj, tuple):
        return "<class 'tuple': <" + ", ".join(type_recurse(inner, depth+1) for inner in obj) + ">>"
    elif isinstance(obj, list):
        # only the first element is inspected, i.e. the list is assumed to be homogeneous
        return "<class 'list': " + (type_recurse(obj[0], depth+1) if obj else '(empty)') + ">"
    elif isinstance(obj, dict):
        return "<class 'dict': " + ", ".join(type_recurse(key, depth+1)+":"+type_recurse(val, depth+1) for key, val in obj.items()) + ">"
    elif isinstance(obj, np.ndarray):
        return "<class 'np.ndarray': " + ", ".join(type_recurse(inner, depth+1) for inner in obj) + ">"
    else:
        # anything else is treated as a leaf
        return str(type(obj))

if __name__ == "__main__":
    a = (1, 1)
    b = (1, 1)
    c = (a, b)
    f = {"oh": c}
    g = np.array([1, "ah", f])
    print(type_recurse(g))
Also note that list and array are indeed recognized as different types.
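If adding one branch per container type becomes tedious, the abstract base classes in collections.abc can do part of the dispatching. The following is only a sketch under my own assumptions, not something I have used in anger: the name type_recurse_generic and the max_depth parameter are made up for illustration, strings are deliberately treated as leaves, and anything else that defines __iter__ is walked element by element (which would consume a generator).

```python
from collections.abc import Iterable, Mapping

def type_recurse_generic(obj, depth=0, max_depth=20):
    """Return the nested type structure of obj as a string."""
    if depth > max_depth:
        return "..."
    name = type(obj).__name__
    # Strings and bytes are iterable, but we treat them as leaves.
    if isinstance(obj, (str, bytes, bytearray)):
        return name
    # Mappings: describe each key/value pair.
    if isinstance(obj, Mapping):
        inner = ", ".join(
            type_recurse_generic(key, depth + 1, max_depth) + ": "
            + type_recurse_generic(val, depth + 1, max_depth)
            for key, val in obj.items()
        )
        return f"{name}<{inner}>"
    # Any other iterable (list, tuple, set, ...): describe each element.
    if isinstance(obj, Iterable):
        inner = ", ".join(
            type_recurse_generic(item, depth + 1, max_depth) for item in obj
        )
        return f"{name}<{inner}>"
    # Everything else is a leaf.
    return name

if __name__ == "__main__":
    print(type_recurse_generic({"a": (1, [2.5]), "b": {3}}))
    # prints: dict<str: tuple<int, list<float>>, str: set<int>>
```

Unlike the version above, this one looks at every element of a list, not just the first one.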
Iterative solution
I came up with another way: iterate over your object and collect the types for as long as the current object contains any iterable structure. I formatted the output to look similar to the above, but you can change that to suit your needs. Also, if you need lengths and such (because you said replace with the exact same object structure), it should not be a problem to append them whenever one of the iterable layers opens. len() works for arrays, tuples and lists, so a simple +str(len(value)) in the try block should work.
def iter_type(iterable):
    """ returns the internal type structure of any iterable object as a string """
    iterator, sentinel, stack = iter(iterable), object(), []
    fulltype = str(type(iterable))[0:-1]+": "  # open highest layer
    while True:
        value = next(iterator, sentinel)
        if value is sentinel:
            if fulltype.endswith(" , "):
                fulltype = fulltype[0:-3]  # neglect last comma (absent for empty iterables)
            if not stack:
                fulltype += ">"  # close highest layer
                break
            fulltype += "> , "  # iterator closes
            iterator = stack.pop()
        else:
            try:
                # strings count as non-iterable values here: a one-character
                # string yields itself when iterated, so descending would never end
                if isinstance(value, (str, bytes)):
                    raise TypeError
                new_iterator = iter(value)
                fulltype += str(type(value))[0:-1]+": "  # open iterable type layer
            except TypeError:  # non-iterable values
                fulltype += str(type(value))+" , "
            else:
                stack.append(iterator)
                iterator = new_iterator
    return fulltype
I can't guarantee full generality, so you'll need to test it. But the following example:
if __name__ == "__main__":
    print(iter_type([1, 2, np.array([2, 3]), [1, 2], 2]))
prints:
<class 'list': <class 'int'> , <class 'int'> , <class 'numpy.ndarray': <class 'numpy.int64'> , <class 'numpy.int64'>> , <class 'list': <class 'int'> , <class 'int'>> , <class 'int'>>
As expected. Involving tuples also worked fine for me.
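For the lengths mentioned earlier, a variant could look roughly like the sketch below. Treat it as an illustration under my own assumptions: the name iter_type_len and the " len=N" notation are invented here, strings are treated as non-iterable values, and objects without __len__ (generators, for example) are reported as plain values instead of being opened.

```python
def iter_type_len(iterable):
    """Return the nested type structure of an iterable, with container lengths."""
    iterator, sentinel, stack = iter(iterable), object(), []
    # open the highest layer, e.g. "<class 'list' len=3: "
    fulltype = str(type(iterable))[:-1] + " len=" + str(len(iterable)) + ": "
    while True:
        value = next(iterator, sentinel)
        if value is sentinel:
            # drop the trailing " , " (absent when the layer was empty)
            if fulltype.endswith(" , "):
                fulltype = fulltype[:-3]
            if not stack:
                return fulltype + ">"  # close the highest layer
            fulltype += "> , "  # close the current layer
            iterator = stack.pop()
        elif isinstance(value, (str, bytes)) or not hasattr(value, "__len__"):
            # strings and unsized objects are reported as plain values
            fulltype += str(type(value)) + " , "
        else:
            try:
                new_iterator = iter(value)
            except TypeError:  # sized but not iterable
                fulltype += str(type(value)) + " , "
            else:
                # open a new layer and remember where to continue afterwards
                fulltype += str(type(value))[:-1] + " len=" + str(len(value)) + ": "
                stack.append(iterator)
                iterator = new_iterator

if __name__ == "__main__":
    print(iter_type_len([1, (2.0, 3.0), []]))
    # prints: <class 'list' len=3: <class 'int'> , <class 'tuple' len=2: <class 'float'> , <class 'float'>> , <class 'list' len=0: >>
```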
Further remark
In case your iterables are expected to contain many elements, it might be smarter to print each element type together with its count instead of listing the type of every element. That change is probably not hard to implement, but I haven't tried it as I never needed it.
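One possible starting point for such a summary is collections.Counter. This is an untested-in-practice sketch: the function name summarize_types is hypothetical, and it only looks at one level of the iterable, so it would still have to be combined with one of the approaches above to handle nesting.

```python
from collections import Counter

def summarize_types(iterable):
    """Return the container type followed by a count per element type."""
    # Counter keeps the order in which the types were first seen
    counts = Counter(type(item).__name__ for item in iterable)
    body = ", ".join(f"{n} x {name}" for name, n in counts.items())
    return f"{type(iterable).__name__}: {body if body else '(empty)'}"

if __name__ == "__main__":
    print(summarize_types([1, 2, "a", 3]))
    # prints: list: 3 x int, 1 x str
```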