Convert custom class to standard Python type

Question

I was working with a numpy array called predictions. I was playing around with the following code:

print type(predictions)
print list(predictions)

The output was:

<type 'numpy.ndarray'>
[u'yes', u'no', u'yes', u'yes', u'yes']

I was wondering how numpy managed to build their ndarray class so that it could be converted to a list not with their own list function, but with the standard Python function.

Python version: 2.7, Numpy version: 1.9.2

Note that `list(whatever)` will work if `whatever.__iter__` is implemented - i.e. Python will consume the iterator and put each resulting object into a list. — jonrsharpe, Jul 23 '15 at 16:33

jonrsharpe · Accepted Answer · 2015-07-23T16:44:24.497

I have answered from the pure Python perspective below, but numpy's arrays are actually implemented in C - see e.g. the array_iter function.

The documentation defines the argument to list as an iterable; new_list = list(something) works a little bit like:

new_list = []
for element in something:
    new_list.append(element)

(or, in a list comprehension: new_list = [element for element in something]). Therefore to implement this behaviour for a custom class, you need to define the __iter__ magic method:

>>> class Demo(object):
...     def __iter__(self):
...         return iter((1, 2, 3))
...
>>> list(Demo())
[1, 2, 3]

Note that conversion to other types will require different methods.
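As a rough illustration of that last point, here is a minimal sketch of the hooks Python looks for when converting to a few other built-in types. The Celsius class and its attribute are hypothetical, made up for this example only; they have nothing to do with numpy.

```python
class Celsius(object):
    """Hypothetical value type, used only to illustrate conversion hooks."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __int__(self):
        # Called by int(obj)
        return int(self.degrees)

    def __float__(self):
        # Called by float(obj)
        return float(self.degrees)

    def __str__(self):
        # Called by str(obj)
        return "%.1f C" % self.degrees

    def __bool__(self):
        # Called by bool(obj) on Python 3
        return self.degrees != 0

    # Python 2 looks for __nonzero__ instead of __bool__
    __nonzero__ = __bool__


temp = Celsius(21.5)
as_int = int(temp)
as_float = float(temp)
as_text = str(temp)
as_bool = bool(temp)
```

Each built-in simply calls the matching magic method, in the same way that list() relies on __iter__.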

score 2 · Answer 2 · edited May 23 '17 at 12:14

As others have written, list() works because an array is an iterable. It is equivalent to [i for i in arr]. To understand it you need to understand how iteration over an array works. In particular, list(arr) is not the same as arr.tolist().

In [685]: arr=np.array('one two three four'.split())

In [686]: arr
Out[686]: 
array(['one', 'two', 'three', 'four'], 
      dtype='<U5')

In [687]: ll=list(arr)

In [688]: ll
Out[688]: ['one', 'two', 'three', 'four']

In [689]: type(ll[0])
Out[689]: numpy.str_

In [690]: ll1=arr.tolist()

In [691]: ll1
Out[691]: ['one', 'two', 'three', 'four']

In [692]: type(ll1[0])
Out[692]: str

The printed display of ll and ll1 looks the same, but the element types differ: the elements of ll1 (from arr.tolist()) are plain str, while the elements of ll (from list(arr)) are numpy.str_, i.e. a str wrapped in a numpy class. That distinction has come up in a recent question about serializing an array.

The distinction becomes more obvious when arr is 2d. Simple iteration then produces the rows, not the elements:

In [693]: arr=np.reshape(arr,(2,2))

In [694]: arr
Out[694]: 
array([['one', 'two'],
       ['three', 'four']], 
      dtype='<U5')

In [695]: list(arr)
Out[695]: 
[array(['one', 'two'], 
       dtype='<U5'), array(['three', 'four'], 
       dtype='<U5')]

In [696]: arr.tolist()
Out[696]: [['one', 'two'], ['three', 'four']]

list(arr) is now two arrays, while arr.tolist() is a nested list.
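The same split can be mimicked in pure Python. The sketch below is a toy stand-in, not how numpy is implemented: Grid and Row are hypothetical classes whose __iter__ walks only the first axis, while tolist() converts all the way down to plain lists.

```python
class Row(object):
    """Hypothetical 1-D container standing in for a 1-D array."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def tolist(self):
        return list(self._items)


class Grid(object):
    """Hypothetical 2-D container standing in for a 2-D array."""

    def __init__(self, rows):
        self._rows = [Row(r) for r in rows]

    def __iter__(self):
        # Iteration only walks the first axis, so each item is a whole Row
        return iter(self._rows)

    def tolist(self):
        # Recurse into every row, producing nested plain lists
        return [row.tolist() for row in self._rows]


grid = Grid([["one", "two"], ["three", "four"]])
as_list = list(grid)        # a list of two Row objects
as_nested = grid.tolist()   # a nested list of plain strings
```

list() knows nothing about the container beyond its iterator, so it can only ever go one level deep; a dedicated method such as tolist() is free to recurse.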

Python Pandas: use native types

Why does json.dumps(list(np.arange(5))) fail while json.dumps(np.arange(5).tolist()) works
