Both of the following forms show up in our codebase:
pandas.DataFrame.columns.values.tolist()
pandas.DataFrame.columns.tolist()
Are these always identical? I'm not sure why the values
variant shows up where it does; the direct columns.tolist()
seems to be all that's needed to get the column names. If that's the case, I'd like to clean up the code a bit.
A bit of introspection suggests that values is just an implementation detail, namely the underlying numpy.ndarray:
>>> import pandas
>>> d = pandas.DataFrame( { 'a' : [1,2,3], 'b' : [0,1,3]} )
>>> d
   a  b
0  1  0
1  2  1
2  3  3
>>> type(d.columns)
<class 'pandas.core.indexes.base.Index'>
>>> type(d.columns.values)
<class 'numpy.ndarray'>
>>> type(d.columns.tolist())
<class 'list'>
>>> type(d.columns.values.tolist())
<class 'list'>
>>> d.columns.values
array(['a', 'b'], dtype=object)
>>> d.columns.values.tolist()
['a', 'b']
>>> d.columns
Index(['a', 'b'], dtype='object')
>>> d.columns.tolist()
['a', 'b']
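To go beyond plain string labels, a quick check like the following could compare the two forms on other kinds of columns. This is only a sketch: the helper name compare_forms is made up for illustration, and the datetime-labelled frame is an assumed test case, not something taken from our codebase. It prints the element types rather than asserting a particular result, since what values.tolist() yields for datetime labels may depend on the pandas and NumPy versions in use.

```python
import pandas


def compare_forms(df):
    """Return the column labels obtained via both spellings (hypothetical helper)."""
    direct = df.columns.tolist()
    via_values = df.columns.values.tolist()
    return direct, via_values


# Plain string labels, same frame as in the session above.
d = pandas.DataFrame({'a': [1, 2, 3], 'b': [0, 1, 3]})
direct, via_values = compare_forms(d)
print(direct == via_values)

# Datetime labels (assumed example): here the element types can differ,
# because .values drops down to a NumPy datetime64 array first.
dt = pandas.DataFrame(
    [[1, 2]],
    columns=pandas.to_datetime(['2020-01-01', '2020-01-02']),
)
direct_dt, via_values_dt = compare_forms(dt)
print([type(x).__name__ for x in direct_dt])
print([type(x).__name__ for x in via_values_dt])
```

If the element types printed by the last two lines differ, then the two spellings are not interchangeable for every kind of column index, even though they agree for string labels.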