If I rename a column in a pandas data frame, I can no longer access that column by its new name.
The following example illustrates the problem:
import pandas as pd
df = pd.DataFrame({'a':[1,2], 'b': [10,20]})
df
   a   b
0  1  10
1  2  20
df['a']
0 1
1 2
Now I rename column 'a' in the manner suggested here:
df.columns.values[0] = 'newname'
df
   newname   b
0        1  10
1        2  20
Now I try to access the column using 'newname':
df['newname']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/gpfs0/export/opt/anaconda-2.3.0/lib/python2.7/site-packages/pandas/core/frame.py", line 1797, in __getitem__
return self._getitem_column(key)
File "/gpfs0/export/opt/anaconda-2.3.0/lib/python2.7/site-packages/pandas/core/frame.py", line 1804, in _getitem_column
return self._get_item_cache(key)
File "/gpfs0/export/opt/anaconda-2.3.0/lib/python2.7/site-packages/pandas/core/generic.py", line 1084, in _get_item_cache
values = self._data.get(item)
File "/gpfs0/export/opt/anaconda-2.3.0/lib/python2.7/site-packages/pandas/core/internals.py", line 2851, in get
loc = self.items.get_loc(item)
File "/gpfs0/export/opt/anaconda-2.3.0/lib/python2.7/site-packages/pandas/core/index.py", line 1572, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
KeyError: 'newname'
Yet I can still access the column by the old name.
df['a']
0 1
1 2
Name: a, dtype: int64
It seems that I've changed the displayed name of the column, but the change did not propagate to the lookup structure used to dereference columns in the data frame.
QUESTION: Why does this happen, and how do I fix it?
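For comparison, a rename through the public API may be a useful reference point. The sketch below is a minimal example, assuming a reasonably recent pandas; `renamed` is a variable name introduced here purely for illustration. It uses `DataFrame.rename`, which returns a new frame instead of writing into `df.columns.values`, and in that case the lookup by the new name succeeds.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

# Rename through the public API rather than writing into the
# array returned by df.columns.values.
# rename() returns a new DataFrame; df itself is left unchanged.
renamed = df.rename(columns={'a': 'newname'})

print(list(renamed.columns))        # ['newname', 'b']
print(renamed['newname'].tolist())  # [1, 2]
print('a' in renamed.columns)       # False
```

Assigning a complete list of labels, as in `df.columns = ['newname', 'b']`, is another documented way to replace the column names. Both approaches avoid modifying the underlying array in place.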