I am iterating over a large DataFrame with a MultiIndex using iterrows. The result is a Series with a MultiIndex. After some profiling, it turned out that most of the time is spent retrieving individual cell values from that Series, so I would like to use Series.at, which is much faster. Unfortunately, I haven't found anything in the pandas documentation about using it with a MultiIndex.
Here is a simple example:
import numpy as np
import pandas as pd
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'], ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)
>>> s
first  second
bar    one      -0.761968
       two       0.670786
baz    one      -0.193843
       two      -0.251533
foo    one       1.732875
       two       0.538561
qux    one      -1.111480
       two       0.478322
dtype: float64
I have tried s.at[("bar","one")] and s.at["bar","one"], but neither of these works.
>>> s.at[("bar","one")]
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Python\lib\site-packages\pandas\core\indexing.py", line 2270, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
TypeError: _get_value() got multiple values for argument 'takeable'
>>> s.at["bar","one"]
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Python\lib\site-packages\pandas\core\indexing.py", line 2270, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
TypeError: _get_value() got multiple values for argument 'takeable'
Does anyone have any idea how to use .at in this case?
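For comparison, here is a minimal sketch of lookups on the same kind of Series that do not go through .at. It assumes the MultiIndex is unique and sorted, so that get_loc returns a single integer position, and it swaps the random data for np.arange so the values are predictable (that substitution is only for illustration). Whether any of these is fast enough for the loop described above would still need profiling.

```python
import numpy as np
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second"])
# Deterministic values (assumption: stands in for np.random.randn(8))
s = pd.Series(np.arange(8, dtype=float), index=index)

# Label-based lookup with the full key tuple via .loc
value_loc = s.loc[("bar", "one")]

# Resolve the label to an integer position once, then read it with .iat
pos = s.index.get_loc(("bar", "one"))
value_iat = s.iat[pos]

# In a tight loop, positional access on the underlying NumPy array
# skips the index lookup entirely
values = s.to_numpy()
total = 0.0
for i in range(len(values)):
    total += values[i]
```

The get_loc plus .iat route only pays the label lookup once per key, which may help if the same keys are read repeatedly.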