This is to be expected with pandas. Doing single operations in pandas like this is bound to be slow, there are several Python layers you have to get to to even begin to do the actual "work", e.g. several Python-level functions. Here's the source code
return _AtIndexer("at", self)
Note, it's part of a mixin, which will slow things down already. But more importantly, it returns an instance of _AtIndexer
, which is some other object
That is when you finally get to a __getitem__
method, but even then, look what it needs to do:
def __getitem__(self, key):
if self.ndim == 2 and not self._axes_are_unique:
# GH#33041 fall back to .loc
if not isinstance(key, tuple) or not all(is_scalar(x) for x in key):
raise ValueError("Invalid call for scalar access (getting)!")
return self.obj.loc[key]
return super().__getitem__(key)
So it does a Python-level conditional, but the standard case is actually just
return super().__getitem__(key)
So, another Python-level function call. What does it do? Here's the source code
def __getitem__(self, key):
if not isinstance(key, tuple):
# we could have a convertible item here (e.g. Timestamp)
if not is_list_like_indexer(key):
key = (key,)
else:
raise ValueError("Invalid call for scalar access (getting)!")
key = self._convert_key(key)
return self.obj._get_value(*key, takeable=self._takeable)
Soooo even more Python-level conditionals, some other python level method calls...
We can keep digging, to see what self.obj._get_value
does, which is implemented in yet some other base class, but I think you understand the point by now.
In a list/dict, once you get passed the intial method resolution, you are in the C-layer, and it does all the work there. In pandas, you do a ton of overhead in the Python layer, before it get's pushed, eventually, hopefully, into numpy
, where the speed of the bulk operations occurs. It has no hope of beating the operations done on a built-in python data structure though, which are much thinner wrappers around compiled code. The performance of pandas dies a death by a thousand cuts of various method calls, internal bookeeping logic done in Python,
EDIT: I noticed, I actually went through the __getitem__
logic, but the point still stands for __setitem__
, indeed, the intermediate steps seem to involve even more work.