I've been working with numpy and pandas for a long time, but I'm still often confused by what it means to do an operation along an axis.
For example, if I have data of shape [200, 5] and I want the mean with a resulting shape of [1, 5], I would first call data.mean(axis=0), and if that doesn't work, I would try data.mean(axis=1).
It turns out that axis=0 is correct in this case, but I don't have a good mental model for remembering which axis to use.
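To make the example concrete, here is a minimal NumPy sketch of that case. The array contents and variable names are made up for illustration. Note that plain NumPy drops the reduced axis, so the result has shape (5,); passing keepdims=True is what gives (1, 5).

```python
import numpy as np

# Hypothetical data: 200 rows, 5 columns (the values are arbitrary).
data = np.arange(1000, dtype=float).reshape(200, 5)

col_means = data.mean(axis=0)                    # collapses the 200 rows
row_means = data.mean(axis=1)                    # collapses the 5 columns
col_means_2d = data.mean(axis=0, keepdims=True)  # keeps the collapsed axis with length 1

print(col_means.shape)     # (5,)
print(row_means.shape)     # (200,)
print(col_means_2d.shape)  # (1, 5)
```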
My current rule of thumb is that the axis I pass is the one whose length I want reduced to 1.
This works fine for reduction operations like mean, sum, or std.
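As a quick check of that rule, here is a small sketch on a 3-D array (the shape (2, 3, 4) is an arbitrary choice for illustration). For each axis, the reduced result should have the original shape with that one axis removed.

```python
import numpy as np

# Arbitrary 3-D array chosen so that every axis has a different length.
x = np.ones((2, 3, 4))

for axis in range(x.ndim):
    # Expected shape: the original shape with the reduced axis dropped.
    expected = x.shape[:axis] + x.shape[axis + 1:]
    for func in (np.mean, np.sum, np.std):
        assert func(x, axis=axis).shape == expected

shapes = [np.sum(x, axis=a).shape for a in range(x.ndim)]
print(shapes)  # [(3, 4), (2, 4), (2, 3)]
```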
But I don't know how to think about operations that do not reduce the shape, like divide, add, sort, etc. (For divide and add on operands of different shapes, broadcasting is involved.)
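Here is a small sketch of the divide case I have in mind, assuming a made-up 2x3 array. The reduction picks the axis, and keepdims=True leaves a length-1 axis behind so that broadcasting lines the result up with the original array.

```python
import numpy as np

# Made-up example values.
data = np.array([[1.0, 2.0, 3.0],
                 [3.0, 6.0, 9.0]])  # shape (2, 3)

col_sums = data.sum(axis=0, keepdims=True)  # shape (1, 3)
row_sums = data.sum(axis=1, keepdims=True)  # shape (2, 1)

# Broadcasting stretches the length-1 axis to match data's shape (2, 3).
by_col = data / col_sums  # every column now sums to 1
by_row = data / row_sums  # every row now sums to 1

print(by_col.shape, by_row.shape)  # (2, 3) (2, 3)
```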
So it made me curious about how the people who created pandas and numpy think intuitively about this. I'm also curious what exactly they mean when they say "sorting along the row axis".
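For reference, this is what np.sort does on a small made-up array for each axis. With axis=0 the values are reordered within each column, and with axis=1 within each row; the shape is unchanged in both cases.

```python
import numpy as np

# Made-up example values.
a = np.array([[3, 1, 2],
              [0, 5, 4]])

sorted_axis0 = np.sort(a, axis=0)  # sorts down each column
sorted_axis1 = np.sort(a, axis=1)  # sorts across each row

print(sorted_axis0)
# [[0 1 2]
#  [3 5 4]]
print(sorted_axis1)
# [[1 2 3]
#  [0 4 5]]
```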
I want to understand it so clearly that I know what result to expect when I pass a given axis!