I've been using pandas for almost six months, and in my view one of the biggest debates is about iterating over DataFrames with .iterrows(), .apply(), or a list comprehension to compute new data.
I have been advised many times to use .loc or similar accessors to write data whenever possible. The problem is that when I have many conditionals, what I used to solve in one line of code now takes many lines of .iloc assignments to fill in the data.
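To make the trade-off concrete, here is a minimal sketch of the two styles I mean. The data, the column names, and the `grade` function are all made up for illustration; my real conditionals are different, but the shape of the problem is the same.

```python
import pandas as pd

# Hypothetical data; the column names are invented for this example.
df = pd.DataFrame({"score": [35, 62, 78, 91]})


def grade(score):
    # All the conditional logic lives in one plain Python function.
    if score >= 90:
        return "A"
    if score >= 70:
        return "B"
    if score >= 50:
        return "C"
    return "D"


# Iteration style: a single line, but grade() is called once per value.
df["grade_apply"] = df["score"].apply(grade)

# Accessor style: one boolean-mask assignment per condition.
# Order matters here, because later assignments overwrite earlier ones.
df["grade_loc"] = "D"
df.loc[df["score"] >= 50, "grade_loc"] = "C"
df.loc[df["score"] >= 70, "grade_loc"] = "B"
df.loc[df["score"] >= 90, "grade_loc"] = "A"
```

Both columns end up with the same values. The first version keeps the logic in one place, while the second needs one line per condition and grows with every extra rule.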
In a nutshell: does it pay off to always avoid iteration at the cost of much longer code, even when the DataFrames are not huge?
Can anybody recommend some articles that explain this efficiency trade-off?