Iterating over rows in a dataframe in Pandas: is there a difference between using df.index and df.iterrows() as iterators?

Question

When iterating through rows in a dataframe in Pandas, is there a difference in performance between using:

for index in df.index:
    ....

And:

for index, row in df.iterrows():
    ....

? Which one should be preferred?

Related: [Best way to iterate through elements of pandas Series](https://stackoverflow.com/a/68717336/13138364) — tdy, Dec 04 '21 at 18:26

Nicolai B. Thomsen · Answer 1 · 2021-12-04T17:45:07.737

Pandas is significantly faster for column-wise operations, so consider transposing your dataset and carrying out the operation column-wise instead. If you absolutely need to iterate through rows and want to keep it simple, you can use:

for row in df.itertuples():
    print(row.column_1)

df.itertuples() is significantly faster than both df.iterrows() and iterating over df.index. However, there are faster ways to perform row-wise operations. Check out this answer for an overview.
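To check the speed claim on your own machine, a rough sketch like the one below may help. The sample frame, its column names, and the three helper functions are made up for illustration, and the timings will vary with data size and pandas version, so treat the printed numbers as indicative only.

```python
import timeit

import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"column_1": range(1000), "column_2": range(1000, 2000)})


def sum_with_index(frame):
    # Loop over the index labels and look each value up with .loc.
    total = 0
    for index in frame.index:
        total += frame.loc[index, "column_1"]
    return total


def sum_with_iterrows(frame):
    # Each row is materialized as a Series.
    total = 0
    for _, row in frame.iterrows():
        total += row["column_1"]
    return total


def sum_with_itertuples(frame):
    # Each row is a lightweight namedtuple.
    total = 0
    for row in frame.itertuples():
        total += row.column_1
    return total


# All three loops compute the same value; only the per-row overhead differs.
timings = {
    fn.__name__: timeit.timeit(lambda fn=fn: fn(df), number=5)
    for fn in (sum_with_index, sum_with_iterrows, sum_with_itertuples)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

The loops are deliberately trivial (a column sum) so that the measured time is dominated by the iteration mechanism rather than by the work done per row.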

score 2 · Accepted Answer · answered Dec 04 '21 at 17:31

When we loop over df.index, getting the data for each index requires an additional loc lookup:

for index in df.index:
    value = df.loc[index, 'col']

When we use df.iterrows(), the row is already available in the loop:

for index, row in df.iterrows():
    value = row['col']
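One practical difference between the two loops, beyond speed: df.iterrows() builds a Series per row, so a row that spans integer and float columns is upcast to a single dtype, while a .loc lookup or df.itertuples() reads the value from its own column. A small sketch, assuming a made-up frame with columns qty and price:

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
df = pd.DataFrame({"qty": [1, 2], "price": [1.5, 2.5]})

# iterrows(): the row is a Series, so mixed numeric columns share one dtype.
_, first_row = next(df.iterrows())
print(first_row.dtype)            # dtype of the whole row Series
print(type(first_row["qty"]))     # the integer has been upcast to a float type

# itertuples(): each field comes straight from its own column.
first_tuple = next(df.itertuples(index=False))
print(type(first_tuple.qty))      # still an integer type

# .loc on the index label also reads from the original column.
print(type(df.loc[df.index[0], "qty"]))
```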

Since you are already using pandas, neither of them is recommended, unless you need a particular function that cannot be vectorized.

However, IMO, I prefer df.index.
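To illustrate what "vectorized" means in the advice above, here is a minimal sketch with a made-up frame (the columns price and qty are assumptions, not from the question). The column expression and the explicit loop produce the same values, but the first one avoids Python-level iteration entirely.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Vectorized: one expression over whole columns, no explicit row loop.
df["total"] = df["price"] * df["qty"]

# Equivalent explicit loop, kept only for comparison.
loop_totals = [row.price * row.qty for row in df.itertuples(index=False)]

print(df["total"].tolist())
print(loop_totals)
```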
