Nested loop over dataframe rows

Question

I would like to perform a nested loop over the rows of a dataframe, where the inner loop starts from the row after the outer loop's current row (outer_row + 1). Currently I use:

for o_index, o_row in df.iterrows():
    L1 = o_row['Home']
    L2 = o_row['Block']
    for i_index, i_row in df.iterrows():
        L3 = i_row['Home']
        L4 = i_row['Block']

As you can see, in the first iteration, i_index is the same as o_index. However, I want o_index to be 0 and i_index to be 1. How can I do that?

Example: Assume a dataframe like this:

     Cycle      Home     Block
0     100       1         400
1     130       1         500
2     200       2         200
3     300       1         300
4     350       3         100

The iterations should be in this order:

0 -> 1, 2, 3, 4

1 -> 2, 3, 4

2 -> 3, 4

3 -> 4

4 -> nothing

In each inner iteration, I compare L1 and L3; if they are equal, abs(L2-L4) is calculated and appended to a list.

What exactly are you trying to achieve? It is almost certain that using nested `iterrows` loops is the **wrong** approach. Please give concrete examples. – mozway, Jan 11 '23 at 13:11
Ideally, [don't loop at all](https://stackoverflow.com/a/55557758/15873043). What are you planning to do with the values? You can use `.shift()` to offset rows and process them all at once. – fsimonjetz, Jan 11 '23 at 13:12
@mahmood can you now give an example of the **operation** you will perform? – mozway, Jan 11 '23 at 13:20
@mahmood I have seen it, but you don't explain what you are computing. Can it be vectorized? – mozway, Jan 11 '23 at 13:25
I see, then it can be greatly simplified to a computation of combinations per group. See my answer. – mozway, Jan 11 '23 at 13:36

score 2 · Accepted Answer · answered Jan 11 '23 at 13:35

There is no need for an explicit loop with an equality test: what you want is every pairwise combination of Block values within the same Home. So compute exactly that:

from itertools import combinations

out = [abs(L2-L4) for _, g in df.groupby('Home')
       for L2, L4 in combinations(g['Block'], r=2)]

Output:

[100, 100, 200]
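
For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the sample dataframe from the question and wraps the comprehension in a function. The name `same_home_block_diffs` is made up for illustration and is not part of pandas.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Cycle": [100, 130, 200, 300, 350],
    "Home": [1, 1, 2, 1, 3],
    "Block": [400, 500, 200, 300, 100],
})


def same_home_block_diffs(frame):
    """Return abs(Block_a - Block_b) for every pair of rows sharing a Home.

    Hypothetical helper: it only wraps the comprehension from the answer.
    """
    return [
        abs(a - b)
        for _, group in frame.groupby("Home")       # one group per Home value
        for a, b in combinations(group["Block"], r=2)  # each unordered pair once
    ]


out = same_home_block_diffs(df)
print(out)
```

Note that `groupby` sorts the groups by key by default, so the results are ordered by Home value rather than by outer row. For this sample the two orders happen to coincide, but in general they may differ from what the nested loop would produce.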

score 1 · Answer 2 · answered Jan 11 '23 at 13:13

For your specific problem, I guess df.iterrows() may not be optimal. You should consider iterating over integer positions and using df.iloc instead.

for o_index in range(len(df)):
    o_row = df.iloc[o_index]
    L1 = o_row['Home']
    L2 = o_row['Block']
    for i_index in range(o_index + 1, len(df)):
        i_row = df.iloc[i_index]
        L3 = i_row['Home']
        L4 = i_row['Block']
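
To show where the comparison from the question would go, here is a minimal sketch of the same positional loop that also collects the results. It is one possible way to finish the loop, not necessarily what the asker ended up with. Pulling the two columns into plain lists first is my own choice (it avoids a `df.iloc` lookup per row), and `pairwise_block_diffs` is a hypothetical name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Cycle": [100, 130, 200, 300, 350],
    "Home": [1, 1, 2, 1, 3],
    "Block": [400, 500, 200, 300, 100],
})


def pairwise_block_diffs(frame):
    """Compare each row with every later row; collect abs(Block difference)
    whenever the two rows share the same Home. Hypothetical helper."""
    homes = frame["Home"].tolist()
    blocks = frame["Block"].tolist()
    diffs = []
    for o_pos in range(len(frame)):
        # The inner loop starts one row after the outer row.
        for i_pos in range(o_pos + 1, len(frame)):
            if homes[o_pos] == homes[i_pos]:
                diffs.append(abs(blocks[o_pos] - blocks[i_pos]))
    return diffs


result = pairwise_block_diffs(df)
print(result)
```

Because it works on positions only, this version does not depend on what the dataframe's index labels are.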

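Otherwise, if you really want to use df.iterrows(), this should work, provided the dataframe has the default integer index (0, 1, 2, ...), because only then is o_index also the row's position for df.iloc: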

for o_index, o_row in df.iterrows():
    L1 = o_row['Home']
    L2 = o_row['Block']
    for i_index, i_row in df.iloc[o_index+1:].iterrows():
        L3 = i_row['Home']
        L4 = i_row['Block']

Matteo Zanoni

Without knowing what OP wants to do it's also not unlikely that loops are still a bad approach ;) – mozway Jan 11 '23 at 13:14
@mozway I agree with you, but the question is titled `Nested loop over dataframe rows`, so I wanted to provide an answer for that. The rest is out of the scope of this question, I guess. – Matteo Zanoni Jan 11 '23 at 13:17
I wrote an example. Your proposed method seems good. Thank you. – mahmood Jan 11 '23 at 13:18
