0

I have a pandas DataFrame with shape (133, 6), and I am trying to use iloc to walk down the frame, select chunks of rows, and append them to a list.

I have a list of start positions:

start_positions = [6, 32, 58, 84, 110]

Each chunk ends 7 rows after its start position, so I tried this:

frames = []
for x in start_positions:
    frames.append(df.iloc[start_positions[x] : start_positions[x]+7])

However, this throws:

IndexError: index 6 is out of bounds for axis 0 with size 5

I don't quite understand this, because it works if I manually step through start_positions.

Maverick
  • What are you trying to achieve out of this code @Maverick? – Vishnudev Krishnadas Jan 21 '20 at 09:21
  • Each of the chunks represents a year of data and each column represents a different category. So the aim is to pull out the data and store it in separate locations corresponding to the year. – Maverick Jan 21 '20 at 09:26
  • `for x in start_positions:` x are *not indices*! You use `start_positions[x]` , thus you see your error. Forget the dataframe, your issue occurs with you trying to index the list itself. – Paritosh Singh Jan 21 '20 at 09:32
  • Does this answer your question? [Python Loop: List Index Out of Range](https://stackoverflow.com/questions/37619848/python-loop-list-index-out-of-range) – Paritosh Singh Jan 21 '20 at 09:35

3 Answers

1

The problem is in the for loop itself. Look at start_positions[x] in frames.append(df.iloc[start_positions[x] : start_positions[x]+7]). Because the loop iterates over the elements of start_positions, x is 6 on the first pass, but the largest valid index into start_positions is 4, since len(start_positions) = 5.
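
To see the failure without the DataFrame, here is a minimal sketch. It assumes start_positions is a plain Python list; the error message in the question suggests it may actually be a NumPy array, in which case the wording of the message differs but the cause is the same. The names seen and error_message are only for illustration.

```python
start_positions = [6, 32, 58, 84, 110]

# Iterating over a list yields its elements, not their positions.
seen = [x for x in start_positions]

# Using the first element (6) as an index fails: valid indices are 0 to 4.
try:
    start_positions[start_positions[0]]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(seen)
print(error_message)
```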

@Maverick, I think what you want is to drop the start_positions[...] indexing and use x directly, something like this (untested):

frames = []
for x in start_positions:
    frames.append(df.iloc[x : x+7])

eddys
1

Use for x in range(len(start_positions)) instead of for x in start_positions, so that x is an index rather than an element:

frames = []
for x in range(len(start_positions)):
    print(start_positions[x],":",start_positions[x]+7)
    frames.append(df.iloc[start_positions[x] : start_positions[x]+7])

This prints:

6 : 13
32 : 39
58 : 65
84 : 91
110 : 117
nucsit026
0

Another possible solution would be to use:

frames = []
for x in start_positions:
    frames.append(df.iloc[x:x+7])

Here x is an element of start_positions, so it can be used directly as the slice start. Only if x were an index would you need start_positions[x], the way you wrote it.
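
For a runnable check, here is a sketch on a made-up frame. The DataFrame contents, the column names, and the names chunk_size and by_index are assumptions for illustration, not the asker's data. It builds the chunks by iterating over the elements and, for comparison, by iterating over indices.

```python
import pandas as pd

# Hypothetical stand-in for the 133 x 6 frame from the question.
df = pd.DataFrame({f"col{i}": range(133) for i in range(6)})

start_positions = [6, 32, 58, 84, 110]
chunk_size = 7

# Iterate over the elements: each one is already a row position.
frames = [df.iloc[start : start + chunk_size] for start in start_positions]

# Equivalent index-based version, for comparison.
by_index = [
    df.iloc[start_positions[i] : start_positions[i] + chunk_size]
    for i in range(len(start_positions))
]

print([frame.shape for frame in frames])
```

Both lists hold the same five slices; iterating over the elements just avoids the extra lookup.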

Vishakha Lall