
I'm looking for a Pythonic way to iterate over a dataframe's index so that I can split a computationally heavy job into a series of smaller chunks to run. The output of each chunk will be appended to a CSV in order to avoid hitting resource limits.

For example, if I have some list whose length is prime, I'd like to split that list into a number of lists of relatively equal length, run the computations against each set, and append the output of that set to a CSV (rough sketch below). Rinse and repeat all the way down the index of the dataframe until every row has been processed.

e.g.

  1. Run some function on the first 10,000 rows - store in CSV
  2. Run on rows 10,001 - 20,000 - store in CSV
  3. .....
  4. Run through row 111,376 - store in CSV
  5. End.
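
Roughly the kind of split I have in mind, as a sketch (the length and number of pieces here are just examples):

```python
import numpy as np

# Even when the length doesn't divide evenly, array_split returns pieces
# whose lengths differ by at most one.
items = list(range(100_003))
pieces = np.array_split(items, 10)
print([len(p) for p in pieces])  # three pieces of 10,001, seven of 10,000
```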
broseidon
  • Have you tried [iterating over slices of the frame](https://stackoverflow.com/a/1335456/364696)? – ShadowRanger Dec 27 '18 at 01:44
  • If you are working on `pandas.DataFrame` and worried about the resource limits, you can try passing `iterator=True` when you read your csv, and do `yourDF.get_chunk(n)` where n is your desired number of rows. – Chris Dec 27 '18 at 02:21
  • Thanks to all! @ShadowRanger I was able to use the Chunker object classification to iterate over slices successfully. – broseidon Dec 27 '18 at 14:56
  • @Chris chunking on the way in helped immensely, I'm now below 50% of my memory usage running this script. Thank you. – broseidon Dec 27 '18 at 14:56
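
A minimal sketch of @Chris's suggestion above, assuming the data lives in a file called `data.csv` and a chunk size of 10,000 rows (both are placeholders):

```python
import pandas as pd

# read_csv with iterator=True returns a reader we can pull chunks from,
# so the whole file never has to sit in memory at once.
reader = pd.read_csv("data.csv", iterator=True)
first = True

while True:
    try:
        chunk = reader.get_chunk(10_000)
    except StopIteration:
        break
    result = chunk  # placeholder: run the heavy computation on this chunk
    # Append each chunk's output; only write the header the first time.
    result.to_csv("output.csv", mode="a", header=first, index=False)
    first = False
```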

1 Answer


Per @ShadowRanger:

Have you tried [iterating over slices of the frame](https://stackoverflow.com/a/1335456/364696)? - see *Iteration over list slices*.
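
A minimal sketch of that approach, with a throwaway frame and a placeholder for the heavy computation:

```python
import numpy as np
import pandas as pd

# Throwaway frame standing in for the real data.
df = pd.DataFrame({"value": np.arange(111_376)})

def heavy_function(chunk):
    # Placeholder for the computationally heavy work on one slice.
    return chunk.assign(result=chunk["value"] ** 2)

chunk_size = 10_000

for start in range(0, len(df), chunk_size):
    # .iloc slices by position, so the final shorter slice needs no special case.
    piece = df.iloc[start:start + chunk_size]
    heavy_function(piece).to_csv(
        "output.csv", mode="a", header=(start == 0), index=False
    )
```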

broseidon