I have a dataframe with 40 rows, and I want to iterate over it in 4 consecutive iterations of 10 rows each.
So group #0 will be rows 0-9, group #1 will be rows 10-19, and so on.
How can I do this?
import pandas as pd
import numpy as np
df1 = {
'State':['Arizona','Georgia','Newyork','Indiana','Florida'],
'Score1':[4,47,55,74,31]}
df1 = pd.DataFrame(df1,columns=['State','Score1'])
print(df1)
To generate a row number, add an offset (here 430) to the index and store the result in a new column, as shown below.
df1['New_ID'] = df1.index + 430
print(df1)
Here are two solutions from this Stack Overflow question: How to iterate over consecutive chunks of Pandas dataframe efficiently
I recommend checking the link.
Solution from DSM:
# Integer-dividing the row positions by 10 labels each block of 10 rows
for k, g in df.groupby(np.arange(len(df)) // 10):
    print(k, g)
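To make the grouping trick concrete, here is a minimal sketch on a 40-row frame. The frame, its `value` column, and the names `group_size`, `labels`, and `groups` are made up for illustration. It assumes the groups are defined by row position, not by index values.

```python
import numpy as np
import pandas as pd

# Hypothetical 40-row frame standing in for the asker's data.
df = pd.DataFrame({"value": range(40)})

# Integer-divide each row position by the group size to get a group label:
# positions 0-9 -> 0, 10-19 -> 1, and so on.
group_size = 10
labels = np.arange(len(df)) // group_size

# Collect the groups so they can be inspected after the loop.
groups = {k: g for k, g in df.groupby(labels)}
for k, g in groups.items():
    print(f"group#{k}: rows {g.index[0]}-{g.index[-1]}")
```

Because the labels come from `np.arange(len(df))`, this should work the same way even if the frame's index is not a clean 0..n-1 range.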
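Solution from Ryan:
def chunker(seq, size):
    # Lazily yield consecutive slices of seq, each at most size rows long
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))
# Python 3: range replaces xrange and print is a function; 10 rows per chunk, as in the question
for i in chunker(df, 10):
    print(i)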
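A variant of the slicing approach, sketched here with explicit `.iloc` so the slices are always positional. The helper name `iter_chunks` and the 45-row sample frame are assumptions for illustration, not part of the linked answers. The frame is deliberately not a multiple of 10, to show that the last chunk simply comes out shorter.

```python
import pandas as pd


def iter_chunks(frame, size):
    """Yield consecutive positional slices of `frame`, each at most `size` rows."""
    for start in range(0, len(frame), size):
        # .iloc slices by position, so a non-default index does not matter.
        yield frame.iloc[start:start + size]


# Hypothetical 45-row frame: four full chunks and one partial chunk.
df = pd.DataFrame({"value": range(45)})

chunk_lengths = [len(chunk) for chunk in iter_chunks(df, 10)]
print(chunk_lengths)  # [10, 10, 10, 10, 5]
```

With the 40-row frame from the question, the same call would give four chunks of exactly 10 rows.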