
I have a large dataframe of 1,150,000 rows and 6 columns.

How do I split the dataframe into 6 dataframes: 5 with 200,000 rows each and a last one with the remaining 150,000 rows?

Scott Boston
thomas
Possible duplicate of [How do you split a list into evenly sized chunks in Python?](http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python) – MatsLindh Oct 22 '15 at 17:09

1 Answer


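Use a list comprehension to slice the dataframe into a list of 6 dataframes (1,150,000 rows in chunks of 200,000 gives five full chunks plus one of 150,000), which can then be assigned to separate variables.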

n = 200000
list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]

Outputs:

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(index=np.arange(1150000), data=np.random.rand(1150000, 6))

In [4]: n = 200000

In [5]: df1 = [df[i:i+n] for i in range(0,len(df),n)]

In [6]: df1[0].shape
Out[6]: (200000, 6)

In [7]: df1[1].shape
Out[7]: (200000, 6)

In [8]: df1[2].shape
Out[8]: (200000, 6)

In [9]: df1[3].shape
Out[9]: (200000, 6)

In [10]: df1[4].shape
Out[10]: (200000, 6)

In [11]: df1[5].shape
Out[11]: (150000, 6)
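
The same idea can be wrapped in a small helper, which also makes the "assign to separate variables" step explicit. This is only a sketch: the function name `split_dataframe` and the variable names `df_a` through `df_f` are made up for illustration, and it uses `iloc` so the slicing is by row position regardless of the index labels.

```python
import math

import numpy as np
import pandas as pd


def split_dataframe(df, chunk_size):
    """Return a list of consecutive row slices of at most chunk_size rows.

    Hypothetical helper name, not part of pandas.
    """
    # iloc slices by position, so this works whatever the index labels are.
    return [df.iloc[start:start + chunk_size]
            for start in range(0, len(df), chunk_size)]


# Same shape as in the question: 1,150,000 rows and 6 columns.
df = pd.DataFrame(np.zeros((1_150_000, 6)))
chunks = split_dataframe(df, 200_000)

# The number of chunks is the row count divided by the chunk size, rounded up.
expected_chunks = math.ceil(len(df) / 200_000)

# Unpack into separate variables once the number of chunks is known.
df_a, df_b, df_c, df_d, df_e, df_f = chunks
sizes = [len(chunk) for chunk in chunks]
```

If the pieces are going to be modified independently of the original dataframe, calling `.copy()` on each slice is probably the safer choice.
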
Scott Boston