I have a large dataframe of 1,150,000 rows and 6 columns.
How do I split the dataframe into 6 dataframes: five with 200,000 rows each and a last one with the remaining 150,000 rows?
Use a list comprehension to build a list of 6 dataframes, which can then be assigned to separate variables if needed.
n = 200000
list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]
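As a rough sketch of how the one-liner above could be wrapped up and unpacked into separate variables: the helper name `split_dataframe`, the small 23-row stand-in frame, and the `part1` ... `part6` variable names are all made up for illustration. It also uses `.iloc` rather than plain `df[i:i+n]`, only to make the positional slicing explicit.

```python
import numpy as np
import pandas as pd


def split_dataframe(df, chunk_size):
    """Return consecutive row slices of df, each with at most chunk_size rows."""
    # .iloc slices by position, so this does not depend on the index labels
    return [df.iloc[i:i + chunk_size] for i in range(0, len(df), chunk_size)]


# Hypothetical small stand-in: 23 rows in chunks of 4 gives five full chunks
# and one final chunk holding the 3 leftover rows
df_small = pd.DataFrame(np.random.rand(23, 6))
chunks = split_dataframe(df_small, 4)

# Tuple unpacking assigns each chunk to its own variable;
# the number of names must match the number of chunks
part1, part2, part3, part4, part5, part6 = chunks
```

Unpacking only works when the number of chunks is known up front; keeping the list and indexing into it is usually easier when the row count can change.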
Outputs:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: df = pd.DataFrame(index=np.arange(1150000), data=np.random.rand(1150000, 6))
In [4]: n = 200000
In [5]: df1 = [df[i:i+n] for i in range(0,len(df),n)]
In [6]: df1[0].shape
Out[6]: (200000, 6)
In [7]: df1[1].shape
Out[7]: (200000, 6)
In [8]: df1[2].shape
Out[8]: (200000, 6)
In [9]: df1[3].shape
Out[9]: (200000, 6)
In [10]: df1[4].shape
Out[10]: (200000, 6)
In [11]: df1[5].shape
Out[11]: (150000, 6)
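Instead of inspecting each chunk by hand, the shapes can be collected in one pass, and concatenating the chunks back together is a quick way to confirm no rows were lost. This is only a sketch: it assumes a frame scaled down to 1,150 rows with `n = 200` so it runs quickly, and the names `shapes` and `rebuilt` are illustrative.

```python
import numpy as np
import pandas as pd

# Assumption: a 1/1000-scale stand-in for the 1,150,000-row frame
df = pd.DataFrame(index=np.arange(1150), data=np.random.rand(1150, 6))
n = 200

df1 = [df[i:i+n] for i in range(0, len(df), n)]

# Gather the shape of every chunk in a single list
shapes = [chunk.shape for chunk in df1]
print(shapes)

# Concatenating the chunks in order should give back the original frame
rebuilt = pd.concat(df1)
print(rebuilt.equals(df))
```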