0

I have a dataframe df:

df
====================================
|            COLUMN_Y              |
====================================
|            value1                |
|            value2                |
|            value3                |
|            value4                |
|            value5                |
|            value6                |
|            value7                |
|            value8                |
|            value9                |
|            value10               |
|            value11               |
|            value12               |
|            value13               |    
|            value14               |
|            value15               |
|            value16               |
====================================

There is no grouping variable that I can use to split up this dataframe. I want to split it into multiple dataframes of 5 rows each. For example, a 1002-row dataframe would be split into 200 dataframes with 5 rows and 1 dataframe with 2 rows. How might I do this?

df1
====================================
|            COLUMN_Y              |
====================================
|            value1                |
|            value2                |
|            value3                |
|            value4                |
|            value5                |
====================================

df2
====================================
|            COLUMN_Y              |
====================================
|            value6                |
|            value7                |
|            value8                |
|            value9                |
|            value10               |
====================================

df3
====================================
|            COLUMN_Y              |
====================================
|            value11               |
|            value12               |
|            value13               |
|            value14               |
|            value15               |
====================================

df4
====================================
|            COLUMN_Y              |
====================================
|            value16               |
====================================
OwnWork

3 Answers

4

Use floor division on the index to create your groups (this assumes the default integer index 0, 1, 2, ...), then use DataFrame.groupby to split the data into separate dataframes:

grps = df.groupby(df.index // 5)

for _, dfg in grps:
    print(dfg)

  COLUMN_Y
0   value1
1   value2
2   value3
3   value4
4   value5 

  COLUMN_Y
5   value6
6   value7
7   value8
8   value9
9  value10 

   COLUMN_Y
10  value11
11  value12
12  value13
13  value14
14  value15 

   COLUMN_Y
15  value16 
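
If the index is not the default 0, 1, 2, ... range (for example after filtering, or with string labels), `df.index // 5` will not give 5-row groups. A minimal sketch of one way around that, assuming it is acceptable to group by row position instead of by label; the letter index and the names `positions` and `chunks` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Example frame with a non-numeric index (labels 'a'..'p', 16 rows).
df = pd.DataFrame(
    {"COLUMN_Y": [f"value{i}" for i in range(1, 17)]},
    index=list("abcdefghijklmnop"),
)

# Row positions 0..15, floor-divided by 5, give group keys 0, 0, 0, 0, 0, 1, ...
positions = np.arange(len(df)) // 5

# Collect each group into a list instead of printing it.
chunks = [chunk for _, chunk in df.groupby(positions)]

for chunk in chunks:
    print(chunk, end="\n\n")
```

Here `chunks[0]` plays the role of df1 from the question, `chunks[1]` of df2, and so on.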
Erfan
  • I like this answer. May I ask why it uses a double `//` instead of a single `/` for division? Thanks! Oh wait, so it would return an `int`, i.e. a whole number? – Ice Bear Dec 19 '20 at 14:06
  • 2
    Sure, `//` is floor division and `/` is regular division; see [here](https://stackoverflow.com/questions/183853/what-is-the-difference-between-and-when-used-for-division) for more information. [This](https://python-reference.readthedocs.io/en/latest/docs/operators/floor_division.html) link is more concise and straight to the point – Erfan Dec 19 '20 at 14:09
  • Cool! Thanks a lot! – Ice Bear Dec 19 '20 at 14:14
2

The code below splits the dataframe and then saves each piece to its own CSV file:

split_size = 5
# Slice by row position, so this works whatever the index labels are
dfs = [df.iloc[i:i + split_size] for i in range(0, len(df), split_size)]
for n, frame in enumerate(dfs):
    # Writes df0.csv, df1.csv, ...
    frame.to_csv(f'df{n}.csv', index=False)
gtomer
0

Try a list comprehension:

listofdataframes = [df.iloc[i:i + 5] for i in range(0, len(df), 5)]
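
As a rough check against the 1002-row case from the question, the same slicing can be wrapped in a small helper. This is only a sketch: `split_every` is a hypothetical name, and the frame is filled with placeholder values.

```python
import pandas as pd


def split_every(df, size):
    """Return a list of consecutive slices of df, each with at most `size` rows."""
    return [df.iloc[i:i + size] for i in range(0, len(df), size)]


# Placeholder data: 1002 rows, as in the example from the question.
df = pd.DataFrame({"COLUMN_Y": [f"value{i}" for i in range(1, 1003)]})

frames = split_every(df, 5)
print(len(frames))      # 201 dataframes in total
print(len(frames[0]))   # 5 rows in a full chunk
print(len(frames[-1]))  # 2 rows left over in the last one
```

Because `iloc` slices by position and stops at the end of the frame, the last slice simply comes out shorter without any special handling.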
U13-Forward