How to insert a header multiple times in a pandas dataframe

Question

I asked this question yesterday but was unclear about a couple things so I'm reposting it here. Basically I have a dataframe of 13 columns and over 500 rows and I'm trying to add a header every x number of rows.

I'm a beginner so I've tried .concat and .append but I'm not sure if I really am doing it right

I have the variable header = ['Rk', 'Player', 'Age',...]

In: print(final.head())

out:
   index            Player Age   Tm Pos  GP   G   A    P +/- PPP    TOI
0      0   Nikita Kucherov  25  TBL  RW  82  41  87  128  24  41  19:58
1      4     Brad Marchand  30  BOS  LW  79  36  64  100  15  33  19:37
2      5     Sidney Crosby  31  PIT   C  79  35  65  100  18  20  21:00
3      6  Nathan MacKinnon  23  COL   C  82  41  58   99  20  31  22:05
4      7   Johnny Gaudreau  25  CGY  LW  82  36  63   99  18  29  20:04

I want to print the header every 48 rows, if I wanted to print it every 2 rows it would look like this:

In: print(final.head())

out:
   index            Player Age   Tm Pos  GP   G   A    P +/- PPP    TOI
0      0   Nikita Kucherov  25  TBL  RW  82  41  87  128  24  41  19:58
1      4     Brad Marchand  30  BOS  LW  79  36  64  100  15  33  19:37
                    Player Age   Tm  Pos GP   G   A    P  +/- PPP   TOI
2      5     Sidney Crosby  31  PIT   C  79  35  65  100  18  20  21:00
3      6  Nathan MacKinnon  23  COL   C  82  41  58   99  20  31  22:05
                    Player Age   Tm  Pos GP   G   A    P  +/- PPP   TOI
4      7   Johnny Gaudreau  25  CGY  LW  82  36  63   99  18  29  20:04

Note I don't really care about the what the value is for the index column for the header row when I insert multiple times, I'm pretty lenient for that part.

Hmmm, what is reason for it? Because only possible add header like rows, so get numeric values mixed with strings. So next processing of numeric data is not possible. — jezrael, May 18 '19 at 15:57
Is this just for display purposes? As jezrael mentioned, mixing data types will not make it possible to do certain operations down the line. — busybear, May 18 '19 at 16:13

jezrael · Answer 1 · 2019-05-24T07:44:14.493

It is possible, but if need working later with data, not recommended, because if mixing numeric values mixed with strings some function should failed:

N = 2
#N = 48 with real data
#get index of added values, omit first value
idx = df.index[::N][1:]
#repeat columns to DataFrame
arr = np.broadcast_to(df.columns, (len(idx),len(df.columns)))
df1 = pd.DataFrame(arr, index=idx, columns=df.columns)

#append original and sorting by index
df = df1.append(df).sort_index().reset_index(drop=True)
print (df)
   index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
0      0   Nikita Kucherov   25  TBL   RW  82  41  87  128   24   41  19:58
1      4     Brad Marchand   30  BOS   LW  79  36  64  100   15   33  19:37
2  index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
3      5     Sidney Crosby   31  PIT    C  79  35  65  100   18   20  21:00
4      6  Nathan MacKinnon   23  COL    C  82  41  58   99   20   31  22:05
5  index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
6      7   Johnny Gaudreau   25  CGY   LW  82  36  63   99   18   29  20:04

EDIT For write each splitted DataFrame to separate sheet in one excel file use:

N = 2
#N = 48 with real data
with pd.ExcelWriter('file.xlsx') as writer:
    for i, df1 in enumerate(np.split(df, range(N, len(df), N))):
        df1.to_excel(writer, sheet_name=f'Sheet{i}', index=False)

EDIT1: For write all DataFrame to same sheetname:

#https://stackoverflow.com/a/33004253  + added index=False to df.to_excel
def multiple_dfs(df_list, sheets, file_name, spaces):
    writer = pd.ExcelWriter(file_name,engine='xlsxwriter')   
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer,sheet_name=sheets,startrow=row ,startcol=0, index=False)   
        row = row + len(dataframe.index) + spaces + 1
    writer.save()

N = 2
#N = 48 with real data
dfs = np.split(df, range(N, len(df), N))
multiple_dfs(dfs, 'Steetname1', 'file.xlsx', 1)

That was pretty close the only thing is my n value for this is going to be 48 (Or at least I think that's what it should be). When I am exporting this dataframe to excel, excel can fit 48 rows per page when printed. I want it so that this row will be the first row on every page. — JackB, May 18 '19 at 18:03
@JackB So you need write each header to separate sheet? And for each excel sheet are 48 rows with header? — jezrael, May 19 '19 at 06:53
Yes pretty much, I want this header to be at the top of each page when I print the dataframe in Excel — JackB, May 20 '19 at 15:03
I think the problem is that when I do that, every time I add that header, I actually need the next one to be one less row away. So the first one needs to be 48 rows away, then the next only 47 then 46 etc... Also I don't want to have the blank new line before the actual header. Thanks again! — JackB, May 28 '19 at 17:28
@JackB - Sorry, but if need change number of rows per each group, the best is post new question. If need no empty row betwen only change `1` to `0` by `multiple_dfs(dfs, 'Steetname1', 'file.xlsx', 0)` — jezrael, May 30 '19 at 11:51

How to insert a header multiple times in a pandas dataframe

1 Answers1

Linked