Replacing Header with Top Row

Question

I currently have a dataframe that looks like this:

       Unnamed: 1    Unnamed: 2   Unnamed: 3  Unnamed: 4
0   Sample Number  Group Number  Sample Name  Group Name
1             1.0           1.0          s_1         g_1
2             2.0           1.0          s_2         g_1
3             3.0           1.0          s_3         g_1
4             4.0           2.0          s_4         g_2

I'm looking for a way to delete the header row and make the first row the new header row, so the new dataframe would look like this:

    Sample Number  Group Number  Sample Name  Group Name
0             1.0           1.0          s_1         g_1
1             2.0           1.0          s_2         g_1
2             3.0           1.0          s_3         g_1
3             4.0           2.0          s_4         g_2

I've tried stuff along the lines of if 'Unnamed' in df.columns: then make the dataframe without the header

df.to_csv(newformat, header=False, index=False)

but I don't seem to be getting anywhere.

score 333 · Accepted Answer · edited Aug 25 '17 at 18:17

333

new_header = df.iloc[0] #grab the first row for the header
df = df[1:] #take the data less the header row
df.columns = new_header #set the header row as the df header

edited Aug 25 '17 at 18:17

Simon

19,658
27
149
217

answered Jul 09 '15 at 21:58

rgalbo

4,186
1
19
29

3

When I do this, the 0 index is also becoming a part of the header. Is there a way to remove the 0 index from my header row? – Pete Mar 25 '22 at 12:14
@Pete What is the output that you get from `df.columns`? – rgalbo Mar 31 '22 at 17:21
1

@Pete, change `new_header = df.iloc[0]` to `new_header = df.iloc[0].tolist()`. This will remove the 0 index as part of the header. – jb12n Mar 01 '23 at 17:25
@jb12n, I just want to say you were spot on. And I will add you can go directly to setting the columns by using df.columns = df.iloc[0].tolist(). As I understand it iloc generated a Series the name of which is the index location. Check by printing the type of new_header. When you make a list you lose the name of the Series. What I find interesting is that the documentation seems to ignore all this (at least I cannot find any). So glad I happened on this thread today. – wiseass Jun 28 '23 at 20:25

score 100 · Answer 2 · edited Oct 03 '18 at 18:58

100

The dataframe can be changed by just doing

df.columns = df.iloc[0]
df = df[1:]

Then

df.to_csv(path, index=False)

Should do the trick.

edited Oct 03 '18 at 18:58

Asclepius

57,944
17
167
143

answered Jul 09 '15 at 21:53

JoeCondron

8,546
3
27
28

9

This is a bettter answer, because there is not redundant code (new_header) in this. – Ad Infinitum Oct 30 '17 at 12:01

score 71 · Answer 3 · answered Nov 23 '17 at 22:00

71

If you want a one-liner, you can do:

df.rename(columns=df.iloc[0]).drop(df.index[0])

answered Nov 23 '17 at 22:00

ostrokach

17,993
11
78
90

3

Make it `df.rename(columns=df.iloc[0]).drop(df.index[0]).reset_index(drop=True)` if you don't want your index to be missing something. – z33k Nov 02 '21 at 12:13
did work for me after making it a two-liner ' df.rename(columns=df.iloc[0, :], inplace=True) df.drop(df.index[0], inplace=True) – Marc Steffen Nov 16 '21 at 22:32

score 19 · Answer 4 · answered Oct 21 '20 at 16:50

19

Another one-liner using Python swapping:

df, df.columns = df[1:] , df.iloc[0]

This won't reset the index

Although, the opposite won't work as expected df.columns, df = df.iloc[0], df[1:]

answered Oct 21 '20 at 16:50

iDataEngX

319
2
5

how does the assignment work? Is df first assigned or df.columns? – Aman Bagrecha Apr 03 '22 at 14:44
1

This answer explains it great https://stackoverflow.com/a/34171485/11808854 – iDataEngX Apr 06 '22 at 13:21

GoPackGo · Answer 5 · 2019-10-18T17:49:44.980

10

@ostrokach answer is best. Most likely you would want to keep that throughout any references to the dataframe, thus would benefit from inplace = True.
df.rename(columns=df.iloc[0], inplace = True) df.drop([0], inplace = True)

edited Oct 18 '19 at 17:49

answered Oct 16 '19 at 17:56

GoPackGo

341
5
9

score 7 · Answer 6 · answered Mar 30 '20 at 03:52

7

Here's a simple trick that defines column indices "in place". Because set_index sets row indices in place, we can do the same thing for columns by transposing the data frame, setting the index, and transposing it back:

df = df.T.set_index(0).T

Note you may have to change the 0 in set_index(0) if your rows have a different index already.

answered Mar 30 '20 at 03:52

Alex P. Miller

2,128
1
23
20

best solution. elegant. – Ayan Mitra Apr 10 '23 at 10:26

score 5 · Answer 7 · answered Sep 03 '21 at 03:02

5

Alternatively, we can do this when reading a file with pandas.

This case we can use,

pd.read_csv('file_path',skiprows=1)

When reading the file this will skip the first row and will set the column as the second row of the file.

answered Sep 03 '21 at 03:02

Ransaka Ravihara

1,786
1
13
30

This does not solve the problem. The values in the second row are not supposed to be the header values. In fact, this is basically the reverse of what the solution should be. If skiprows=-1 would cause the first row to be used as the header, that would be the solution. The accepted solution accomplishes the goal. – Anthony Sep 29 '21 at 14:01

score 2 · Answer 8 · answered Sep 12 '22 at 08:49

2

For some reason, I had to do it this way:

df.columns = [*df.iloc[0]]
df = table[1:]

The part where I'm splitting the list into a list looks redundant, but otherwise, the headers still turn up as part of the actual table.

answered Sep 12 '22 at 08:49

Moritz Gruenwald

21
2

You just missed that table does not exists, it would be 'df' too – Eduardo Rauchbach Nov 06 '22 at 17:39

rra · Answer 9 · 2019-11-13T15:10:56.490

--another way to do this


df.columns = df.iloc[0]
df = df.reindex(df.index.drop(0)).reset_index(drop=True)
df.columns.name = None

    Sample Number  Group Number  Sample Name  Group Name
0             1.0           1.0          s_1         g_1
1             2.0           1.0          s_2         g_1
2             3.0           1.0          s_3         g_1
3             4.0           2.0          s_4         g_2

If you like it hit up arrow. Thanks

score 0 · Answer 10 · answered May 23 '20 at 02:17

0

header = table_df.iloc[0]
table_df.drop([0], axis =0, inplace=True)
table_df.reset_index(drop=True)
table_df.columns = header
table_df

answered May 23 '20 at 02:17

Fazley Rafy

1

Please add an explanation – vsync May 24 '20 at 19:01

Matt_Haythornthwaite · Answer 11 · 2022-11-08T12:40:43.727

This seems like a task that may be needed more than once. I've taken rgalbo's answer and written a simple function that can be lifted and placed into any project.

def promote_df_headers(df):
    '''
    Takes a df and uses the first row as the header

    Parameters
    ----------
    df : DataFrame
        Any df with one or more columns.

    Returns
    -------
    df : DataFrame
        Input df with the first row removed and used as the column names.

    '''

    new_header = df.iloc[0] 
    df = df[1:] 
    df.columns = new_header
    df = df.reset_index(drop=True)

    return df

score 0 · Answer 12 · answered Jul 27 '23 at 20:28

0

If you are starting with a list of lists

pd.DataFrame(input[1:], columns=input[0])

answered Jul 27 '23 at 20:28

rightsized

130
1
8

yanger rai · Answer 13 · 2020-07-31T15:49:35.580

-3

The best practice and Best OneLiner:

df.to_csv(newformat,header=1)

Notice the header value:

Header refer to the Row number(s) to use as the column names. Make no mistake, the row number is not the df but from the excel file(0 is the first row, 1 is the second and so on).

This way, you will get the column name you want and won't have to write additional codes or create new df.

Good thing is, it drops the replaced row.

edited Jul 31 '20 at 15:49

answered Jul 20 '20 at 09:45

yanger rai

102
9

1

This just outputs CSV though, it doesn't change the dataframe, right? – AMC Nov 06 '20 at 03:49
@AMC bit late in replying. To answer you question, Yes. It has no impact on dataframe. – yanger rai Feb 22 '21 at 16:29
So then this answer does not apply to the question and the one-liner doesn't solve the problem. The point isn't to generate a CSV; it's to replace the dataframe's headers with the values in the first row. – Anthony Sep 29 '21 at 13:54
It replace the dataframe headers on pandas. The code itself is "df.", df is your dataframe reading the data from your csv file – yanger rai Jan 27 '22 at 13:48

Replacing Header with Top Row

13 Answers13

Linked

Related