Remove top row from a dataframe

Question

I have a dataframe that looks like this:

         level_0              level_1 Repo Averages for 27 Jul 2018
0  Business Date           Instrument                           Ccy
1     27/07/2018  GC_AUSTRIA_SUB_10YR                           EUR
2     27/07/2018    R_RAGB_1.15_10/18                           EUR
3     27/07/2018    R_RAGB_4.35_03/19                           EUR
4     27/07/2018    R_RAGB_1.95_06/19                           EUR

I am trying to get rid of the top row and only keep

   Business Date           Instrument         Ccy
0     27/07/2018  GC_AUSTRIA_SUB_10YR         EUR
1     27/07/2018    R_RAGB_1.15_10/18         EUR
2     27/07/2018    R_RAGB_4.35_03/19         EUR
3     27/07/2018    R_RAGB_1.95_06/19         EUR

I tried df.columns.droplevel(0), but it was not successful. Any help is more than welcome.

Where are you getting the data from? It looks like an issue in reading the data. — asongtoruin, Jul 31 '18 at 10:39
You are likely to get answers quicker if you have runnable code in your question. — Dov Grobgeld, Jul 31 '18 at 10:40
It is an automated file that has a weird structure. The top row is like a title, so I have to read in everything and then delete the undesirable rows. — SBad, Jul 31 '18 at 10:42

score 7 · Answer 1 · Joe · 2020-08-22T14:48:34.617

You can try this:

df.columns = df.iloc[0]
df = df.reindex(df.index.drop(0)).reset_index(drop=True)
df.columns.name = None

Output:

  Business Date           Instrument  Ccy
0    27/07/2018  GC_AUSTRIA_SUB_10YR  EUR
1    27/07/2018    R_RAGB_1.15_10/18  EUR
2    27/07/2018    R_RAGB_4.35_03/19  EUR
3    27/07/2018    R_RAGB_1.95_06/19  EUR
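
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. The frame is rebuilt by hand from the question's printout (so the data is an assumption), and promote_first_row is a hypothetical helper name, not a pandas function. It takes the labels from row 0 as a plain list, which avoids having to reset df.columns.name afterwards.

```python
import pandas as pd

# Rebuild a frame shaped like the one in the question: the real header
# sits in row 0 and the column labels are leftovers from the title line.
df = pd.DataFrame(
    [
        ["Business Date", "Instrument", "Ccy"],
        ["27/07/2018", "GC_AUSTRIA_SUB_10YR", "EUR"],
        ["27/07/2018", "R_RAGB_1.15_10/18", "EUR"],
    ],
    columns=["level_0", "level_1", "Repo Averages for 27 Jul 2018"],
)


def promote_first_row(frame):
    """Return a new frame whose column labels come from the first row."""
    out = frame.iloc[1:].reset_index(drop=True)  # drop row 0, renumber from 0
    out.columns = frame.iloc[0].tolist()         # plain list, so no columns.name is set
    return out


clean = promote_first_row(df)
print(clean)
```

The original df is left untouched, so the step can be re-run safely while experimenting.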

edited Aug 22 '20 at 14:48

answered Jul 31 '18 at 10:50

Joe


score 6 · Answer 2 · answered Jul 22 '20 at 09:35

You can take advantage of the header parameter of read_csv (see the pandas documentation for more about the header parameter).

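Let's say that you have the following dataset, read without treating any line as a header:

# header=None: no line of the file is used for the column names
df = pd.read_csv("Prices.csv", header=None)
print(df)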

That outputs

              0       1     2         3         4
0      DATA      SESSAO  HORA  PRECO_PT  PRECO_ES
1      1/1/2020  0       1     41,88     41,88   
2      1/1/2020  0       2     38,60     38,60   
3      1/1/2020  0       3     36,55     36,55

By simply passing header=0, like this:

df = pd.read_csv("Prices.csv", header=0)
print(df)

You will get what you want

           DATA  SESSAO  HORA PRECO_PT PRECO_ES
0      1/1/2009  0       1     55,01    55,01  
1      1/1/2009  0       2     56,13    56,13  
2      1/1/2009  0       3     50,59    50,59  
3      1/1/2009  0       4     45,83    45,83  
4      1/1/2009  0       5     42,07    41,90
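
For a file like the one in the question, where a title line sits above the real header, the same parameter can point at the second line instead. This is only a sketch: the CSV text below is made up to mimic that layout, and io.StringIO stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical file contents: a title line, then the real header, then data.
raw = (
    "Repo Averages for 27 Jul 2018,,\n"
    "Business Date,Instrument,Ccy\n"
    "27/07/2018,GC_AUSTRIA_SUB_10YR,EUR\n"
    "27/07/2018,R_RAGB_1.15_10/18,EUR\n"
)

# header=None: no line is treated as a header, columns are numbered 0, 1, 2.
no_header = pd.read_csv(io.StringIO(raw), header=None)

# header=1: the second line (0-based index 1) supplies the column names,
# and the title line above it is discarded.
with_header = pd.read_csv(io.StringIO(raw), header=1)

print(no_header)
print(with_header)
```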

This gives a working solution with a clear explanation AND links to relevant documentation. Thanks! — Steve Whitmore, Mar 13 '23 at 12:00

score 4 · Accepted Answer · answered Aug 07 '20 at 21:29


You can try using slicing.

df = df[1:]

This will remove the first row of your dataframe.
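
A quick check on a frame shaped like the one in the question (rebuilt here by hand, so the data is an assumption) suggests that slicing on its own only does half the job: the row goes away, but the column labels and the index are left as they were.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Business Date", "Instrument", "Ccy"],
        ["27/07/2018", "GC_AUSTRIA_SUB_10YR", "EUR"],
    ],
    columns=["level_0", "level_1", "Repo Averages for 27 Jul 2018"],
)

sliced = df[1:]  # positional slice: drops the first row only

# The header row is gone, but the column labels are unchanged
# and the index now starts at 1 instead of 0.
print(list(sliced.columns))
print(list(sliced.index))
```

So for the question's data you would still need to assign the column names and reset the index afterwards.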

answered Aug 07 '20 at 21:29

Zachary Wyman


Even if the answer is accepted, have you tested it on the given example? – Joe Aug 22 '20 at 14:48

Agree with @Joe, this example is not working. – Arun Oct 04 '21 at 23:05

score 1 · Answer 4 · edited Jun 27 '20 at 02:20


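df = df.drop(df.index[row_start:row_end])

This drops the rows at positions row_start up to (but not including) row_end. Note that drop returns a new dataframe, so the result has to be assigned back.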
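
Here is a small runnable sketch of dropping a range of rows by position. The toy frame and the values of row_start and row_end are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30, 40], "b": ["w", "x", "y", "z"]})

row_start, row_end = 0, 2  # hypothetical positions: drop rows 0 and 1

# df.index[row_start:row_end] selects the labels at those positions;
# drop() returns a new frame and leaves df untouched.
trimmed = df.drop(df.index[row_start:row_end]).reset_index(drop=True)

print(trimmed)
```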

edited Jun 27 '20 at 02:20

vlizana


answered Jun 26 '20 at 14:29

Emeka Boris Ama


Don't use code snippets if the code is not executable; use code formatting instead. – vlizana Jun 26 '20 at 15:25

score 0 · Answer 5 · answered Sep 17 '22 at 09:49


I tested the suggestion in jeremycg's comment. It works very well and is succinct. I just want more people to see it, so here it is again:

my_df = pd.read_csv(r"C:\path\to\my\file.csv", skiprows = 1)
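
To try this without a file on disk, the sketch below feeds read_csv some made-up text laid out like the question's file (title line first, real header second). The contents are an assumption, and io.StringIO stands in for the path.

```python
import io

import pandas as pd

# Hypothetical file contents mimicking the question's layout.
raw = (
    "Repo Averages for 27 Jul 2018\n"
    "Business Date,Instrument,Ccy\n"
    "27/07/2018,GC_AUSTRIA_SUB_10YR,EUR\n"
)

# skiprows=1 skips the first line of the file, so the next line
# ("Business Date,Instrument,Ccy") is read as the header.
my_df = pd.read_csv(io.StringIO(raw), skiprows=1)

print(my_df)
```

Because the title line is skipped before parsing, it does not matter that it has fewer fields than the rest of the file.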

answered Sep 17 '22 at 09:49

Egret

