How to drop first row using pandas?

Question

I've searched other questions related to dropping rows but could not find one that worked.

I have a CSV file exported from the tool Screaming Frog that looks like this:

Internal - HTML |               |             |
--------------- | --------------|-------------|
   Address      |   Content     | Status Code |
----------------|---------------|-------------|
www.example.com |   text/html   |   200       |

I want to remove the first row, which contains 'Internal - HTML'. When I inspect it with df.keys(), I get this: Index(['Internal - HTML'], dtype='object').

I want to use the second row as the header, since it contains the correct column labels.

When I use the code:

a = pandas.read_csv("internal_html.csv", encoding="utf-8")
a.drop('Internal - HTML')
a.head(3)

I get this error: KeyError: 'Internal - HTML'

I also tried what was suggested in "Remove index name in pandas", and I tried resetting the index:

a = pandas.read_csv("internal_html.csv", encoding="utf-8")
a.reset_index(level=0, drop=True)
a.head(3)

None of the options above worked.

score 9 · Accepted Answer · answered Jul 08 '17 at 14:25

You can pass header as a parameter in the read_csv call, so that the second line (index 1) supplies the column names and the data starts on the line after it:

a = pandas.read_csv("internal_html.csv", encoding="utf-8", header=1)
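To see the effect without the real export, here is a minimal sketch that reads a small in-memory CSV. The `csv_text` sample is an assumption based on the table in the question, not the actual Screaming Frog file.

```python
import io

import pandas as pd

# Hypothetical stand-in for internal_html.csv: a title line, then the real header.
csv_text = (
    "Internal - HTML\n"
    "Address,Content,Status Code\n"
    "www.example.com,text/html,200\n"
)

# header=1 takes the column names from the second line (0-based index 1);
# the line before it is skipped.
a = pd.read_csv(io.StringIO(csv_text), header=1)

print(list(a.columns))
print(a.head(3))
```

With this sample, the title line no longer appears and the columns are Address, Content, and Status Code.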

PRMoureu

score 3 · Answer 2 · answered Jul 08 '17 at 15:45

I'm not exactly sure how the data is laid out in the CSV, but I think you can use skiprows=1 when reading it. First, reading the file without it:

import pandas as pd

a = pd.read_csv("internal_html.csv", encoding="utf-8")
a.keys()

Output:

Index(['Internal - HTML'], dtype='object')

Looking at the DataFrame (assuming the data is in the following format):

print(a)

Output:

                            Internal - HTML
Address            Content   Status Code   
www.example.com   text/html     200

Now, using skiprows to read the .csv file:

a = pd.read_csv("internal_html.csv", encoding="utf-8", skiprows=1)
print(a.keys())

Output:

Index(['Address', '   Content', 'Status Code'], dtype='object')

Observing dataframe a:

print(a)

Output:

           Address      Content       Status Code
  0  www.example.com    text/html     200
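The same idea can be tried on an in-memory sample. This is only a sketch: `csv_text` is a made-up stand-in for the export, with a padded "   Content" header to mirror the output above, and the whitespace cleanup is an optional extra step rather than part of the original suggestion.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the export; note the padded "   Content" header.
csv_text = (
    "Internal - HTML\n"
    "Address,   Content,Status Code\n"
    "www.example.com,text/html,200\n"
)

# skiprows=1 discards the first line, so the next line becomes the header.
a = pd.read_csv(io.StringIO(csv_text), skiprows=1)

# Optional: strip stray whitespace from the column labels.
a.columns = a.columns.str.strip()

print(list(a.columns))
print(a)
```

For a file shaped like this sample, header=1 and skiprows=1 should produce the same DataFrame, since both ignore the title line and read the column names from the line after it.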
