
I've searched other questions related to dropping rows but could not find an answer that worked.

I have a CSV file exported from the tool Screaming Frog that looks like this:

Internal - HTML |               |             |
--------------- | --------------|-------------|
   Address      |   Content     | Status Code |
----------------|---------------|-------------|
www.example.com |   text/html   |   200       |

I want to remove the first row, which contains 'Internal - HTML'. When I inspect the DataFrame with df.keys(), I get: Index(['Internal - HTML'], dtype='object').

I want to use the second row, which contains the correct column labels, as the header.

When I use the code:

a = pandas.read_csv("internal_html.csv", encoding="utf-8")
a.drop('Internal - HTML')
a.head(3)

I get this error: KeyError: 'Internal - HTML'

I also tried what was suggested in "Remove index name in pandas", and I tried resetting the index:

a = pandas.read_csv("internal_html.csv", encoding="utf-8")
a.reset_index(level=0, drop=True)
a.head(3)

None of the options above worked.

martineau
Robert Padgett

2 Answers

9

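You can pass the header parameter to read_csv to tell pandas which row holds the column names. Rows are counted from 0, so header=1 uses the second row as the header and starts the data on the row after it: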

a = pandas.read_csv("internal_html.csv", encoding="utf-8", header=1)
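As a minimal sketch of what header=1 does, the snippet below reads an in-memory CSV instead of the real file. The CSV text is a made-up stand-in for the Screaming Frog export (a one-cell title line followed by the real header line), so the exact layout is an assumption:

```python
import io

import pandas as pd

# Hypothetical stand-in for internal_html.csv:
# a title line, then the real header line, then one data row.
csv_text = (
    "Internal - HTML\n"
    "Address,Content,Status Code\n"
    "www.example.com,text/html,200\n"
)

# header=1 takes the line at 0-based position 1 as the column names;
# the title line before it is discarded.
df = pd.read_csv(io.StringIO(csv_text), header=1)

print(list(df.columns))
print(df)
```

If the real export matches this layout, the same call with the file path in place of the StringIO object should give the same column labels.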
PRMoureu
3

I'm not sure exactly how the data is laid out in the CSV, but I think you can use skiprows=1 when reading it. Without skiprows, the file reads like this:

import pandas as pd

a = pd.read_csv("internal_html.csv", encoding="utf-8")
a.keys()

Output:

Index(['Internal - HTML'], dtype='object')

Looking at the DataFrame (assuming the data is in the following format):

print(a)

Output:

                            Internal - HTML
Address            Content   Status Code   
www.example.com   text/html     200        

Now, using skiprows to read the .csv file:

a = pd.read_csv("internal_html.csv", encoding="utf-8", skiprows=1)
print(a.keys())

Output:

Index(['Address', '   Content', 'Status Code'], dtype='object')

Inspecting the DataFrame a:

print(a)

Output:

           Address      Content       Status Code
  0  www.example.com    text/html     200        
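Note that the '   Content' label in the keys output above still carries leading spaces. As a rough sketch, assuming the padding comes from spaces after the delimiter in the file, skipinitialspace=True together with stripping the column labels should clean this up. The CSV text here is invented for illustration and is not the actual export:

```python
import io

import pandas as pd

# Hypothetical file contents with padded fields (an assumption about the export).
csv_text = (
    "Internal - HTML\n"
    "Address,   Content,Status Code\n"
    "www.example.com,   text/html,200\n"
)

# skiprows=1 drops the title line; skipinitialspace=True ignores
# spaces that directly follow a delimiter.
a = pd.read_csv(io.StringIO(csv_text), skiprows=1, skipinitialspace=True)

# Strip any whitespace that remains around the column labels.
a.columns = a.columns.str.strip()

print(a.keys())
```

With clean labels, a column can be selected as a["Content"] without having to reproduce the padding.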
niraj