pandas read_csv() skiprows=[0] giving issues?

Question

I'm trying to read a csv in pandas. My file starts like:

 Site,Tank ID,Product,Volume,Temperature,Dip Time
   aaa,bbb,....
   .....

I read it with:

import pandas as pd

df = pd.DataFrame()
date_col = ['Dip Time']
data = pd.read_csv(atg_path, delimiter=',', skiprows=[1], skipinitialspace=True,
                   dayfirst=True,
                   parse_dates=date_col)

This skips the first row of data, but I need that row.

If I use skiprows=[0], then I get errors on some columns, e.g. ValueError: 'Dip Time' is not in list

I don't know why. It shouldn't skip any of the data. What is wrong?

Do you want to skip reading the **header**, or the **first row of data** (*"aaa,bbb,..."*)? What are you actually trying to achieve with `skiprows=[0]`? Your question is unclear. — smci, Oct 04 '19 at 05:25
`skiprows = 0` (integer) means *"don't skip any rows"*, so it has no effect. Whereas `skiprows = [0]` (list with one element, 0) means *"skip the 0'th row, i.e. the header row"*, so it skips the header (with column names) and reads in the data. — smci, Oct 04 '19 at 05:28
The [`pandas.read_csv()` doc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html?highlight=skiprows) explains what `skiprows` does, both as an integer and as a list — smci, Oct 04 '19 at 07:02

jezrael · Accepted Answer · 2019-10-04T05:56:42.107

I think the skiprows parameter is not necessary here; you can omit it.

If you pass the integer 0, it means don't skip any rows:

skiprows=0

import pandas as pd
from io import StringIO

temp="""Site,Tank ID,Product,Volume,Temperature,Dip Time
aaa,bbb,ccc,ddd,eee,fff
a,b,c,d,e,f
"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp))
print(df)
  Site Tank ID Product Volume Temperature Dip Time
0  aaa     bbb     ccc    ddd         eee      fff
1    a       b       c      d           e        f

temp="""Site,Tank ID,Product,Volume,Temperature,Dip Time
aaa,bbb,ccc,ddd,eee,fff
a,b,c,d,e,f
"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), skiprows=0)
print(df)
  Site Tank ID Product Volume Temperature Dip Time
0  aaa     bbb     ccc    ddd         eee      fff
1    a       b       c      d           e        f

But if you pass the list [0], it means "skip the 0'th row", i.e. the first line of the file, which here is the header row. The first data row is then used as the column names:

temp="""Site,Tank ID,Product,Volume,Temperature,Dip Time
aaa,bbb,ccc,ddd,eee,fff
a,b,c,d,e,f
"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), skiprows=[0])
print(df)
  aaa bbb ccc ddd eee fff
0   a   b   c   d   e   f
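
To connect this back to the question, here is a minimal sketch of the three cases side by side. The sample rows and the variable names (`df_all`, `df_skip1`, `df_skip0`, `error`) are made up for illustration and are not from the original file. It suggests that `skiprows=[1]` drops the first data row, while `skiprows=[0]` drops the header, so a later `parse_dates=['Dip Time']` cannot find that column and raises a `ValueError` (the message wording may differ between pandas versions).

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data shaped like the file in the question.
temp = """Site,Tank ID,Product,Volume,Temperature,Dip Time
S1,T1,Diesel,1000,15.2,04/10/2019 05:25
S1,T2,Petrol,2000,14.8,05/10/2019 06:30
"""

# No skiprows: the header is kept and every data row is read.
df_all = pd.read_csv(StringIO(temp), skipinitialspace=True,
                     dayfirst=True, parse_dates=['Dip Time'])
print(df_all.shape)

# skiprows=[1]: the header stays, but the first data row (file line 1) is dropped.
df_skip1 = pd.read_csv(StringIO(temp), skiprows=[1])
print(df_skip1['Tank ID'].tolist())

# skiprows=[0]: the header line is dropped, so the first data row
# is used as the column names and 'Dip Time' no longer exists.
df_skip0 = pd.read_csv(StringIO(temp), skiprows=[0])
print(list(df_skip0.columns))

# Asking for parse_dates on the now-missing column fails.
error = None
try:
    pd.read_csv(StringIO(temp), skiprows=[0], parse_dates=['Dip Time'])
except ValueError as exc:
    error = exc
print(type(error).__name__, error)
```

So for a file whose first line is the header and whose data starts on the second line, leaving `skiprows` out altogether is the simplest choice.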

Thanks, I removed that parameter and it now works fine. Can you give a bit more explanation of why `skiprows=[0]` gave me an issue and what that parameter is used for? — Ratha, Oct 04 '19 at 05:15
This isn't an answer. `skiprows = 0` (integer) means *"don't skip any rows"*, so it has no effect. Whereas `skiprows = [0]` (list with one element, 0) means *"skip the 0'th row, i.e. the header row"*, so it skips the header (with column names) and reads in the data. — smci, Oct 04 '19 at 05:27
@smci - Sorry, you are right. It was not clearly written, so the answer was edited. Thank you. — jezrael, Oct 04 '19 at 05:57
