i'll will give my answer in two parts:
part1:
the op asked how to output these bad lines, to answer this we can use python csv module in a simple code like that:
import csv
file = 'your_filename.csv' # use your filename
lines_set = set([100, 200]) # use your bad lines numbers here
with open(file) as f_obj:
for line_number, row in enumerate(csv.reader(f_obj)):
if line_number > max(lines_set):
break
elif line_number in lines_set: # put your bad lines numbers here
print(line_number, row)
also we can put it in more general function like that:
import csv
def read_my_lines(file, lines_list, reader=csv.reader):
lines_set = set(lines_list)
with open(file) as f_obj:
for line_number, row in enumerate(csv.reader(f_obj)):
if line_number > max(lines_set):
break
elif line_number in lines_set:
print(line_number, row)
if __name__ == '__main__':
read_my_lines(file='your_filename.csv', lines_list=[100, 200])
part2: the cause of the error you get:
it's hard to diagnose problem like this without a sample of the file you use.
but you should try this ..
pd.read_csv(filename)
is it parse the file with no error ? if so, i will explain why.
the number of columns is inferred from the first line.
by using skiprows and header=0
you escaped the first 3 rows, i guess that contains the columns names or the header that should contains the correct number of columns.
basically you constraining what the parser is doing.
so parse without skiprows, or header=0
then reindexing to what you need later.
note:
if you unsure about what delimiter used in the file use sep=None
, but it would be slower.
from pandas.read_csv docs:
sep : str, default ‘,’ Delimiter to use. If sep is None, the C engine
cannot automatically detect the separator, but the Python parsing
engine can, meaning the latter will be used and automatically detect
the separator by Python’s builtin sniffer tool, csv.Sniffer. In
addition, separators longer than 1 character and different from '\s+'
will be interpreted as regular expressions and will also force the use
of the Python parsing engine. Note that regex delimiters are prone to
ignoring quoted data. Regex example: '\r\t'
link