I've seen similar questions on here, but none that quite covers what I want to do.
I'm reading in a TSV/CSV file using:
import csv

import pandas as pd

try:
    # First attempt: assume the file is UTF-8 encoded.
    dataframe = pd.read_csv(
        filepath_or_buffer=filename_or_obj,
        sep='\t',
        encoding='utf-8',
        skip_blank_lines=True,
        error_bad_lines=False,
        warn_bad_lines=True,
        dtype=data_type_dict,
        engine='python',
        quoting=csv.QUOTE_NONE
    )
except UnicodeDecodeError:
    # Fall back to Latin-1 if the file is not valid UTF-8.
    dataframe = pd.read_csv(
        filepath_or_buffer=exception_filename_or_obj,
        sep='\t',
        encoding='latin-1',
        skip_blank_lines=True,
        error_bad_lines=False,
        warn_bad_lines=True,
        dtype=data_type_dict,
        engine='python',
        quoting=csv.QUOTE_NONE
    )
I have clearly defined headers within the file, but sometimes the file has unexpected additional columns, and I get messages like the following in the console:
Skipping line 251643: Expected 20 fields in line 251643, saw 21
This is fine for my process. I would just like a way to record these messages or line numbers in either a dataframe or a log file, so that I know which lines have been skipped. Because the files can be submitted by anyone and the problem is with their formatting, I'm not interested in fixing the lines themselves, just in recording the line numbers that fail.
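To make the goal concrete, here is a minimal sketch of the kind of record I am after. It does not hook into `pd.read_csv` at all; it is a separate pre-scan of the file with the standard-library `csv` module, which is a different technique from capturing the pandas messages themselves. The function name `find_overlong_lines` and the logger name `bad_lines` are made up for illustration, and the sketch assumes the first line is the header and that only rows with more fields than the header get skipped. The line numbers are 1-based file line numbers and may not match the pandas messages exactly.

```python
import csv
import io
import logging

# Hypothetical logger name, chosen only for this illustration.
logger = logging.getLogger("bad_lines")


def find_overlong_lines(handle, sep="\t"):
    """Return (line_number, field_count) for rows with more fields than the header.

    Assumes the first line of the file is the header row.
    """
    reader = csv.reader(handle, delimiter=sep, quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:  # empty file, nothing to check
        return []
    expected = len(header)
    bad = []
    for row in reader:
        # Blank lines come back as empty lists, so they are never flagged here.
        if len(row) > expected:
            bad.append((reader.line_num, len(row)))
            logger.warning(
                "Line %d: expected %d fields, saw %d",
                reader.line_num, expected, len(row),
            )
    return bad


# Small in-memory example: line 3 has one field too many.
sample = io.StringIO("a\tb\tc\n1\t2\t3\n4\t5\t6\t7\n\n8\t9\t10\n")
skipped = find_overlong_lines(sample)
print(skipped)
```

For a real file, the handle would come from something like `open(path, encoding='utf-8', newline='')`, with the same `latin-1` fallback as in the code above, and the returned list could be passed to `pd.DataFrame(skipped, columns=['line', 'fields'])` if a dataframe is preferred over a log file.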
Massive thanks in advance :)
Edit: included the try/except clause.