I am facing an issue with pandas read_csv. I have a file that contains a double quote character (") as a field value. That should not happen, but I have no control over how the file is generated, so I need a workaround. Reading the file fails with:
pandas.errors.ParserError: Error tokenizing data. C error: EOF inside string starting at line 15345
I found an issue report about this on Git (link here), where the suggestion is to pass the delimiter used for the "sep" parameter as "quotechar" as well. When I do that, the structure of the file gets messed up.
I also wrapped the call in exception handling so that the code keeps running for the rest of the files, but the problem will persist for files of this particular type.
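For context, the exception handling is roughly along the lines of the sketch below. The function name read_all and the paths argument are placeholders for illustration, not my actual code; the only assumption is that the failure surfaces as pandas.errors.ParserError, as in the traceback above.

```python
import pandas as pd


def read_all(paths, sep=";"):
    """Read each CSV file; record files that fail to parse instead of aborting.

    `read_all` and `paths` are hypothetical names used only for this sketch.
    """
    frames = {}   # path -> DataFrame for files that parsed cleanly
    failed = []   # (path, error message) for files that raised ParserError
    for path in paths:
        try:
            frames[path] = pd.read_csv(path, sep=sep)
        except pd.errors.ParserError as exc:
            # The file is skipped entirely; its data is still lost.
            failed.append((path, str(exc)))
    return frames, failed
```

This keeps the run going, but every file with a stray quote ends up in the failed list, which is why it does not really solve the problem.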
The command I use to read the CSV file:
df_new = pd.read_csv(file_path_name, sep=";", error_bad_lines=False)
Any ideas for a workaround (e.g. ignoring the lines with this issue)? One way, I guess, would be to use the csv library to remove that line (or replace " with something else), but I would like to keep it simple and do as much as possible within pandas.
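To make those two ideas concrete, here is a minimal sketch of what I have in mind. Both function names are placeholders, and I have not confirmed either approach on the real files. The first assumes that no field legitimately relies on quotes to wrap a ";" so quoting can be switched off entirely via the quoting parameter of read_csv; the second assumes the file fits in memory and that dropping every " is acceptable.

```python
import csv
import io

import pandas as pd


def read_csv_literal_quotes(path, sep=";"):
    """Treat '"' as an ordinary character (hypothetical helper name).

    With quoting=csv.QUOTE_NONE the parser does no quote handling,
    so an unbalanced quote can no longer swallow the rest of the file.
    """
    return pd.read_csv(path, sep=sep, quoting=csv.QUOTE_NONE)


def read_csv_strip_quotes(path, sep=";", encoding="utf-8"):
    """Remove every '"' before parsing (hypothetical helper name).

    Assumes the whole file fits in memory and that the encoding is known.
    """
    with open(path, encoding=encoding) as fh:
        text = fh.read().replace('"', "")
    return pd.read_csv(io.StringIO(text), sep=sep)
```

The first variant stays entirely within pandas and keeps the quote in the data; the second changes the data but leaves clean values.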
Python version: 3.6.2
Pandas version: 0.21.0
Thank you and best regards