I have a few CSV files with the same header.
To simplify my work, I merged the files so that I can load them into a single pd.DataFrame:
cat file1.csv > file_merged.csv
cat file2.csv | tail -n +2 >> file_merged.csv
But when I call pd.read_csv on the merged file, I get an error:
228 try:
229 if self.low_memory:
--> 230 chunks = self._reader.read_low_memory(nrows)
231 # destructive to chunks
232 data = _concatenate_chunks(chunks)
~/.local/lib/python3.10/site-packages/pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read_low_memory()
~/.local/lib/python3.10/site-packages/pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()
~/.local/lib/python3.10/site-packages/pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()
~/.local/lib/python3.10/site-packages/pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()
ParserError: Error tokenizing data. C error: Expected 4 fields in line 1391, saw 7
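To narrow this down, the raw text around the reported line can be printed together with its field count. This is only a minimal sketch: `show_line` is a hypothetical helper (not part of pandas), and it assumes the line number in the message matches the physical line number in file_merged.csv and that the file is comma-separated.

```python
import csv
from itertools import islice


def show_line(path, line_no, context=1):
    """Return (line_number, field_count, raw_text) for the lines around line_no.

    line_no is 1-based; `context` lines before and after it are included.
    """
    start = max(line_no - context, 1)
    stop = line_no + context
    rows = []
    with open(path, newline="") as fh:
        # islice takes 0-based indices, so physical line `start` is index start - 1.
        for number, raw in enumerate(islice(fh, start - 1, stop), start=start):
            # Parse the single line with the csv module to count its fields.
            fields = next(csv.reader([raw]), [])
            rows.append((number, len(fields), raw.rstrip("\r\n")))
    return rows


# Hypothetical usage for the file in this question:
# for number, count, text in show_line("file_merged.csv", 1391):
#     print(number, count, text)
```

A line with 7 fields that looks like two rows glued together would suggest that one of the input files does not end with a newline character.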
What's the problem? Each file can be read separately, and they all have the same header (I remembered to remove the header from the second file, as shown in the example above).
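For comparison, the merge step could also be done in Python rather than in the shell. The following is a rough sketch using only the standard csv module; `merge_csv` is a hypothetical name, and it assumes comma-separated files that each start with the same header row. Because every row is parsed and written back out, each row ends with a newline in the output even if an input file lacks a trailing newline.

```python
import csv


def merge_csv(paths, out_path):
    """Write the rows of all files in `paths` to out_path, keeping one header."""
    header = None
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, lineterminator="\n")
        for path in paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # The first non-empty file defines the header.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header: {file_header}")
                # Copy the remaining data rows, one output line per row.
                writer.writerows(reader)
    return out_path


# Hypothetical usage with the file names from this question:
# merge_csv(["file1.csv", "file2.csv"], "file_merged.csv")
```

The resulting file can then be passed to pd.read_csv as before.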